Affects Version/s: None
Fix Version/s: 3.0.0
There is a "rounding" problem in the Hourglass trimming logic: once the
system is restarted or a new index directory is created, the oldest
directory that still contains valid data can be trimmed. This happens
because the trimming code only compares the calendar time encoded in the
directory name against the trimming threshold; as soon as that time is
older than the threshold, the entire directory is trimmed.
For example, if the rolling forward frequency is DAY and the retention
period is 14, then the entire directory of 2011-05-01-00-00-00 will be
trimmed right after 2011-05-15-00-00-00 is created. This causes a
sudden drop in the total number of docs, because only 13 days of data
are left to serve after the trimming.
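The example can be reproduced with a small standalone sketch. This is not the Hourglass code: the class and method names (NaiveTrimSketch, naiveTrim, dailyDirs) are hypothetical, java.time is used for brevity, and the check is assumed to be "directory start time is before now minus the retention period".

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the trimming check described above.
final class NaiveTrimSketch {
    // Directory names are assumed to encode their start time in this pattern.
    static final DateTimeFormatter DIR_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss");

    // Returns the directories whose start time is older than the threshold.
    // The data a directory holds (start time up to the next roll) is ignored,
    // which is the "rounding" problem.
    static List<String> naiveTrim(List<String> dirNames, LocalDateTime now, int retentionDays) {
        LocalDateTime threshold = now.minusDays(retentionDays);
        List<String> trimmed = new ArrayList<>();
        for (String name : dirNames) {
            LocalDateTime start = LocalDateTime.parse(name, DIR_FORMAT);
            if (start.isBefore(threshold)) {
                trimmed.add(name);
            }
        }
        return trimmed;
    }

    // Builds one directory name per day, starting at 'first'.
    static List<String> dailyDirs(LocalDateTime first, int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(DIR_FORMAT.format(first.plusDays(i)));
        }
        return names;
    }
}
```

Calling naiveTrim with the 15 daily directories from 2011-05-01 through 2011-05-15, a "now" one second after 2011-05-15-00-00-00 is created, and a retention of 14 selects 2011-05-01-00-00-00 for trimming. What remains is 13 full days (05-02 through 05-14) plus the new, still nearly empty 05-15 directory.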
One simple fix is to modify getTrimTime() to extend the threshold by
another trim unit, but this fix may accidentally retain unneeded
"fragmented" index directories if they exist. Another solution is to
keep the existing trim threshold but, at trimming time, sort the
directory names first and keep all directories until one directory
that is older than the trimming threshold is hit.
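A minimal sketch of the second approach follows. It is not a patch against Hourglass: SortedTrimSketch and dirsToTrim are hypothetical names, and the sketch assumes that "until ... is hit" is inclusive, so the newest directory older than the threshold is kept (it may still hold valid data) and only directories older than that one are trimmed. It also assumes the zero-padded directory names sort lexicographically in chronological order.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the sort-then-trim idea.
final class SortedTrimSketch {
    static final DateTimeFormatter DIR_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss");

    // Returns the directories that are safe to trim for the given threshold.
    static List<String> dirsToTrim(List<String> dirNames, LocalDateTime threshold) {
        // Newest first: zero-padded names sort the same way as their timestamps.
        List<String> newestFirst = new ArrayList<>(dirNames);
        newestFirst.sort(Comparator.reverseOrder());

        List<String> toTrim = new ArrayList<>();
        boolean boundaryKept = false;
        for (String name : newestFirst) {
            LocalDateTime start = LocalDateTime.parse(name, DIR_FORMAT);
            if (!start.isBefore(threshold)) {
                continue; // starts inside the retention window: keep
            }
            if (!boundaryKept) {
                boundaryKept = true; // first directory past the threshold: keep (assumption)
                continue;
            }
            toTrim.add(name); // strictly older than the boundary directory: trim
        }
        return toTrim;
    }
}
```

With directories 2011-04-30-12-00-00, 2011-05-01-00-00-00, 2011-05-02-00-00-00 and 2011-05-15-00-00-00, and a threshold one second after 2011-05-01-00-00-00, only the fragmented 2011-04-30-12-00-00 directory is returned. Extending the threshold by a full day instead would keep that fragment as well.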