Partial mid-file read
Past end-of-file read
It's easy to do things when they go as planned. It's when they go sideways that everything gets much more exciting.
In other words, the next maintenance release includes changes to better handle several new edge cases.
When reading a file, the program requests blocks of data, back to back.
Then, as it nears the end of the file, one of two things can happen. If the file size is not a multiple of the block size, the program will receive back an incomplete block.
And if the file is perfectly aligned to the block size, the program will see either a zero-sized block or an explicit "end of file" notification.
Upon seeing either of these lovely events, the program will know that it has reached the end of the file and there's nothing else left to copy.
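The loop described above can be sketched roughly as follows. This is a minimal illustration in Python, not Bvckup 2's actual code: `copy_blocks` is a hypothetical name, the reads are synchronous, and the sketch deliberately bakes in the usual assumption that a short block means end of file.

```python
import io

def copy_blocks(src, dst, block_size):
    """Copy src to dst block by block; return the number of bytes copied.

    Hypothetical helper for illustration only. It assumes that an
    incomplete block can only ever be the tail end of the file.
    """
    copied = 0
    while True:
        block = src.read(block_size)
        if not block:
            # Zero-sized block: the file was aligned to the block size.
            break
        dst.write(block)
        copied += len(block)
        if len(block) < block_size:
            # Incomplete block: treated as the end of the file.
            break
    return copied
```

For example, copying a 10-byte in-memory source with a 4-byte block size goes through two full blocks and one 2-byte tail, while an 8-byte source ends on a zero-sized read.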
There is, however, a catch: the program may receive a partial block back even if it's not at the end of a file.
It is declared that, if this happens, it indicates a temporary device problem or some other intermittent issue. The program is then welcome to retry the request or to read the remainder with a separate query.
In practice, this means a misbehaving storage device or a severely stressed system.
Further to that, very few programs implement retrying on partial reads, and OS vendors are fully aware of it.
So the overwhelmingly common practice is for the OS to just fail such requests, which automatically means that a partial read = EOF.
However "overwhelmingly common" is not 100%.
As it turns out, there exist storage devices that will return partial blocks mid-file if ... the file is corrupted.
Technically, it makes some sense - this allows salvaging at least some data from a corrupted block. Practically, however, this is completely useless unless you are in the data recovery business.
Partial mid-file reads
The first edge case that is now being explicitly handled by Bvckup 2 is that of reading a partial block mid-file.
More specifically, the program will now check that a partial block comes from the exact tail end of a file, and it will raise an alarm if it doesn't.
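A check of this kind might look something like the sketch below. It is only an illustration of the idea: `PartialReadError` and `check_partial_block` are hypothetical names, and it assumes the file size recorded at the start of the copy is available as `file_size`.

```python
class PartialReadError(Exception):
    """Raised when a short block is not the tail end of the file."""

def check_partial_block(offset, received, block_size, file_size):
    """Verify that a short block ends exactly where the file ends.

    offset     - file position the block was requested at
    received   - number of bytes actually returned
    block_size - number of bytes requested
    file_size  - assumed to be the size recorded when copying started
    """
    if received >= block_size:
        return  # full block, nothing to verify
    if offset + received != file_size:
        raise PartialReadError(
            f"partial block at offset {offset}: got {received} of "
            f"{block_size} bytes, but the file ends at {file_size}"
        )
```

With a 10-byte file and 4-byte blocks, a 2-byte block at offset 8 passes, while a 2-byte block at offset 4 raises the error.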
As you probably know, Bvckup 2 uses async IO and it normally has several read/write requests "in flight".
With the new release, if it detects an EOF of the source file, it will then check any read requests for blocks beyond this EOF location and ensure that they too show the same EOF.
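To make the idea concrete, here is a small sketch of such a cross-check. It is an assumption-laden illustration rather than the program's real logic: `requests_contradicting_eof` is a hypothetical name, and completed reads are modeled simply as `(offset, bytes_received)` pairs.

```python
def requests_contradicting_eof(eof_offset, completed):
    """Return completed reads that disagree with a detected EOF.

    eof_offset - position at which the end of the file was detected
    completed  - iterable of (offset, bytes_received) pairs for
                 read requests that have finished (hypothetical model)

    Any request that starts at or beyond eof_offset is expected to come
    back empty; the ones that returned data are reported.
    """
    return [
        (offset, received)
        for offset, received in completed
        if offset >= eof_offset and received > 0
    ]
```

An empty result means every in-flight request agrees with the detected EOF. A non-empty one means some request returned data from beyond it, which is exactly the situation worth flagging.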
Rapidly growing files
There are cases when being able to read past the EOF marker is actually OK.
If the source file is being actively modified and it is growing very quickly, the program may see an EOF followed by an actual data block, because the latter gets added between the EOF and the second request.
This is now also detected and reported accordingly.
Rapidly shrinking files
There's also an inverse case - the source file being very quickly shrunk down.
When this happens, Bvckup 2 will see an EOF first. It will realize that the EOF is for a smaller file size than it saw when the copying started. So it will re-check the size, and the current file size will now be smaller than the recorded EOF.
Clearly, this is a mess, and it too is now detected and logged.
Will be in release