This is the final part of the ultra copier tour
and it covers much-improved support for
resuming copies after cancellations and
I/O errors.
It works just as you'd expect - if the copying of
a file is interrupted partway through, then
on the next run the copying will resume from
around where it left off.
The most interesting bit here is that this works
even if the copy is aborted by disconnecting
a drive in the middle of a write. More often
than not this will cause data corruption,
but the program is now capable of detecting
the extent of such corruption and correcting it
without recopying the file in full.
† Some conditions apply
First, this is enabled only for
files that are delta-copied,
because all resume-related data is saved with the
file's delta state.
Second, the source file must
remain unchanged between the attempts
(as witnessed by its size and timestamps).
If the source changes, we don't know
where the change was, so we must start from
the beginning.
Third, the destination file is,
obviously, expected to also stay the same,
but that's a more general requirement for
the delta copying as a whole. Touching the target
file between runs will automatically
invalidate its delta state and trigger
a re-copy.
* For people lacking a certain excitement
in their lives it is possible to suppress
the last check. Inquire within for details.
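The "source must remain unchanged" condition can be sketched roughly
like this. It is only an illustration: the SourceStamp type and both
function names are made up for the example, and it compares just the
size and the modification time, whereas the real copier may well look
at more timestamps than that.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceStamp:
    """Hypothetical record saved alongside the delta state."""
    size: int
    mtime_ns: int


def stamp_of(path: str) -> SourceStamp:
    # Capture the attributes used to judge whether the source changed.
    st = os.stat(path)
    return SourceStamp(size=st.st_size, mtime_ns=st.st_mtime_ns)


def source_unchanged(saved: Optional[SourceStamp], current: SourceStamp) -> bool:
    # No saved stamp means there is nothing to resume from.
    # Any difference in size or timestamp means starting from the beginning.
    return saved is not None and saved == current
```

A resume attempt would compare the stamp saved on the previous run
against stamp_of() of the source as it is now, and fall back to copying
from the start on any mismatch.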
In addition to recording how far it went
in the source file, the ultra copier also
remembers the reason for aborting a copy.
The three main reasons are read errors,
write errors and user cancellations.
In the case of read errors and
user cancellations
the copier gets a chance to shut down the
copying process in an orderly fashion.
This ensures a consistent state of the backup
copy, so in these cases the copying is
always resumable.
In the case of write errors, it depends.
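As a rough sketch of how the recorded abort reason might steer the
next run: the enum and the function name below are invented for
illustration and are not the copier's actual data structures.

```python
from enum import Enum


class AbortReason(Enum):
    """Hypothetical abort reasons saved with the delta state."""
    READ_ERROR = "read_error"
    WRITE_ERROR = "write_error"
    CANCELLED = "cancelled"


def needs_verification(reason: AbortReason) -> bool:
    # Read errors and cancellations end with an orderly shutdown, so the
    # backup copy is consistent and the copy can resume as is.
    # After a write error the destination has to be checked first.
    return reason is AbortReason.WRITE_ERROR
```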
Recovering from write errors
If we yank out a USB drive while
it is being written to, the state of the backup
copy will be somewhat uncertain.
Ditto for network copies when the
router decides it has had enough for the
day and dies.
So when we are resuming after a write
failure, we need a way to ensure that
the state of the destination file is
consistent with our (or rather the
delta copier's) view of it.
Ultra does that by keeping track of the file ranges
that it successfully updated on the last run,
in the form of a write log.
However, since we are resuming after a write
failure, some of these writes may actually
have never hit the storage media even though
they were reported as "completed". This is due to all
the caching, lazy-writing and outright lying that
modern drives do to improve their performance.
For this reason, when resuming after a write
failure, ultra goes through the write
log, reads the respective parts of the backup copy
and checks their hashes against
those stored in the delta state. If there's
a match, we are in the clear. Otherwise, it's a
block we need to re-copy.
The earliest non-matching block from the
write log gives us an adjusted resume point.
Simple... with a hint of elegance.
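The write log check described above can be sketched in a few lines.
This is a minimal illustration under assumptions, not the copier's
actual code: LoggedWrite and adjusted_resume_point are hypothetical
names, and SHA-256 merely stands in for whatever hash the delta state
really uses.

```python
import hashlib
import io
from typing import BinaryIO, Iterable, NamedTuple


class LoggedWrite(NamedTuple):
    """Hypothetical write log entry: a range reported as written."""
    offset: int
    length: int
    digest: bytes  # hash the delta state expects for this range


def adjusted_resume_point(dest: BinaryIO,
                          write_log: Iterable[LoggedWrite],
                          resume_at: int) -> int:
    """Re-read every logged range and return the adjusted resume offset."""
    earliest_bad = None
    for entry in write_log:
        dest.seek(entry.offset)
        data = dest.read(entry.length)
        # A short read (truncated file) also yields a mismatching hash.
        if hashlib.sha256(data).digest() != entry.digest:
            if earliest_bad is None or entry.offset < earliest_bad:
                earliest_bad = entry.offset
    # All logged writes verified: keep the recorded resume point.
    if earliest_bad is None:
        return resume_at
    return min(resume_at, earliest_bad)


if __name__ == "__main__":
    blocks = [b"a" * 4, b"b" * 4, b"c" * 4]
    log = [LoggedWrite(i * 4, 4, hashlib.sha256(b).digest())
           for i, b in enumerate(blocks)]
    # The second block never made it to the "disk".
    damaged = io.BytesIO(b"aaaa" + b"XXXX" + b"cccc")
    print(adjusted_resume_point(damaged, log, resume_at=12))
```

In the demo the second block fails its check, so the resume point
moves back from 12 to 4. Only the ranges listed in the write log are
re-read, which is what keeps this cheaper than recopying the whole file.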
So there you have it...
1. Faster bulk copying
2. Faster delta copying
3. Resuming support and error recovery
All courtesy of the new ultra copier.
Coming to an update server near you in a few days...