This is a part 2 of a tour of the Bvckup's brand new copying engine dubbed the ultra copier.

Part 1 covered improvements to the engine's bulk copying and in this post we will look at changes to its incremental file updater, a.k.a. the delta copier.

Quick recap

Delta copying is a way of updating files whereby only blocks that changed since the last update are copied over. When files change, more often than not they change only here and there, in an isolated manner. This makes delta copying a godsend for backing up all sorts of large, slowly changing files from large RAW photos, to database files, to VM images and encrypted file containers.

If you are familiar with rsync, it is based on the same idea to derive the same speed benefits. However where Bvckup 2 differs from rsync is how it does change detection.

Detecting changes

The recipe is simple - split the file in large-ish blocks, compute block hashes and then see if the hash changed from the last time. If it did, we got a modified block, so we copy it over.

So we still end up reading a file in full, but we save on being selective with writing, which is usually an expensive thing to do.

Bvckup 2 works by saving block hashes between the runs and then using these as a reference point on the next update.

read block
hash block
load hash from the last time
if (new-hash old-hash) then
    write block
    save new-hash

This requires only basic IO access to source/backup copies and some local space to store the block hashes. That is, the delta copying works with any storage devices that are accessible for reading/writing.

* In comparison, rsync requires a copy of itself to be running on the receiving end when copying over the network. It doesn't require any interim storage, but it ends up reading both source and target files in full. This makes it unsuitable for making local copies and explains the "r" in "rsync".

False negatives

It is possible that two different blocks of data will have the same hash. This is a so-called hash collision.

If we ever run into a hash collision, we will end up skipping a modified block ... which is something we most certainly don't want to be doing.

Avoiding false negatives comes down to reducing the chances of running into them on a real-world data. This is done by using cryptographic hashing for its avalanche effect - a property that causes a hash to change drastically in response to small changes to the input.

Using longer hashes is another option.

And using multiple hashes computed with different algorithms is yet another one.

Old delta setup

From its very first public release and up to Release 79 the program used the following delta copying setup:

Block size32KB
Block hashesMD5 + a variant of CRC32
Full-file hashSHA-1

This yields 20 bytes of hash per block, meaning that the delta state of a file is about 0.06% of its size.

Full-file hash

In addition to hashing individual blocks, delta copier was also computing a SHA-1 hash of the whole file and storing it with the delta state.

The purpose of this hash was to catch cases when no modified blocks were detected but the full-file hash changed.

That was *not* a very good idea. It added very little in terms of the false negative protection and it created a bottleneck.

See, SHA-1 can't be parallelized. In fact, very few cryptographic hash functions can be and they all expect to be fed blocks in sequence, one by one.

This means that performance of the delta copier was effectively bounded by the speed of SHA-1 on a single CPU core.

For mechanical drives it's usually not an issue, because they are quite slow. But SSDs and NVMe drives can be read faster than we can hash what we get from them:

HDD Seagate ST9500325AS 70.5 MB/s
HDD WDC WD5000LPVX 118.8 MB/s
SHA-1 Intel i3 @ 2.53GHz 332.4 MB/s
SHA-1 Xeon E5 v4 @ 2.10GHz 380.1 MB/s
SSD Samsung T3, USB 3.0 429.7 MB/s
SHA-1 Intel i5 @ 2.70GHz 520.9 MB/s
SSD Samsung 860 Pro 534.8 MB/s
NVMe Samsung MZVPV512HDGL 2157.9 MB/s

* These are read rates for the drives and bulk rates for SHA-1 using a well-optimized implementation.

So the full-file hash is a bottleneck and it should be eliminated, which is exactly what an ultra copier in delta mode does.

New delta setup

Starting with Release 79 the delta copying uses a larger block size, replaces MD5 with Blake2b and compliments it with two weaker, but very fast hashes called xxHash and spooky.

Block size64KB
Block hashesBlake2b + xxHash + Spooky
Full-file hash-

Per-block hash size is now 32 bytes - 16 from Blake2b, 8 from xxHash and 8 from Spooky, but the blocks are twice as large, so the ratio of delta-state / file-size is actually lower, ~ 0.05%.

Impact on performance

Block hashing is now noticeably faster:

MD5 595.4 MB/s
SHA-1 561.9 MB/s
SHA-256 247.4 MB/s
SHA-512 391.4 MB/s
Blake2b 762.3 MB/s

Non-crypto hashes, xxHash and Spooky, are running at 10.8 GB/s and 11.7 GB/s respectively. That's gigabytes per second, so these hashes effectively come free of charge.

But more importantly, we are no longer constricted by the full-file hash, so we can push these 762 MB/s per core.

* Numbers are obviously specific to our test machine, but they give a good idea of relative performance of the algorithms.

The net effect of these changes is that the delta copier now can fully saturate an IO of a very fast drive while using just a small handful of hashing threads.

This in turn means that delta copying now comes with virtually no performance impact even when it ends up copying files in full. It basically goes as fast as a regular bulk copier.

An example

Copying a 16 GB file from an NVMe to an SSD drive using the old delta copier vs the new one:

Release 78Release 79
Creating262.8 MB/s547.2 MB/s
Updating272.5 MB/s2026.6 MB/s

Copying file afresh now runs at the write speed of the target drive and updating it afterwards (with little to no changes) runs at the read speed of the source drive.

Needless to say, that's a huge improvement.

Next up is the part 3 of the ultra copier series. It covers resuming of interrupted copies and fast error recovery.
Made by Pipemetrics in Switzerland

Blog / RSS
Miscellanea Press resources
On robocopy

Legal Terms