Currently working on improvements to the scanning and planning modules of the app. No pretty UI screenshots to show, just some performance numbers.

The ultimate goal of this rework is to put a cap on the run-time memory usage, which in turn is needed to better accommodate very large backups (tens of millions of files and up).


The app comes with a formal backup planner, which is a module that accepts two full directory trees (one for the source location and another for the backup one), digests them and produces a list of simple steps that, when executed in order, bring one tree in sync with the other. The app then diligently goes through the list and creates, updates, renames and deletes files and folders as directed.
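To make the idea concrete, here is a toy planner in Python. It is only a sketch and not the app's actual code: the tree layout (nested dicts, with files as (size, mtime) tuples), the step names and the plan() function are all made up for illustration, and it leaves out rename/move detection altogether.

```python
def plan(source, backup, prefix=""):
    """Return ordered steps that make `backup` match `source`.

    Hypothetical layout: a folder is a dict of name -> child,
    a file is a (size, mtime) tuple.
    """
    steps = []
    for name in sorted(source.keys() | backup.keys()):
        path = prefix + name
        src, dst = source.get(name), backup.get(name)
        src_is_dir, dst_is_dir = isinstance(src, dict), isinstance(dst, dict)

        # Remove what is in the way: gone from the source, or changed type.
        if dst is not None and (src is None or src_is_dir != dst_is_dir):
            if dst_is_dir:
                steps += plan({}, dst, path + "/")  # empty the folder first
                steps.append(("delete_dir", path))
            else:
                steps.append(("delete_file", path))
            dst, dst_is_dir = None, False

        if src is None:
            continue
        if src_is_dir:
            if dst is None:
                steps.append(("create_dir", path))  # parent before children
            steps += plan(src, dst or {}, path + "/")
        elif dst is None:
            steps.append(("copy_file", path))
        elif src != dst:
            steps.append(("update_file", path))
    return steps


source = {"a.txt": (10, 1), "docs": {"b.txt": (5, 2)}, "new": {"c.txt": (1, 1)}}
backup = {"a.txt": (10, 0), "docs": {"b.txt": (5, 2), "old.txt": (3, 3)},
          "tmp": {"x": (1, 1)}}

for step in plan(source, backup):
    print(step)
```

Note that nothing touches the disk while the list is being built. The whole plan exists before the first step runs, which is what makes it possible to size up the work in advance.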

An alternative to formal planning is to traverse the trees and make on-disk changes just as you spot the differences. There are several drawbacks to this approach, including not being able to estimate the amount of work needed, to understand disk space requirements and to detect file/folder moves. It's just an altogether messier and less predictable way to go about the whole thing.

That's how Bvckup 1 worked and that's how robocopy works, for example.


Long story short, having a backup planner is the right thing to do. However, it means that the app needs to have two full file trees readily available for the planner to digest. If these trees are kept entirely in memory, it may get expensive with large backups. At 100 to 300 bytes per item, a million-file backup means 200 to 600 megs of RAM just for the two trees.

That's quite a bit, but more importantly this memory usage grows linearly with the size of the backup. This is no good.
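The arithmetic is easy to check. The snippet below is a back-of-the-envelope sketch that assumes the source and backup trees hold about the same number of items and counts a megabyte as 10^6 bytes. The tree_memory_mb() helper is a name made up for this example.

```python
def tree_memory_mb(items, bytes_per_item, trees=2):
    """Rough RAM needed to keep `trees` full trees resident, in MB."""
    return items * bytes_per_item * trees / 1_000_000


for items in (1_000_000, 10_000_000, 50_000_000):
    low = tree_memory_mb(items, 100)
    high = tree_memory_mb(items, 300)
    print(f"{items:>11,} files: {low:,.0f} to {high:,.0f} MB")
```

By the same estimate, a 50 million file backup would need 10 to 30 gigs just to hold the trees.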


The solution, obviously enough, is to not keep full trees in memory. This is a fairly complicated endeavour and it maps onto several large changes to the app's inner structure.

One such change is nearly done. Both the scanning and planning modules no longer require the trees to be resident in memory at all times. Instead, they now use them in a piecemeal fashion.
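One way to picture "piecemeal" access is a scanner that hands over one folder at a time, so that a consumer never needs more than a single folder's listing at once. The Python sketch below only illustrates that general idea under this assumption and says nothing about how the app does it internally. The scan_piecemeal() name and the (is_dir, size) record layout are hypothetical.

```python
import os


def scan_piecemeal(root):
    """Yield (relative_folder, listing) for one folder at a time.

    `listing` maps each entry name to an (is_dir, size) tuple. Only the
    current listing and the queue of folders still to visit are held in
    memory by the scanner itself.
    """
    pending = [""]  # relative paths of folders not yet scanned
    while pending:
        rel = pending.pop()
        listing = {}
        with os.scandir(os.path.join(root, rel)) as entries:
            for entry in entries:
                is_dir = entry.is_dir(follow_symlinks=False)
                size = 0 if is_dir else entry.stat(follow_symlinks=False).st_size
                listing[entry.name] = (is_dir, size)
                if is_dir:
                    pending.append(os.path.join(rel, entry.name))
        yield rel, listing
```

A consumer would loop over scan_piecemeal(path), deal with each listing and then let go of it before asking for the next one.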

Behind the scenes, trees still reside in memory for the time being, but the underlying rework has some nice side effects. Trees now take less memory, they are faster to populate, faster to save and load to/from a file, they take substantially less space on disk and they compress better with NTFS compression.

This will ship as a part of R74.

After that, R75 will replace in-memory trees with a version that can swap out tree parts to disk, significantly reducing the run-time memory footprint of the scanning/planning process.
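For a rough idea of what swapping tree parts out can look like, here is a small Python sketch. It keeps a fixed number of folder listings in memory and spills the least recently used ones to temporary files. This is a generic LRU scheme picked for illustration, not a description of the R75 design. The SwappingTree class, its methods and the pickle-based file format are all assumptions.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SwappingTree:
    """Hold at most `max_resident` folder listings in RAM, spill the rest."""

    def __init__(self, max_resident, spill_dir=None):
        self.max_resident = max_resident
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="tree-")
        self.resident = OrderedDict()  # folder id -> listing, oldest first
        self.on_disk = set()           # folder ids with a valid spilled copy

    def _file(self, folder_id):
        return os.path.join(self.spill_dir, f"{folder_id}.bin")

    def put(self, folder_id, listing):
        self.resident[folder_id] = listing
        self.resident.move_to_end(folder_id)  # mark as most recently used
        self.on_disk.discard(folder_id)       # any spilled copy is now stale
        self._evict()

    def get(self, folder_id):
        if folder_id in self.resident:
            self.resident.move_to_end(folder_id)
            return self.resident[folder_id]
        if folder_id not in self.on_disk:
            raise KeyError(folder_id)
        with open(self._file(folder_id), "rb") as f:
            listing = pickle.load(f)
        self.put(folder_id, listing)  # bring it back, possibly evicting another
        return listing

    def _evict(self):
        # Write out least recently used listings until we are under the cap.
        while len(self.resident) > self.max_resident:
            old_id, listing = self.resident.popitem(last=False)
            with open(self._file(old_id), "wb") as f:
                pickle.dump(listing, f)
            self.on_disk.add(old_id)


tree = SwappingTree(max_resident=2)
for folder_id in range(3):
    tree.put(folder_id, {"name": f"folder-{folder_id}"})
print(tree.get(0))  # folder 0 was spilled to disk and is read back here
```

With a scheme like this the memory footprint depends on the cap rather than on the size of the backup, at the cost of extra disk I/O whenever a swapped-out part is needed again.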