A quick update on what's happening.

Disk-based virtual data structures, that's what it is :)

This is part of an effort to further reduce and cap the app's run-time memory usage. More specifically, it is meant to help the app cope gracefully with multi-million file backups that involve millions of steps and generate lots and lots of log data.

At the moment Bvckup's UI keeps backup logs in memory. It makes an effort to trim them to a reasonable size, but this still eats quite a bit of RAM, especially with large backups.


It costs about 120 bytes plus the size of the entry text to keep a single log line in memory. These 120 bytes are what weaves entries into a hierarchy (a tree) and tracks their state.

If we want to reduce the memory usage, then the most obvious optimization would be to keep the actual entry text in a disk file and load it on demand, when it's actually visible in the log viewer.
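To make the idea concrete, here is a minimal Python sketch of it. It is not Bvckup's actual code, and the class and method names are made up for illustration. The entry text goes into a file, and only a small (offset, length) pair per entry stays in memory. The text is read back only when the viewer asks for it.

```python
import os

class LogTextStore:
    """Hypothetical store: entry text lives on disk, only offsets stay in RAM."""

    def __init__(self, path):
        self.f = open(path, "w+b")   # fresh read/write file for this session
        self.index = []              # one (offset, length) pair per entry

    def append(self, text):
        data = text.encode("utf-8")
        self.f.seek(0, os.SEEK_END)  # a previous load() may have moved the position
        offset = self.f.tell()
        self.f.write(data)
        self.index.append((offset, len(data)))
        return len(self.index) - 1   # entry id

    def load(self, entry_id):
        """Fetch the text on demand, e.g. when the line scrolls into view."""
        offset, length = self.index[entry_id]
        self.f.seek(offset)
        return self.f.read(length).decode("utf-8")

    def close(self):
        self.f.close()
```

Note that the per-entry index still lives in memory here, so this only removes the text part of the cost.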

This certainly helps, but as the log grows, the memory usage still climbs. A million-file backup generates about 4 million log entries. At 120 bytes per entry that's roughly 480 MB, or about 0.5 Gig, in just supporting structures. This is unacceptable.
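The arithmetic, spelled out as a quick sketch. The 4-entries-per-file ratio is simply derived from the figures above and is assumed to scale linearly, which is an approximation.

```python
ENTRIES_PER_FILE = 4     # assumed ratio: ~4 million entries per million files
OVERHEAD_BYTES = 120     # per-entry bookkeeping, entry text excluded

def overhead_bytes(file_count):
    """Memory taken by the supporting structures alone."""
    return file_count * ENTRIES_PER_FILE * OVERHEAD_BYTES

total = overhead_bytes(1_000_000)
print(total)                      # 480000000
print(round(total / 2**30, 2))    # 0.45 (GiB)
```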


So it means that the entire tree structure of the log needs to sit in a disk file and be read from there as needed.
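One way to picture this is a rough Python sketch, under assumptions of my own and not the actual library. Each node is a fixed-size record in a file, and the links between nodes are record indexes rather than memory pointers. The record layout, the `DiskTree` name and its methods are all hypothetical.

```python
import struct

# Hypothetical record: parent, first child, last child, next sibling
# (record indexes, -1 = none), then offset and length of the entry text.
NODE = struct.Struct("<iiiiqI")   # 28 bytes, no padding
NONE = -1

class DiskTree:
    def __init__(self, f):
        self.f = f        # binary file opened for reading and writing
        self.count = 0    # number of records written so far

    def _read(self, idx):
        self.f.seek(idx * NODE.size)
        return list(NODE.unpack(self.f.read(NODE.size)))

    def _write(self, idx, rec):
        self.f.seek(idx * NODE.size)
        self.f.write(NODE.pack(*rec))

    def add(self, parent, text_off, text_len):
        """Append a node as the last child of `parent` (or as a root)."""
        idx = self.count
        self.count += 1
        self._write(idx, [parent, NONE, NONE, NONE, text_off, text_len])
        if parent != NONE:
            p = self._read(parent)
            if p[1] == NONE:
                p[1] = idx                 # first child
            else:
                last = self._read(p[2])
                last[3] = idx              # chain after the previous last child
                self._write(p[2], last)
            p[2] = idx                     # new last child
            self._write(parent, p)
        return idx

    def children(self, idx):
        """Walk the sibling chain, reading one record per step."""
        child = self._read(idx)[1]
        while child != NONE:
            yield child
            child = self._read(child)[3]
```

Every step here hits the file directly, which is obviously slow. That is exactly where a caching layer comes in.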

Disk-based tree structures are routinely used in databases and file systems to store indices of data sets, typically in the form of a B-Tree or one of its variations. The kicker is that these structures are meant for storing sorted data and they optimize for searching. Trying to adapt them for storing unsorted data would be nothing short of fitting a square peg in a round hole.


Long story short - there is no ready-made code for storing and manipulating generic data trees on a disk, let alone fast code. This appears to be an esoteric problem that everyone solves on their own.

The screenshot above shows a part of a small "tree database" library that I ended up writing, together with its own caching module, predictive page loader and a pony.
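For a rough idea of what a caching layer for such a file can look like, here is a generic least-recently-used page cache in Python. This is a textbook sketch and not the module from the screenshot. The page size, the class name and the `misses` counter are assumptions, and it does no predictive loading at all.

```python
from collections import OrderedDict

PAGE_SIZE = 4096   # assumed page size

class PageCache:
    def __init__(self, f, max_pages=64):
        self.f = f                      # binary file opened for reading
        self.max_pages = max_pages
        self.pages = OrderedDict()      # page number -> bytes, oldest first
        self.misses = 0                 # how many times the disk was hit

    def page(self, n):
        if n in self.pages:
            self.pages.move_to_end(n)   # mark as most recently used
            return self.pages[n]
        self.misses += 1
        self.f.seek(n * PAGE_SIZE)
        data = self.f.read(PAGE_SIZE)
        self.pages[n] = data
        if len(self.pages) > self.max_pages:
            self.pages.popitem(last=False)   # evict the least recently used page
        return data

    def read(self, offset, length):
        """Read a byte range that may span several pages."""
        out = bytearray()
        while length > 0:
            n, start = divmod(offset, PAGE_SIZE)
            chunk = self.page(n)[start:start + length]
            if not chunk:
                break                   # past the end of the file
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)
```

With a cap on `max_pages`, memory use is bounded by the cache size rather than by the size of the file behind it.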

Took almost two weeks.


On the bright side though, this code is perfectly reusable for storing disk snapshots as well, and this paves the way for making the planner module work almost entirely outside of memory.

Once done, Bvckup memory usage will NOT depend on the particulars of a backup at all - this is a very big deal and an incredibly important feature to have for any robust backup software.

So, stay tuned...