Hard-linking and disk-space... whatever...

Toying yet more with compressing rdiff-backup repositories via hard-linking.  Turns out someone has already done it.  My hard-link-finding script differs from the (packaged) one he's using, but it seems to follow the same basic idea (though I use filecmp rather than reading the files myself, and I don't check owner/group/mode differences, as those seem to be managed externally by rdiff-backup).
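
For the curious, the basic idea looks roughly like the sketch below. This is not my actual script (nor the packaged one), just a minimal illustration under a few assumptions: the function names `find_duplicate_groups` and `hardlink_duplicates` are made up for the example, everything lives on one filesystem (hard links can't cross filesystems), and owner/group/mode are deliberately ignored, as described above. Files are bucketed by size first so that filecmp only has to compare plausible candidates.

```python
import filecmp
import os
from collections import defaultdict


def find_duplicate_groups(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size:  # empty files save nothing when linked
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        remaining = sorted(paths)
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            # shallow=False forces a real content comparison.
            same = [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if same:
                groups.append([first] + same)
            remaining = [p for p in rest if p not in same]
    return groups


def hardlink_duplicates(root):
    """Replace duplicates with hard links; return a rough count of bytes freed."""
    saved = 0
    for group in find_duplicate_groups(root):
        keeper = group[0]
        for dup in group[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already the same inode, nothing to do
            # Link to a temporary name, then rename over the duplicate,
            # so the duplicate is never missing if something fails midway.
            tmp = dup + ".hardlink-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, dup)
            saved += os.path.getsize(keeper)
    return saved
```

Calling `hardlink_duplicates("/path/to/repository")` returns an estimate of the space recovered; it is only an estimate, since a duplicate that already had other hard links elsewhere frees nothing when it is relinked. Try it on a copy first.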

Of course, having done all that and actually run the compression... it's disappointing. I see a solid handful of MBs disappear, but since I'm already excluding the source-code directories (bzr/svn/cvs checkouts), the savings are rather uninspiring. There are hundreds of MBs of duplicated files in the check-outs (4 or 5 checkouts of the same set of dozens of images, for instance), but the virtualenv packages don't seem to add up to much duplication (39MBs for the OpenGL and one client's work-spaces). Basically, while the virtualenvs are hundreds of MBs, most of that space, it turns out, is actually in the custom sources, not the dependencies.

Playing around still further, I tried excluding Firefox cache files, for instance... and discovered that the "url classifier" files are twice the size... maybe I need to exclude them too... or maybe I need to just go to the dratted computer store and buy a 2TB drive for the server and stop wasting precious hours on optimizing away a few MBs of storage space. This kind of saving would likely only be useful if you were doing full-system imaging of very similar machines (or something like that), and I don't have any need for that these days.

Comments

  1. Jack

    Jack on 09/14/2009 2:42 a.m. #

    I was wondering about the whole idea of storing tons of cookies and other temporary internet files on the hard drive. It was understandable when connection speeds were quite low, but now? Connections are so fast that anything becomes possible; there is no need to store that data on the hard drive, it is just obsolete right now.
