Rabbit Holes.

Or how I realized I loved to solve strange and interesting problems which are utterly useless to me in the real world.

Today I finally created a public GitHub repository called “Rabbit Holes” which I intend to populate with various small projects, each exploring a question I got intensely curious about and decided to answer.

The first “rabbit hole” I went down: how does Git work?

Of course you can’t answer that question without understanding what’s under the hood of the magic .git directory lurking inside every Git repository, and core to that is understanding the contents of the .git/objects directory.

Which, I thought, would be easy.

(Voice over: But much to our hero’s dismay…)

So it turns out that the object files contained in a .git/objects directory are fairly well documented. Right up until you get to .pack files, and specifically, the OFS_DELTA and REF_DELTA object types. Which are as clearly documented as… well, as clear as pea soup.

*sigh*

I did finally figure out what was going on there: specifically, how deltas are stored in a pack file, how woefully inadequate the documentation seems, and how if you ever see the byte sequence “78 01” or “78 9C” in a hex dump of a file, it’s a good hint you’re looking at the start of a zlib-compressed chunk of data. And, more importantly, if you don’t see these bytes, chances are the sequence is not compressed.
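For the curious, here is a minimal sketch of that byte-sniffing trick in Java. It is not the code from my repository: the class name ZlibSniffer and the method looksLikeZlib are made up for illustration. Rather than hard-coding “78 01” and “78 9C”, it applies the general two-byte header rule from the zlib format (RFC 1950): the low nibble of the first byte must be 8 (deflate), the window-size nibble must be at most 7, and the two bytes read as a big-endian number must be a multiple of 31. It is only a heuristic, since random data will occasionally pass too.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class ZlibSniffer {
    /**
     * Heuristic: do the two bytes at {@code offset} form a valid zlib header?
     * A "true" is a hint, not proof; arbitrary data can match by accident.
     */
    static boolean looksLikeZlib(byte[] data, int offset) {
        if (data == null || offset < 0 || offset + 1 >= data.length) {
            return false;
        }
        int cmf = data[offset] & 0xFF;      // compression method and window size
        int flg = data[offset + 1] & 0xFF;  // flags, including the check bits
        boolean isDeflate = (cmf & 0x0F) == 8;
        boolean windowOk = (cmf >> 4) <= 7;
        boolean checkOk = ((cmf << 8) | flg) % 31 == 0;
        return isDeflate && windowOk && checkOk;
    }

    public static void main(String[] args) {
        // Compress a few bytes with java.util.zip and inspect the header.
        Deflater deflater = new Deflater();
        deflater.setInput("hello".getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        byte[] out = new byte[64];
        deflater.deflate(out);
        deflater.end();
        System.out.printf("%02X %02X looks like zlib: %b%n",
                out[0], out[1], looksLikeZlib(out, 0));

        // A pack file starts with the ASCII signature "PACK", which is not a zlib header.
        byte[] pack = "PACK".getBytes(StandardCharsets.US_ASCII);
        System.out.println("PACK looks like zlib: " + looksLikeZlib(pack, 0));
    }
}
```

In a pack file you would point the offset just past an object's header bytes rather than at the start of the file.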

Anyway.

If you’re as curious as I am to know how pack files are actually stored, including working code (in Java) which applies delta changes from a pack file to reconstruct a different version of a stored file, then look at the GIT directory in my Rabbit Holes repository.
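To give a flavor of what “applying a delta” means, here is a rough sketch in Java. It is not the code from the repository: DeltaApplier and its methods are names invented for this post, and it reflects my reading of Git's pack format notes. It assumes the delta has already been zlib-inflated and that you already have the base object's bytes. The delta starts with two sizes (base and result) stored as little-endian 7-bits-per-byte integers, followed by instructions: a byte with the high bit set copies a range from the base, and a byte from 1 to 127 inserts that many literal bytes from the delta itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class DeltaApplier {
    private final byte[] delta;
    private int pos = 0;

    private DeltaApplier(byte[] delta) { this.delta = delta; }

    private int next() { return delta[pos++] & 0xFF; }

    // Size header: 7 data bits per byte, least significant group first;
    // a set high bit means another byte follows.
    private long readSize() {
        long value = 0;
        int shift = 0;
        int b;
        do {
            b = next();
            value |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    /** Rebuilds the target bytes from a base object and an inflated delta. */
    static byte[] apply(byte[] base, byte[] delta) {
        DeltaApplier d = new DeltaApplier(delta);
        long baseSize = d.readSize();
        long resultSize = d.readSize();
        if (baseSize != base.length) {
            throw new IllegalArgumentException("delta was made for a different base");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (d.pos < delta.length) {
            int cmd = d.next();
            if ((cmd & 0x80) != 0) {
                // Copy: bits 0-3 say which offset bytes follow, bits 4-6 which size bytes.
                // (A sketch: offsets of 2 GiB or more would overflow this int.)
                int offset = 0;
                int size = 0;
                for (int i = 0; i < 4; i++) {
                    if ((cmd & (1 << i)) != 0) offset |= d.next() << (8 * i);
                }
                for (int i = 0; i < 3; i++) {
                    if ((cmd & (0x10 << i)) != 0) size |= d.next() << (8 * i);
                }
                if (size == 0) size = 0x10000; // a size of zero means 64 KiB
                out.write(base, offset, size);
            } else if (cmd != 0) {
                // Insert: the next cmd bytes of the delta are literal data.
                out.write(delta, d.pos, cmd);
                d.pos += cmd;
            } else {
                throw new IllegalArgumentException("instruction byte 0 is reserved");
            }
        }
        if (out.size() != resultSize) {
            throw new IllegalStateException("result size mismatch");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] base = "hello world".getBytes(StandardCharsets.US_ASCII);
        // Sizes 11 and 11, copy 6 bytes from offset 0, then insert "there".
        byte[] delta = {0x0B, 0x0B, (byte) 0x90, 0x06, 0x05, 't', 'h', 'e', 'r', 'e'};
        byte[] result = apply(base, delta);
        System.out.println(new String(result, StandardCharsets.US_ASCII)); // prints "hello there"
    }
}
```

The hand-built delta in main is tiny on purpose; a real one comes out of a pack file entry and is usually applied to a base that may itself be the result of another delta.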
