mirror of
https://github.com/wezm/wezm.net.git
synced 2024-12-18 18:29:54 +00:00
QA /technical/2010/02/git-object-store-efficiency/
parent
9340702ccc
commit
f2f749abe0
1 changed files with 15 additions and 12 deletions
@@ -1,42 +1,45 @@
 Hubert Feyrer posted, <a href="http://www.feyrer.de/NetBSD/bx/blosxom.cgi/nb_20100212_1706.html">Musing about git's object store efficiency</a> yesterday. In it he compared the apparent efficiency of git's object store to CVS's stacked patches. His methodology was to checkout all 963 versions of the NetBSD i386 GENERIC kernel configuration file and then sum up the space used. He comes to the following conclusion:
 
-<blockquote>the git model requires about 37 times the space that CVS does</blockquote>
+> the git model requires about 37 times the space that CVS does
 
 and:
-<blockquote>that's not counting the overhead of 962 inodes and the related directory bookkeeping</blockquote>
+> that's not counting the overhead of 962 inodes and the related directory bookkeeping
 
 He finishes off with an acknowledgement that git has data packing features:
-<blockquote> I know that git offers some more efficient storage methods via "pack" files, but investigating those is left as an exercise to the reader. </blockquote>
+
+> I know that git offers some more efficient storage methods via "pack" files, but investigating those is left as an exercise to the reader.
 
 I generally enjoy Hubert's posts but as a daily user of git this one didn't sit right with me. I thought I'd take up the aforementioned exercise.
-<!--more-->I retrieved the GENERIC,v rcs file<sup>1</sup> and created a git repository<sup>2</sup>.
+
+<!--more-->
+I retrieved the GENERIC,v rcs file<sup>1</sup> and created a git repository<sup>2</sup>.
 
 I then ran <a href="http://gist.github.com/303277">a script</a><sup>3</sup>, which committed each revision of the file along with a single line commit message extracted from the rcs log.
 
-The repository then weighed in at 22,352kb<sup>4</sup> with 3,174 files and directories<sup>5</sup>. This is where git-gc comes in. From the man page, "git-gc - Cleanup unnecessary files and optimize the local repository". After running <code>git gc</code><sup>6</sup> the size of the repository was down to 1,068kb, 1.24 times the rcs file. The file and directory count also vastly smaller at 64.
+The repository then weighed in at 22,352kb<sup>4</sup> with 3,174 files and directories<sup>5</sup>. This is where `git-gc` comes in. From the man page, "git-gc - Cleanup unnecessary files and optimize the local repository". After running `git gc`<sup>6</sup> the size of the repository was down to 1,068kb, 1.24 times the size of the rcs file. The file and directory count was also vastly smaller at 64.
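The before-and-after measurement described in this paragraph can be sketched as a small shell helper. This is only an illustration, not the script used for the post: the function names `size_kb`, `count_entries` and `measure_gc` are made up for this example, and it assumes a POSIX shell with `git`, `du`, `find` and `wc` on the `PATH`.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original post).

# Disk usage of a directory in kilobytes, as reported by `du -sk`.
size_kb() {
    du -sk "$1" | cut -f1
}

# Number of files and directories under a path (the path itself included),
# the same count that `find . | wc -l` gives.
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# Print size and entry count of a repository before and after `git gc`.
measure_gc() {
    repo=$1
    echo "before gc: $(size_kb "$repo") KB, $(count_entries "$repo") entries"
    git -C "$repo" gc --quiet
    echo "after gc:  $(size_kb "$repo") KB, $(count_entries "$repo") entries"
}
```

Running `measure_gc .` from the top of a working copy prints both measurements. The exact numbers will vary with the git version and the filesystem, so they should not be expected to match the figures quoted in the post.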
 
-So all in all git fares pretty well. Sure the repository is bigger than CVS and there's a few more files but its not in order Hubert suggests and its a small price to pay for all the benefits git provides.
+So all in all git fares pretty well. Sure, the repository is bigger than CVS and there are a few more files, but it's not on the order Hubert suggests and it's a small price to pay for all the benefits git provides.
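The ratios behind this conclusion can be checked with a little arithmetic on the figures quoted in the post. The `ratio` helper below is a hypothetical name introduced for this sketch. It assumes `awk` is available, and because the 1.24 factor is a rounded figure, the implied size of the rcs file is only approximate.

```shell
#!/bin/sh
# Hypothetical helper: divide the first argument by the second, two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 1068 1.24    # implied size of the rcs file in kb (approximate)
ratio 22352 1068   # how many times smaller the repository is after git gc
```

With the post's numbers this works out to an rcs file of about 861kb and a repository about 21 times smaller after `git gc`, which leaves the packed repository at roughly 1.24 times the rcs file rather than 37 times.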
 
 ________________________
 
-1. From <a href="ftp://ftp.netbsd.org/pub/NetBSD/misc/repositories/cvsroot/src/sys/arch/i386/conf/GENERIC,v">ftp.netbsd.org</a>.
+1\. From <a href="ftp://ftp.netbsd.org/pub/NetBSD/misc/repositories/cvsroot/src/sys/arch/i386/conf/GENERIC,v">ftp.netbsd.org</a>.
 
-2.
+2\.
 <pre>mkdir git
 cd git
 git init
 Initialized empty Git repository in /Users/wmoore/Source/NetBSD i386 GENERIC/git/.git/</pre>
 
-3.
+3\.
 <script src="http://gist.github.com/303277.js?file=populate_git_repo.sh"></script>
 
-4. Repository sizes detemined via:
+4\. Repository sizes determined via:
 <pre>du -sk .</pre>
 
-5. File and directory counts determined via:
+5\. File and directory counts determined via:
 <pre>find . | wc -l</pre>
 
-6.
+6\.
 <pre>git gc
 Counting objects: 2871, done.
 Delta compression using up to 2 threads.