
Git is an OpenSource DVCS (although it originally started as just a storage backend). It was initially written by LinusTorvalds for the needs of the LinuxKernel developers, but thanks to a simple yet clever internal design it can be tailored to nearly any development workflow. It is itself kept in a Git repository. The project was born of necessity when BitKeeper's licence was changed such that it was no longer an acceptable home for the LinuxKernel, and all other VersionControlSystems were found inadequate.

Torvalds seemed aware that his decision to drop BitKeeper would also be controversial. When asked why he called the new software "git", British slang meaning "a rotten person", he said: "I'm an egotistical bastard, so I name all my projects after myself. First Linux, now git."

From PC World: After controversy, Torvalds begins work on "git"

Design

Git's primary design objective is to keep the computational effort of committing a patch proportional to the size of the patch rather than the size of the repository, as would be the case with most VersionControlSystems. This is achieved by keeping every version of every object (i.e. a file or a directory), compressed using ZLib and identified by its SHA1 sum. In contrast to traditional VersionControlSystems, this makes building a history for a single object computationally expensive, while making it very cheap to get a view of any particular revision of the repository as a whole.
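
As a rough illustration of this storage scheme, the following Python sketch computes the identifier Git would assign to a file's contents and the compressed bytes it would store. It assumes the object header used by current Git (the word blob, the content length and a NUL byte, hashed together with the content), which is not spelled out on this page; the function name blob_object is made up for the example.

```python
import hashlib
import zlib
from typing import Tuple


def blob_object(content: bytes) -> Tuple[str, bytes]:
    """Return (object id, compressed bytes) for file content stored as a blob."""
    # Assumed header layout: object type, content size in decimal, NUL byte.
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    # The SHA1 of header plus content names the object.
    object_id = hashlib.sha1(store).hexdigest()
    # The same bytes, ZLib-compressed, are what ends up on disk.
    return object_id, zlib.compress(store)


oid, packed = blob_object(b"hello\n")
print(oid)
# A loose object is filed under its own name: two hex digits, then the rest.
print(".git/objects/" + oid[:2] + "/" + oid[2:])
```

Because the name depends only on the content, committing touches only the objects that actually changed, and two identical files anywhere in the history share a single stored object.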

The main Git ManPage describes it as “the stupid content tracker”. What this means is that the repository format is designed to store only file and directory contents (known as a tree), not information about how trees arrived at their state. For example, deleting a file and then untracking it in Git is identical to using git rm, an operation that is special in many other VCSs. While this means that algorithms like rename tracking and merge conflict resolution must initially get by with less information, it also means that as they become more intelligent, all of the history benefits, not just history recorded recently enough to have the necessary metadata. It also means that all workflows are created equal.
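
To make the idea concrete, here is a minimal Python sketch of how a rename can be worked out after the fact from two snapshots alone. It only pairs paths whose content identifiers match exactly; Git's real rename detection is more elaborate and also considers similar content. The function name exact_renames and the short identifiers are invented for the example.

```python
from typing import Dict, List, Tuple


def exact_renames(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair paths that vanished with paths that appeared carrying the same content id."""
    removed = {path: oid for path, oid in old.items() if path not in new}
    added = {path: oid for path, oid in new.items() if path not in old}
    renames = []
    for old_path, oid in sorted(removed.items()):
        # sorted() returns a list, so deleting from `added` below is safe.
        for new_path, new_oid in sorted(added.items()):
            if new_oid == oid:
                renames.append((old_path, new_path))
                del added[new_path]  # each added path is matched at most once
                break
    return renames


# Two snapshots, each mapping a path to a (shortened, made-up) content id.
before = {"README": "aaa", "src/main.c": "bbb"}
after = {"README": "aaa", "src/app.c": "bbb", "NEWS": "ccc"}
print(exact_renames(before, after))
```

Nothing in either snapshot says "main.c was renamed"; the conclusion is derived purely from content, so a better algorithm can be applied to old history just as well as to new.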

Git has two main ways of transferring data. Its native application protocol is implemented by the git fetch-pack and git upload-pack commands, but Git can also crawl the files in the remote repository's object store without reliance on server-side intelligence. There are a number of transport protocols:

  • The most efficient transport protocol is called git://, which is merely a thin TCP wrapper around the application protocol, commonly using Port 9418. However, it has no built-in access control, so servers usually do not allow pushing over it, although it can be enabled.
  • The default transport protocol is ssh://. This tunnels the native protocol over SSH. Pushing is generally possible.
  • On servers without Git installed, you should probably use rsync(1). Again, pushing is usually possible.
  • Lastly, the slowest option is to simply publish the repository via HTTP. If the WebServer supports WebDav, it can even accept pushes.
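
To give a feel for how thin the git:// wrapper is, the Python sketch below builds the first packet a client would send to start a fetch. It assumes the length-prefixed "pkt-line" framing and the request layout described in Git's protocol documentation, neither of which is covered on this page; the function names, the repository path and the host name are made up, and no network connection is opened.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame a payload: four hex digits giving the total length, then the payload."""
    length = len(payload) + 4  # the prefix counts its own four bytes
    if length > 0xFFFF:
        raise ValueError("length does not fit in four hex digits")
    return b"%04x" % length + payload


def upload_pack_request(path: str, host: str) -> bytes:
    """Build the opening request of a fetch over git:// (assumed layout)."""
    body = b"git-upload-pack " + path.encode() + b"\0host=" + host.encode() + b"\0"
    return pkt_line(body)


# A client would write these bytes to a TCP connection on Port 9418.
print(upload_pack_request("/project.git", "example.org"))
```

The ssh:// transport carries the same conversation, except that the remote command is started through SSH instead of a bare TCP connection.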

Note that when using a dumb transport, you must ensure that git update-server-info is run in the repository any time someone commits to it (e.g. by setting up a hook). This maintains a static file holding an index of the state of the repository, the same information that smart transports negotiate dynamically between git fetch-pack and git upload-pack.
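
The following Python sketch shows roughly what such a static index looks like. It assumes the layout of the info/refs file as written by current Git (one line per ref, object id and ref name separated by a tab) and leaves out details such as peeled tags and the separate list of packs; the function name and the object id are made up.

```python
from typing import Dict


def render_info_refs(refs: Dict[str, str]) -> str:
    """Render a mapping of ref name to object id, one tab-separated line per ref."""
    return "".join(f"{oid}\t{name}\n" for name, oid in sorted(refs.items()))


# Hypothetical repository state with a single branch.
state = {"refs/heads/master": "ce013625030ba8dba906f756967f9e9ca394464a"}
print(render_info_refs(state), end="")
```

A dumb client simply downloads this file to learn which objects to crawl for, which is why it goes stale if nobody regenerates it after new commits arrive.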

Notes

You may encounter the following error:

*** Environment problem:
*** Your name cannot be determined from your system services (gecos).
*** You would need to set GIT_AUTHOR_NAME and GIT_COMMITTER_NAME
*** environment variables; otherwise you won't be able to perform
*** certain operations because of "empty ident" errors.
*** Alternatively, you can use user.name configuration variable.

fatal: empty ident  <user@machine.localdomain> not allowed

This means your RealName is not properly set up in the system user accounts list (/etc/passwd, specifically the gecos field for your account), which Git uses by default. You can either fix that or configure Git itself:

git config --global user.name "Joe Random Hacker"
git config --global user.email jrh@example.org

These commands set your default name and email address (saved in ~/.gitconfig). Alternatively, you can set them for one particular repository only (stored in projectroot/.git/config) by omitting the --global switch.
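
The lookup order implied above can be sketched in Python as follows. This is a simplified model rather than Git's actual code: it assumes the environment variable wins over the per-repository setting, which wins over the global one, with the gecos field as the last resort. All function and parameter names are invented for the example.

```python
from typing import Mapping, Optional


def name_from_gecos(gecos: str) -> str:
    """Take the part of a gecos field before the first comma as the real name."""
    return gecos.split(",", 1)[0].strip()


def resolve_author_name(
    env: Mapping[str, str],
    repo_config: Mapping[str, str],
    global_config: Mapping[str, str],
    gecos: str,
) -> Optional[str]:
    """Return the first non-empty name found, or None for the "empty ident" case."""
    candidates = [
        env.get("GIT_AUTHOR_NAME"),        # environment variable
        repo_config.get("user.name"),      # projectroot/.git/config
        global_config.get("user.name"),    # ~/.gitconfig
        name_from_gecos(gecos),            # /etc/passwd
    ]
    for candidate in candidates:
        if candidate:
            return candidate
    return None


# No source supplies a name: this is the situation behind the error message.
print(resolve_author_name({}, {}, {}, ""))
# A global setting fills the gap.
print(resolve_author_name({}, {}, {"user.name": "Joe Random Hacker"}, ""))
```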

Installing Git documentation when compiling from source

You need to build it separately:

make doc
make install-doc

Note that building the documentation depends on quite a few extra tools, so you may need to install extra Packages on your machine. In particular, asciidoc must be at least version 7 unless you want to hack the MakeFiles. If you are running DebianLinux Sarge, you will need to take asciidoc from testing. See AptNotes for more details on how to do this.

Alternatively the ManPages can be found online at http://www.kernel.org/pub/software/scm/git/docs/

To be able to use Git, you need to bootstrap your installation using the TarBall at http://www.codemonkey.org.uk/projects/git-snapshots/git/.

See also


CategoryVersionControl