Chapter 5. Git Grandmastery

This pretentiously named page is my dumping ground for uncategorized Git tricks.

Source Releases

For my projects, Git tracks exactly the files I'd like to archive and release to users. To create a tarball of the source code, I run:

$ git archive --format=tar --prefix=proj-1.2.3/ HEAD > proj-1.2.3.tar
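A release script usually compresses the archive as well. Here is a minimal sketch of that, assuming a throwaway repository with a single file; the directory layout, file names and the author identity are made up for illustration, and only the version number 1.2.3 comes from the example above.

```shell
# Throwaway repository standing in for a real project (hypothetical contents).
repo=$(mktemp -d)
out=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
mkdir src
echo 'hello' > src/hello.txt
git add src
git commit -q -m "Initial commit"

# git archive writes the tar stream to standard output, so pipe it on.
git archive --format=tar --prefix=proj-1.2.3/ HEAD | gzip > "$out/proj-1.2.3.tar.gz"

# Every path in the tarball sits under the chosen prefix.
listing=$(tar -tzf "$out/proj-1.2.3.tar.gz")
echo "$listing"
```

The trailing slash in the prefix matters: without it the prefix is glued onto each file name instead of naming a directory.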

Changelog Generation

It's good practice to keep a changelog, and some projects even require it. If you've been committing frequently, which you should, generate a Changelog by typing:

$ git log > ChangeLog
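If the raw log is too noisy for a changelog, git log can format each entry. The following is only a sketch of one possible layout (date, author, then the subject as a bullet); the repository, author and commit messages are invented for the demonstration.

```shell
# Throwaway repository with two commits (names and messages are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
echo 'first draft' > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo 'second draft, longer' > notes.txt
git commit -q -a -m "Revise notes"

# One entry per commit: short date and author, then the subject after a tab.
git log --date=short --pretty=format:'%ad  %an <%ae>%n%n%x09* %s%n' > ChangeLog
cat ChangeLog
```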

Git Over SSH, HTTP

Suppose you have ssh access to your web server, but it does not have Git installed. Then download, compile and install Git in your account.

Create a repository in your web directory:

$ GIT_DIR=proj.git git init

and in the "proj.git" directory, run

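$ git --bare update-server-info
$ cp hooks/post-update.sample hooks/post-update

(On older versions of Git, where the hook ships without the .sample suffix, run chmod a+x hooks/post-update instead.)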

From your computer, push via ssh:

$ git push web.server:/path/to/proj.git master

and others get your project via:

$ git clone http://web.server/proj.git

Commit What Changed

Telling Git when you've added, deleted and renamed files gets tedious. Instead, you can type:

$ git add .
$ git add -u

Git will look at the files in the current directory and work out the details by itself. Instead of the second add command, run git commit -a if you also intend to commit at this time.

You can perform the above in a single pass with:

$ git ls-files -d -m -o -z | xargs -0 git update-index --add --remove

The -z and -0 options prevent ill side-effects from filenames containing strange characters. Note that this command also adds ignored files; to leave them out, pass the -x or -X option to git ls-files.
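To see the single-pass command at work, here is a sketch in a throwaway repository. It assumes you want your usual ignore rules honoured, so it uses the --exclude-standard option of git ls-files rather than -x or -X; all file names are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
echo '*.log' > .gitignore
echo 'old' > old.txt
echo 'v1' > changed.txt
git add .gitignore old.txt changed.txt
git commit -q -m "Initial commit"

# Delete one file, modify another, create a new one and an ignored one.
rm old.txt
echo 'version two' > changed.txt
echo 'new' > new.txt
echo 'noise' > debug.log

# Single pass; --exclude-standard keeps ignored files out of the list.
git ls-files -d -m -o -z --exclude-standard |
  xargs -0 git update-index --add --remove

# Shows the staged addition, modification and deletion.
git status --short
```

The ignored debug.log never reaches the index, while the deletion, the edit and the new file are all staged.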

I Stand Corrected

Did you just commit, but wish you had typed a different message? Realized you forgot to add a file? Then:

$ git commit --amend

can help you out.

Since this changes the history, only do this if you have yet to push your changes, otherwise your tree will diverge from other trees. Of course, if you control all the other trees too, then there is no problem since you can overwrite them.
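As an illustration, the sketch below commits too early, then folds a forgotten file and a better message into the same commit. The repository and file names are hypothetical, and -m is used only so the example runs without opening an editor.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
echo 'int main(void) { return 0; }' > main.c
echo 'all: main' > Makefile
git add main.c
git commit -q -m "Add program"        # oops: the Makefile was forgotten

# Stage the missing file and replace the last commit instead of adding a new one.
git add Makefile
git commit -q --amend -m "Add program and Makefile"

git log --oneline                     # still a single commit
```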

… And Then Some

Let's suppose the previous problem is ten times worse. After a lengthy session you've made a bunch of commits. But you're not quite happy with the way they're organized, and some of those commit messages could use rewording. This is quite likely if you've been saving early and saving often. Then type

$ git rebase -i HEAD~10

and the last 10 commits will appear in your favourite $EDITOR. A sample excerpt:

pick 5c6eb73 Added repo.or.cz link
pick a311a64 Reordered analogies in "Work How You Want"
pick 100834f Added push target to Makefile

Then:

  • Remove commits by deleting lines.
  • Reorder commits by reordering lines.
  • Replace "pick" with "edit" to mark a commit for amending.
  • Replace "pick" with "squash" to merge a commit with the previous one.

If you marked a commit for editing, the rebase stops there: make your changes, run git commit --amend, and then resume with:

$ git rebase --continue

Again, only do this if no one else has a clone of your tree.
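The same edits can be scripted, which makes for a handy way to try things out safely. This sketch assumes a throwaway repository with three made-up commits; it uses the GIT_SEQUENCE_EDITOR environment variable to let sed edit the list in place of your editor, and GIT_EDITOR=true to accept the combined commit message as is.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name.txt"
done

# The list holds "pick ... Add b.txt" followed by "pick ... Add c.txt".
# Change line 2 to "squash" rather than editing the list by hand, and
# accept the combined commit message unchanged (GIT_EDITOR=true).
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~2

git log --oneline    # two commits remain: the root and the squashed pair
```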

Local Changes Last

You're working on an active project. You make some local commits over time, and then you sync with the official tree with a merge. This cycle repeats itself a few times before you're ready to push to the central tree.

But now the history in your local Git clone is a messy jumble of your changes and the official changes. You'd prefer to see all your changes in one contiguous section, and after all the official changes.

This is a job for git rebase as described above. In many cases you can use the --onto flag and avoid interaction.

Also see the manpage for other amazing uses of this command, which really deserves a chapter of its own. You can split commits. You can even rearrange branches of a tree!
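Here is a small sketch of the cleanup described in this section. It assumes two branches with invented names, "official" and "work", and commit messages O1, O2, W1, W2 chosen so the resulting order is easy to read. In a case this simple a plain rebase is enough and --onto is not needed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
git symbolic-ref HEAD refs/heads/official   # name the starting branch

echo 'o1' > official.txt
git add official.txt
git commit -q -m "O1"

git checkout -q -b work                     # local work begins
echo 'w1' > mine.txt
git add mine.txt
git commit -q -m "W1"

git checkout -q official                    # meanwhile, upstream moves on
echo 'o2' >> official.txt
git commit -q -a -m "O2"

git checkout -q work
git merge -q --no-edit official             # sync with a merge
echo 'w2' >> mine.txt
git commit -q -a -m "W2"

# Replay the local commits after all the official ones; the merge disappears.
git rebase -q official
git log --format=%s                         # newest first
```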

My Commit Is Too Big!

Have you neglected to commit for too long? Been coding furiously and forgotten about source control until now? Made a series of unrelated changes, because that's your style?

No worries, use git add -i or git commit --interactive to interactively choose which edits should belong to the next commit. (Beware that git commit -i is short for --include, which does something else.)

Don't Lose Your HEAD

HEAD is like a cursor that normally points at the latest commit, advancing with each new commit. Some Git commands let you move it. For example:

$ git reset HEAD~3

will move the HEAD three commits backwards in time. Thus all Git commands now act as if you hadn't made those last three commits, while your files remain in the present. See the git reset man page for some applications.

But how can you go back to the future? The past commits do not know anything of the future.

If you have the SHA1 of the original HEAD then:

$ git reset SHA1

But suppose you never took it down? Don't worry, for commands like these, Git saves the original HEAD under the name ORIG_HEAD, and you can return safe and sound with:

$ git reset ORIG_HEAD
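The round trip can be rehearsed in a scratch repository. The sketch below assumes four made-up commits and uses -q only to keep the output short.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
for n in 1 2 3 4; do
  echo "line $n" >> file.txt
  git add file.txt
  git commit -q -m "Commit $n"
done
tip=$(git rev-parse HEAD)

# Step three commits back; the working file keeps all four lines.
git reset -q HEAD~3
after_reset=$(git log -1 --format=%s)
echo "$after_reset"

# Return to where we started without having noted the hash.
git reset -q ORIG_HEAD
git log -1 --format=%s
```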

HEAD-hunting

Perhaps ORIG_HEAD isn't enough. Perhaps you've just realized you made a monumental mistake last month and you need to go back to an ancient commit in a long-forgotten branch.

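It's hard to lose Git commits permanently, even after deleting branches. Commits that no branch points to stay in the repository until garbage collection prunes them, so as long as you don't run git gc --prune they can usually be restored. Newer versions of Git collect garbage automatically and expire old log entries after a configurable period, so don't wait too long.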

The trouble is finding the appropriate hash. You could look at all the hash values in .git/objects and use trial and error to find the one you want. But there's a much easier way.

Git records every hash of a commit it computes in .git/logs. The subdirectory refs contains the history of all activity on all branches, while the file HEAD shows every hash value it has ever taken. The latter can be used to find hashes of commits on branches that have been accidentally lopped off.

The reflog command provides a friendly interface to these log files. Try

$ git reflog

and see its manpage for more information.
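For instance, here is a sketch of rescuing a branch that was deleted by mistake. The branch names "topic" and "recovered" and the commit messages are invented; the HEAD@{1} notation refers to the value HEAD had one move ago, as recorded in the log.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice"
git config user.email "alice@example.com"
echo 'a' > a.txt
git add a.txt
git commit -q -m "A"

git checkout -q -b topic
echo 'b' > b.txt
git add b.txt
git commit -q -m "B"

git checkout -q -            # back to the starting branch
git branch -q -D topic       # oops: commit B is now on no branch

git reflog                   # the log of every position HEAD has held
lost=$(git rev-parse 'HEAD@{1}')   # where HEAD was before the last checkout
git branch recovered "$lost"
git log -1 --format=%s recovered
```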

Eventually, you may want to run git gc --prune to recover space. Be aware that doing so prevents you from recovering lost HEADs.

Making History

Want to migrate a project to Git? If it's managed with one of the more well-known systems, then chances are someone has already written a script to export the whole history to Git.

Otherwise, take a look at git fast-import. This command takes text input in a specific format and creates Git history from scratch. Typically a script is cobbled together and run once to feed this command, migrating the project in a single shot.

As an example, paste the following listing into a temporary file, such as /tmp/history:

commit refs/heads/master
committer Alice <alice@example.com> Thu, 01 Jan 1970 00:00:00 +0000
data <<EOT
Initial commit.
EOT

M 100644 inline hello.c
data <<EOT
#include <stdio.h>

int main() {
  printf("Hello, world!\n");
  return 0;
}
EOT


commit refs/heads/master
committer Bob <bob@example.com> Tue, 14 Mar 2000 01:59:26 -0800
data <<EOT
Replace printf() with write().
EOT

M 100644 inline hello.c
data <<EOT
#include <unistd.h>

int main() {
  write(1, "Hello, world!\n", 14);
  return 0;
}
EOT

Then create a Git repository from this temporary file by typing:

$ mkdir project; cd project; git init
$ git fast-import --date-format=rfc2822 < /tmp/history

You can check out the latest version of the project with:

$ git checkout master .
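A migration script normally prints such a stream itself rather than reading it from a file. The following sketch assumes fast-import's default raw date format (seconds since the epoch plus a timezone offset) and uses exact byte counts instead of the <<EOT delimiters. The emit_data helper and the file contents are hypothetical.

```shell
# Emit a fast-import "data" block with an exact byte count.
# (${#1} counts characters, so this sketch assumes ASCII content.)
emit_data() {
  printf 'data %d\n%s\n' "${#1}" "$1"
}

repo=$(mktemp -d)
cd "$repo"
git init -q

{
  printf 'commit refs/heads/master\n'
  printf 'committer Alice <alice@example.com> 0 +0000\n'
  emit_data 'Initial commit.'
  printf 'M 100644 inline hello.txt\n'
  emit_data 'Hello, world!'
} | git fast-import --quiet

git log --format='%an: %s' master
git show master:hello.txt
```

Byte counts avoid any trouble with content that happens to contain the delimiter line, at the cost of having to measure each block first.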

Building On Git

In true UNIX fashion, Git's design allows it to be easily used as a low-level component of other programs. There are graphical interfaces, web interfaces, alternative command-line interfaces, and perhaps soon you will have a script or two of your own that calls Git.

One easy trick is to use built-in git aliases to shorten your most frequently used commands:

$ git config --global alias.co checkout
$ git config --global --get-regexp alias  # display current aliases
alias.co checkout
$ git co foo                              # same as 'git checkout foo'

Another is to print the current branch in the prompt, or window title. Invoking

$ git symbolic-ref HEAD

shows the current branch name. In practice, you most likely want to remove the "refs/heads/" and ignore errors:

$ git symbolic-ref HEAD 2> /dev/null | cut -b 12-
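Wrapped in a function, the pipeline above can feed a prompt. This is a sketch for bash; the function name git_branch and the prompt layout are my own choices, and the branch name "demo" exists only for the quick check at the end.

```shell
# Print the current branch name, or nothing outside a repository
# or when HEAD is detached.
git_branch() {
  git symbolic-ref HEAD 2> /dev/null | cut -b 12-
}

# For interactive shells: single quotes defer the call until the prompt is drawn.
PS1='$(git_branch) \$ '

# Quick check in a throwaway repository on a branch named "demo".
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/demo
git_branch
```

The number 12 works because "refs/heads/" is eleven characters long, so the branch name starts at the twelfth byte.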

See the Git homepage for more examples.

Daring Stunts

Recent versions of Git make it difficult for the user to accidentally destroy data. This is perhaps the most compelling reason to upgrade.

Nonetheless, there are times you truly want to destroy data. We show how to override the safeguards for common commands. Only use them if you know what you are doing.

Checkout: If you have uncommitted changes that a checkout would overwrite, a plain checkout fails. To destroy your changes, and check out a given commit anyway, use the force flag:

$ git checkout -f COMMIT

On the other hand, if you specify particular paths for checkout, then there are no safety checks. The supplied paths are quietly overwritten. Take care if you use checkout in this manner.

Reset: A plain reset leaves uncommitted changes in your files alone. To discard them, and make your files match the commit as well, run:

$ git reset --hard [COMMIT]

Branch: Deleting branches fails if this causes changes to be lost. To force a deletion, type:

$ git branch -D BRANCH  # instead of -d

Similarly, attempting to overwrite a branch via a move fails if data loss would ensue. To force a branch move, type:

$ git branch -M [SOURCE] TARGET  # instead of -m

Unlike checkout and reset, the destruction is deferred. The changes are still stored in the .git subdirectory, and can be retrieved by recovering the appropriate hash from .git/logs (see "HEAD-hunting" above). The data is only deleted the next time garbage is collected.