Working with Git (adapted from the Dalton&Dirac wiki)

Git is a program that helps multiple developers work together on a program simultaneously. Every revision to the source code is retained, so if something goes wrong it is always possible to go back to a previous state.

Main differences that surprise Subversion users

You can probably skip this section if you have never worked with Subversion or CVS.

* In Subversion you check out a working copy, typically of trunk and typically only the latest revision (HEAD). With Git you clone the full repository: all branches and the entire history. The clone remembers where it came from: the origin, which @git pull@ and @git push@ refer to by default.
* Subversion identifies revisions with monotonically growing decimal numbers, which are typically small. That is impractical in a distributed system like Git. Git identifies revisions with SHA-1 IDs, 160-bit numbers written in hexadecimal. This may look scary at first, but in practice it is not a big hurdle: you can refer to the latest revision as HEAD, to its parent as HEAD^, and to the parent of that as HEAD^^ or HEAD~2. Copy and paste helps a lot, and you only need to write the leading digits of an ID as long as they are unique; Git resolves the rest.
* Git refuses to pull (update), merge, or switch branches when doing so would overwrite uncommitted modifications. When you get used to the Git way, this restriction starts to make a lot of sense. If you want to pull but do not want to commit, you can @git stash@ your modifications and "unstash" them (@git stash pop@) after you have updated.
* @git commit@ and @git push@ are not the same thing. You commit locally (@git commit@) and then, if you wish, publish your changes to origin (@git push@). You can @git commit@ several times before you @git push@. An @svn commit@ corresponds to a @git commit@ immediately followed by a @git push@.
* Git tracks content, not files or directories. So you cannot @git add@ and commit an empty directory.
* @git checkout@ is at first sight rather different from @svn checkout@. You can use @git checkout@ to switch to another branch or to revert modifications, but not to clone the repository. However, the more you use Git the more @git checkout@ will resemble @svn checkout@.
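The revision names mentioned above can be tried out safely in a throwaway repository. The following is a minimal sketch, not part of the original wiki: the scratch directory, the file name file.txt, and the example identity are assumptions made for illustration only.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example User"       # hypothetical identity, local to this repo
git config user.email "user@example.invalid"

# Three commits, so that HEAD has a parent and a grandparent.
for n in 1 2 3; do
    echo "revision $n" > file.txt
    git add file.txt
    git commit -q -m "revision $n"
done

full=$(git rev-parse HEAD)              # full hexadecimal ID of the latest revision
short=$(git rev-parse --short HEAD)     # unique leading digits of the same ID
grandparent_a=$(git rev-parse 'HEAD^^') # parent of the parent
grandparent_b=$(git rev-parse 'HEAD~2') # same revision, other spelling
resolved=$(git rev-parse "$short")      # Git expands the abbreviation again

echo "HEAD   = $full"
echo "short  = $short"
echo "HEAD^^ = $grandparent_a"
echo "HEAD~2 = $grandparent_b"
```

Both spellings of the grandparent print the same ID, and the abbreviated ID resolves back to the full one.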

Before you start working with git

The following settings are highly recommended for the optimum Git experience.

Colorize your life!

$ git config --global color.branch auto
$ git config --global color.diff   auto
$ git config --global color.status auto

Identify yourself - set your name and your e-mail:

$ git config --global user.name "Slim Shady"
$ git config --global user.email slim.shady@ctcc.no

Set the default mode for @git push@ (current, matching or tracking; newer Git versions call the last one upstream and use simple as the default):

$ git config --global push.default current

Disable fast-forward merges on master (read for instance http://robey.lag.net/2008/07/13/git-for-the-real-world.html):

$ git config branch.master.mergeoptions "--no-ff"

Basic work

You can live a fulfilled life with the following few Git commands.

Clone the repository (check out a working copy in Subversion speak):

$ git clone git@repo.ctcc.no:project_name.git

Update your repository with changes from origin (svn update):

$ git pull

Browse the history:

$ git log --topo-order --decorate [--oneline --graph]

Add a file (_path_):

$ git add path

Move or rename:

$ git mv old_path new_path

Remove:

$ git rm path

See which files are modified since last commit:

$ git status

Browse the history and see which files have been modified:

$ git log --stat

See your modifications:

$ git diff

Commit all uncommitted modifications:

$ git commit -a

Commit a specific file:

$ git commit path

Publish your changes to upstream (svn commit):

$ git push

Updating the code

Update the current branch (use @git branch@ to figure out where you currently are):

$ git pull

If you have modified code that has also been modified upstream, Git will report a conflict that you need to resolve (see below).

Note that @git pull@ is not the same thing as @svn update@. The reason is that your local master and the remote (repo.ctcc.no) origin/master are two different branches. While you have been committing changes to your local master, someone else may have committed other changes to origin/master. When you run @git pull@, Git fetches the changes from origin and merges them into your master. You will then see a commit "Merge branch 'master' of repo.ctcc.no:project_name" which you did not make yourself: Git made it for you. There is nothing wrong with this, and for large commits it is the recommended way. It also means that if you commit many changes that are likely to conflict with the work of others, you should @git pull@ often.

For very small changes you may not want to clutter the history with such merge commits. In this case you can instead do:

$ git pull --rebase

This will replay your changes on top of the changes made on origin/master. It will modify your commit(s) accordingly so that they look as if you have committed them after the changes made on origin/master.
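The effect of @git pull --rebase@ can be reproduced without touching a real server. This is a sketch under stated assumptions: a local bare repository stands in for origin, and the clones alice and bob, their file names, and their identities are hypothetical.

```shell
set -eu
work=$(mktemp -d)     # scratch directory; delete it when you are done
cd "$work"

# A bare repository plays the role of origin; its HEAD points at master.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# First working copy ("alice") publishes the initial commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"; git config user.email "alice@example.invalid"
echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
git remote add origin ../origin.git
git push -q -u origin master
cd ..

# Second working copy ("bob") commits locally but does not push yet.
git clone -q origin.git bob
cd bob
git config user.name "Bob"; git config user.email "bob@example.invalid"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "add bob.txt"
cd ..

# Meanwhile alice publishes another commit to origin/master.
cd alice
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "add alice.txt"
git push -q
cd ..

# bob replays his commit on top of origin/master instead of merging.
cd bob
git pull -q --rebase
merges=$(git rev-list --count --merges HEAD)  # number of merge commits in the history
top=$(git log -1 --format=%s)                 # subject of the newest commit
git log --oneline
```

The history in bob stays linear: his commit ends up on top of the commit from alice, and no "Merge branch 'master' ..." commit is created.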

Committing changes

You can commit one file or several files:

$ git commit path1 path2

Or you can commit all modifications that you see with @git status@:

$ git commit -a

and write a useful commit message (see below) with your favorite editor (set by the environment variable @$EDITOR@). Note that @git commit@ is only local. You could remove this commit afterwards (see below) and nobody would notice that this commit ever existed. To publish your commit(s) to origin so that others see them, you need to:

$ git push

To see all your unpushed commits, type:

$ git log origin/master..HEAD

With Git, you can @git commit@ many many times before doing a @git push@. You can do whatever you like before you @git push@. If you plan to @git push@ nontrivial changes to origin/master, you should run the test set first.
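To see how local commits pile up until they are pushed, one can count them with @git rev-list@. The sketch below assumes a local bare repository as a stand-in for origin; the directory names, notes.txt, and the identity are made up for the example.

```shell
set -eu
work=$(mktemp -d)     # scratch directory; delete it when you are done
cd "$work"
git init -q --bare origin.git          # stand-in for the central repository
git init -q clone
cd clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"
git remote add origin ../origin.git

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first commit"
git push -q -u origin master           # origin/master now matches master

# Two further commits that exist only locally.
echo "two" >> notes.txt
git commit -q -a -m "second commit"
echo "three" >> notes.txt
git commit -q -a -m "third commit"

unpushed_before=$(git rev-list --count origin/master..HEAD)
git log --oneline origin/master..HEAD  # lists the two local commits

git push -q
unpushed_after=$(git rev-list --count origin/master..HEAD)
echo "unpushed before push: $unpushed_before, after push: $unpushed_after"
```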

Writing useful commit messages

Useful commit messages serve as code development documentation. Make it easy for people to search in the future. Example of a bad commit message:

rbast:

fixed bug

This is bad because

* It is not clear what was fixed and why.
* Using initials or names in commit messages is not needed; Git knows who authored a commit.
* Many browsers show only the first line in overviews, in this case "Radovan Bast: rbast:".

A better message looks like this:

one line summarizing the commit, no longer than 50 characters
[one blank line]
paragraph with more detailed description of the changeset, why and how
...

Commit changes in coherent units (do not put unrelated changes into a single commit). If you commit a good idea and a bad idea in one step, it is cumbersome to roll back the bad idea without also undoing the good one. It is also easier to collect patches for releases and updates from coherent commits. If a commit fixes a bug present in an earlier release, begin your commit message with *bugfix:*. If your commit message contains "in addition ...", "at the same time ...", or similar, then you are probably committing several changes in one commit, which is considered bad practice.
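As an illustration of the summary-line convention, a message can be passed on the command line with two -m options: the first becomes the summary, the second the detailed paragraph. The file solver.F90 and the bug described here are invented for the example.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

echo "! placeholder source" > solver.F90   # hypothetical file
git add solver.F90

# First -m: one-line summary. Second -m: detailed description.
git commit -q \
    -m "bugfix: avoid division by zero in solver" \
    -m "The step size could become zero; guard the division before using it."

subject=$(git log -1 --format=%s)   # the summary line only
body=$(git log -1 --format=%b)      # the detailed paragraph
echo "summary (${#subject} characters): $subject"
git log --oneline                   # overviews show just the summary
```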

Ignoring untracked files

Type

$ git status

Do you see "# Untracked files:"? These are either files which you forgot to git add, or files which should not be under version control. If the latter is the case, you need to tell Git to ignore them. For this simply edit the file .gitignore which contains patterns for files and directories that are to be ignored. It is important to keep your @git status@ clean of untracked files.
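A small sketch of how a .gitignore file cleans up the status output. The patterns and file names (build/, *.o, *.mod, notes.txt) are assumptions chosen for illustration, not the project's actual ignore list.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

# Hypothetical build residue plus one file that was simply forgotten.
mkdir build
touch build/prog.o module.mod notes.txt

before=$(git status --porcelain)       # everything shows up as untracked ("??")

# Ignore object files, module files and the whole build directory.
printf '%s\n' '*.o' '*.mod' 'build/' > .gitignore
git add .gitignore
git commit -q -m "ignore build products"

after=$(git status --porcelain)        # only the forgotten file is left
echo "$after"
```

After this, @git status@ lists only notes.txt, which is the kind of file you probably forgot to @git add@.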

Editing commits

Never modify commits that have been published (pushed) or pulled by somebody else! In other words, +never edit commits that are visible to other people+ and never change history that is visible to other people. On the other hand, it is perfectly fine to edit your local commits before making them available to your coworkers. In fact you should edit local commits on master if they break compilation: commits on master should never break compilation, otherwise @git bisect@ stops being useful. But as soon as you have shared commits with others, do not edit them. Such edits would change the history and create horrible conflicts.

Modify the last commit message:

$ git commit --amend

Add something (_path_) to previous commit:

$ git commit --amend path

Squash the last 20 commits into 1 (this will start an interactive rebase on the last 20 commits, then use "squash" or "fixup" instead of "pick"):

$ git rebase -i HEAD~20

Undoing changes

Undo uncommitted modifications (restore a file from the last revision, equivalent to svn revert path):

$ git checkout path

Undo all uncommitted modifications (*use with care*):

$ git checkout .

Commit a change that will revert a previous commit:

$ git revert hash

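Throw away the last commit but keep its modifications as uncommitted changes in the working tree (*use with care*):

$ git reset HEAD^

Throw away all commits after revision _hash_, together with all uncommitted modifications (*use with care*):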

$ git reset --hard hash
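The two forms of @git reset@ behave differently with respect to the working tree, which is easy to verify in a scratch repository. This is a minimal sketch; code.txt and the commit messages are hypothetical.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

echo "good" > code.txt
git add code.txt
git commit -q -m "good commit"
keep=$(git rev-parse HEAD)             # the revision we want to return to

echo "experiment" >> code.txt
git commit -q -a -m "questionable commit"

# Plain reset: the commit disappears from the history, but its
# modifications stay in the working tree as uncommitted changes.
git reset -q 'HEAD^'
after_plain=$(git status --porcelain)

# Hard reset: history and working tree both return to $keep.
git commit -q -a -m "questionable commit, again"
git reset -q --hard "$keep"
after_hard=$(git status --porcelain)

echo "after plain reset: '$after_plain'"
echo "after hard reset:  '$after_hard'"
```

After the plain reset, code.txt is still reported as modified; after the hard reset the status is clean and the experiment is gone.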

Working with branches

It is good Git practice to create a branch for every task, feature, or bugfix that you work on. Branches are cheap and merges are easy, so do not fear branches. Branches can be either local branches or tracking branches. Tracking branches track remote branches which reside somewhere else, typically on the central repository.

Creating a branch

First list all branches (the one with the star is the one you are on):

$ git branch

List all remote branches:

$ git branch -r

To branch off a new branch _foo_ and switch to it, type:

$ git branch foo
$ git checkout foo

Or create a branch and switch to it in one step:

$ git checkout -b foo

Switching to another branch

List all branches (the one with the star is the one you are on):

$ git branch

Switch to branch _foo_:

$ git checkout foo

Switch back to master:

$ git checkout master

You cannot switch branches if the switch would overwrite uncommitted modifications. If you really do not want to commit them, you can @git stash@ your modifications and "unstash" them (@git stash pop@) after you have switched.
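The stash workflow can be rehearsed as follows. In this sketch the file file.txt differs between master and foo, so Git refuses the switch while the file has an uncommitted edit; all names are hypothetical, and the edits are placed on different lines so that the stash applies cleanly on the other branch.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

printf '%s\n' a b c d e > file.txt
git add file.txt
git commit -q -m "initial commit"

# Branch foo changes the first line of the file.
git checkout -q -b foo
printf '%s\n' A b c d e > file.txt
git commit -q -a -m "foo: change first line"
git checkout -q master

# Uncommitted edit on master, touching the last line.
printf '%s\n' a b c d E > file.txt

# The switch would overwrite the edit, so Git refuses it.
if git checkout -q foo 2>/dev/null; then refused=no; else refused=yes; fi

git stash >/dev/null        # put the edit aside
git checkout -q foo         # now the switch succeeds
git stash pop >/dev/null    # bring the edit back on top of foo

first=$(sed -n 1p file.txt) # line changed on foo
last=$(sed -n 5p file.txt)  # line changed by the stashed edit
echo "refused=$refused first=$first last=$last"
```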

Deleting a branch

You want to delete branch _foo_:

$ git branch -d foo

Creating remote branches so that other people can use them

Your local branches are not automatically synchronized to the remotes you write to; you have to explicitly push the branches you want to share. That way, you can use private branches for work you do not want to share, and push only the topic branches you want to collaborate on. As an example, let us create a remote branch called fde-integration-modularization. First we create a local branch from master:

$ git checkout -b fde-integration-modularization

Now we need to push the branch to origin to make it available to others:

$ git push --set-upstream origin fde-integration-modularization

We can check whether the branch has been configured for @git pull@ and @git push@, so that we do not need to specify the repository and branch name each time:

$ git remote show origin

Now we can work on the branch and push and pull; collaborators who track this branch will see the changes.

Checking out and tracking a remote branch

Create a local branch foo that tracks the remote branch origin/foo and switch to it:

$ git checkout -b foo origin/foo

Deleting a remote branch

This will delete the remote branch foo:

$ git push origin :foo

Think of this command as pushing empty:foo, that is, pushing an empty branch to foo.
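Publishing and deleting a remote branch can be tested against a local bare repository before doing it on the real origin. The sketch below makes that assumption; the directory names and the identity are hypothetical. Note that @git push origin --delete foo@ is an equivalent, more readable spelling of the colon form.

```shell
set -eu
work=$(mktemp -d)     # scratch directory; delete it when you are done
cd "$work"
git init -q --bare origin.git          # stand-in for the central repository
git init -q clone
cd clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"
git remote add origin ../origin.git

echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"
git push -q -u origin master

# Publish a topic branch and record origin/foo as its upstream.
git checkout -q -b foo
git push -q --set-upstream origin foo
upstream=$(git rev-parse --abbrev-ref 'foo@{upstream}')
published=$(git ls-remote --heads origin foo | wc -l | tr -d ' ')

# Delete the branch on origin; the local branch foo is left alone.
git push -q origin :foo
remaining=$(git ls-remote --heads origin foo | wc -l | tr -d ' ')

echo "upstream=$upstream published=$published remaining=$remaining"
```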

Good and bad practices

* Branch often.
* You may read about @git rebase@ as an alternative to @git merge@. @git rebase@ is a powerful and useful tool, but do not rebase code that is visible to others (see the Simpsons episode where Homer travels in time and changes history).
* Do not develop directly on the master branch. On master, only merge local branches that are already finished and tested. Have a look also [[Working_with_git#Updating-the-code|here]].
* Create one branch for each task. Do not solve several tasks in one branch; if you do, you are asking for future conflicts.
* If you see something in the code that is wrong and has nothing to do with the task of the branch, it is a bad idea to fix it on the branch. Someone else may fix it the next day on master and you will get conflicts. The right way is to fix it on master and merge the change into the branch.
* Do not create two branches which work on the same functionality or code. Rather communicate and work on one topic branch together to avoid conflicts. Topic branches should not need to merge between each other.
* Stay in sync: it can be useful to @git commit@ often and wait with the @git push@. In this case do not forget to @git pull@ from time to time to stay in sync with others. If you work on code that other people work on and you never @git pull@, you will get conflicts.
* It is a bad idea to modify core code (integral list in abacus, keyword lists) or code with large overlap with others on a long-lived branch. One day you will want to merge the long-lived branch and you will have many conflicts. Code changes that affect others or that touch core code should be merged back very quickly to avoid future conflicts.

Merging branches

It is highly recommended that you merge with fast-forward merging disabled (read for instance http://robey.lag.net/2008/07/13/git-for-the-real-world.html). You can disable it per branch using @git config@:

$ git config branch.master.mergeoptions "--no-ff"

Or you can pass the --no-ff flag to each merge.

Merging local branches

Merge branch _foo_ to _master_:

$ git checkout master
$ git merge --no-ff foo

Merge _master_ to _foo_:

$ git checkout foo
$ git merge --no-ff master

That's all. You may see conflicts which you need to resolve (see below).
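What --no-ff changes can be seen by counting the parents of the resulting commit. In this sketch the branches foo and bar and their files are hypothetical; the first merge is forced to create a merge commit, the second one is allowed to fast-forward for comparison.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# Topic branch foo with one commit; master does not move meanwhile.
git checkout -q -b foo
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git checkout -q master

# --no-ff records a merge commit even though a fast-forward is possible.
git merge -q --no-ff -m "Merge branch 'foo'" foo
noff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

# For comparison: a second topic branch merged with fast-forward allowed.
git checkout -q -b bar
echo "more" > more.txt
git add more.txt
git commit -q -m "add more"
git checkout -q master
git merge -q --ff bar
ff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

git log --oneline --graph
echo "parents with --no-ff: $noff_parents, with fast-forward: $ff_parents"
```

With --no-ff the history keeps a visible record that foo existed as a separate line of work; after the fast-forward, the commit from bar is indistinguishable from one made directly on master.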

Merging a remote branch

$ git merge --no-ff origin/foo

Example: merging branch cpp

Let us consider an explicit example: merging branch cpp. cpp is a remote branch on origin. First we check it out:

$ git checkout -b cpp origin/cpp

Since we want to merge into master, we switch to master:

$ git checkout master

And verify that this is the case:

$ git branch

And it is the case:

  cpp
* master

Now we merge:

$ git merge --no-ff cpp
warning: too many files (created: 696 deleted: 795), skipping inexact rename detection
Auto-merging prp/pamrsp.F
CONFLICT (content): Merge conflict in prp/pamrsp.F
Auto-merging prp/pamrvc.F
CONFLICT (content): Merge conflict in prp/pamrvc.F
Auto-merging prp/pamxlr.F
Auto-merging prp/pamxqr.F
Automatic merge failed; fix conflicts and then commit the result.

Ok - there are two conflicting files:

$ git status
# On branch master
# Changes to be committed:
#
#       modified:   include/dcbxlr.h
#       modified:   include/dcbxrs.h
#       modified:   prp/pamxlr.F
#       modified:   prp/pamxqr.F
#       new file:   test/response_complex/LiH.mol
#       new file:   test/response_complex/LiH_mini.mol
#       new file:   test/response_complex/cpp.inp
#
# Unmerged paths:
#   (use "git add/rm ..." as appropriate to mark resolution)
#
#       both modified:      prp/pamrsp.F
#       both modified:      prp/pamrvc.F
#

Now we could resolve the conflicts with our favorite editor and mark the resolution with @git add@. Instead we will use @git mergetool@ together with meld because it is so convenient. @git mergetool@ asks which tool to use; here the answer was meld.

$ git mergetool prp/pamrsp.F prp/pamrvc.F
merge tool candidates: meld opendiff kdiff3 tkdiff xxdiff tortoisemerge gvimdiff diffuse ecmerge p4merge araxis emerge vimdiff

Normal merge conflict for 'prp/pamrsp.F':
  {local}: modified
  {remote}: modified
Hit return to start merge resolution tool (meld):

Now search for red blocks and decide whether you want to keep the local or remote version. After this is done we verify that the conflicts are resolved:

$ git status
# On branch master
# Changes to be committed:
#
#       modified:   include/dcbxlr.h
#       modified:   include/dcbxrs.h
#       modified:   prp/pamrsp.F
#       modified:   prp/pamrvc.F
#       modified:   prp/pamxlr.F
#       modified:   prp/pamxqr.F
#       new file:   test/response_complex/LiH.mol
#       new file:   test/response_complex/LiH_mini.mol
#       new file:   test/response_complex/cpp.inp
#
# Untracked files:
#   (use "git add ..." to include in what will be committed)
#
#       prp/pamrsp.F.orig
#       prp/pamrvc.F.orig

Now we can commit:

$ git commit -a

With the following commit message:

Merge branch 'cpp' into master-test-merge

Conflicts:
        prp/pamrsp.F
        prp/pamrvc.F

Before we push the changes upstream we have to run the testset. In case we have introduced errors with the merge we can locally commit corrections. Finally, if the testset passes, we push the changes and remove the deployed branches (both local and remote):

$ git push
$ git branch -d cpp
$ git push origin :cpp

Resolving conflicts

Conflicts happen when you have edited a part of the code which has also been edited upstream by someone else (or by you) or if two branches have overlapping edits. Because Git cannot know which edit is the correct one it will ask you to resolve the conflicts. Conflicts can only happen when merging code. But since @git pull@ contains a @git merge@, also a @git pull@ can lead to a conflict state. You can resolve conflicts (this means tell Git which version is the correct one) in at least two ways:

Manually

1. Edit the file, find the conflict markers (search for "<<<<<"), and resolve the conflict (decide which version to keep and remove the conflict markers).
2. Mark the resolution with @git add@.
3. When all conflicts are resolved (verify with @git status@), @git commit@ the merge (it is good to keep the list of conflicting files in the commit message created by Git).

Using git mergetool

@git mergetool@ is a wonderful utility to resolve merge conflicts. It is typically run after git merge and works very nicely with external diff programs like meld or diffuse. If one or more file parameters are given, the merge tool program will be run to resolve differences on each file. If no file names are specified, @git mergetool@ will run the merge tool program on every file with merge conflicts.

Keep "ours" or "theirs"?

Sometimes you know that you want to keep the "ours" or "theirs" version of a conflicting file (_path_): "ours" is the version on the branch you are on, as it was before the merge; "theirs" is the version that is coming from the other branch.

Resolve conflict by keeping "ours" (my version takes precedence):

$ git checkout --ours path
$ git add path

Resolve conflict by keeping "theirs" (the incoming version takes precedence):

$ git checkout --theirs path
$ git add path
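A conflict can be provoked on purpose to practice this. The sketch below, with a hypothetical file.txt edited differently on master and on foo, resolves the conflict by taking the incoming version.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

echo "original" > file.txt
git add file.txt
git commit -q -m "initial commit"

# The same line is edited differently on foo and on master.
git checkout -q -b foo
echo "theirs" > file.txt
git commit -q -a -m "foo: edit file.txt"
git checkout -q master
echo "ours" > file.txt
git commit -q -a -m "master: edit file.txt"

# The merge stops with a conflict, so its non-zero exit status is expected.
git merge foo >/dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U)   # unmerged paths

git checkout --theirs file.txt     # take the version coming from foo
git add file.txt                   # mark the conflict as resolved
git commit -q -m "Merge branch 'foo', keeping their file.txt"

content=$(cat file.txt)
echo "conflicted: $conflicted, resolved content: $content"
```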

Going back in time

Sometimes you need to downgrade your "working copy" to an older revision with a certain _hash_. The git way is to create a branch called _branch_name_ from revision _hash_ and switch to it (checkout from it):

$ git checkout -b branch_name hash

After you are done switch back to master and throw the branch away:

$ git checkout master
$ git branch -d branch_name
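The round trip to an older revision and back might look like this in a scratch repository. The branch name inspect_old and the file contents are assumptions for the example.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

for n in 1 2 3; do
    echo "version $n" > file.txt
    git add file.txt
    git commit -q -m "version $n"
done

old=$(git rev-parse 'HEAD~2')          # hash of the revision to go back to

git checkout -q -b inspect_old "$old"  # branch off the old revision
content_old=$(cat file.txt)            # working copy as it was back then

git checkout -q master                 # return to the present
git branch -d inspect_old >/dev/null   # throw the temporary branch away
content_now=$(cat file.txt)

echo "then: $content_old, now: $content_now"
```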

Interrupted work

You are in the middle of a heavy debugging session with modifications everywhere and somebody asks you to fix some very small issue in the code. You know how to fix it but don't know what to do with the messy debug code. One way to solve this is the following:

$ git stash           # stash away your local modifications
$ git status          # verify that your local modifications are stashed away
$ vi file.F90         # do the small fix
$ git commit file.F90 # commit the fix
$ git push            # publish the fix upstream
$ git stash pop       # bring back the stashed files
$ git status          # continue with heavy debugging

Tips and tricks

A nice way to browse the history:

$ git log --topo-order --decorate [--oneline --graph]

Grep your repository:

$ git grep fixme

You are on a branch and want to see which files differ with respect to branch master:

$ git diff --numstat master

Use @git difftool@ instead of @git diff@:

$ git difftool HEAD^

Resolve conflicts with @git mergetool@:

$ git mergetool conflicting-file

Remove all untracked files and directories:

$ git clean -f -d

Show all commits for an author after specific date:

$ git log --author Eminem --oneline --after="Oct 10 2010"

How to restore a previously deleted file (_path_).

Get an overview of all commits that deleted files:

$ git log --diff-filter=D --summary

Get the hash of the last commit which modified a given file:

$ git rev-list -n 1 HEAD -- path

Restore the deleted file from the parent of that commit (_hash_):

$ git checkout hash^ -- path

For lovers of one-liners:

$ git checkout $(git rev-list -n 1 HEAD -- path)^ -- path

Note: going back one commit (done with the ^) is necessary because the last commit that touched the file is the one that deleted it, so you have to go to the commit before it, which still contains the file.
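The recipe for restoring a deleted file can be checked end to end in a throwaway repository. This is a sketch with invented file names (data.txt, other.txt), not a transcript from the project.

```shell
set -eu
repo=$(mktemp -d)     # scratch repository; delete it when you are done
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity
git config user.email "user@example.invalid"

echo "precious" > data.txt
git add data.txt
git commit -q -m "add data.txt"

git rm -q data.txt
git commit -q -m "remove data.txt"

echo "x" > other.txt
git add other.txt
git commit -q -m "unrelated commit"

# Last commit that touched data.txt, i.e. the one that deleted it.
deleting=$(git rev-list -n 1 HEAD -- data.txt)

# Its parent still contains the file, hence the trailing ^.
git checkout "$deleting^" -- data.txt

restored=$(cat data.txt)
echo "restored content: $restored"
git status --short                     # data.txt is staged as a new file
```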