Git Tutorial
Git is a distributed revision control system. This means:
- It's for Revision Control; i.e. tracking changes to things. Usually source code, but text-based documents also work well.
- It's distributed, meaning it requires no centralized infrastructure to use. Every developer has their own repository, though it is possible (and quite simple) to share code between repositories.
- BitBucket: A web-based central repository service
- GitHub: A web-based central repository service
- Repository (or Repo): The complete revision history for a set of related files. Note that each developer has their own repo, though parts of this repo may be shared with others.
- Workflow: A set of conventions for multi-user development using git. A classic example is git flow. Bitbucket has a good document comparing workflows.
First you need to initialize your git configuration:
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
A very, very useful feature (if you use the command line) is colour output:
git config --global color.ui auto
Note that this was all shorthand for editing your git configuration file directly. To see this:
cat ~/.gitconfig
A couple of other options might also be useful. If you have a favourite editor, set it. This, for example, uses emacs in non-window (i.e. terminal) mode:
git config --global core.editor "emacs -nw"
If you use a Mac, you may want to use the system keychain to store credentials such as the passwords for HTTPS remotes. To check if you have the helper utility installed:
git credential-osxkeychain
usage: git credential-osxkeychain <get|store|erase>
If you don't see the usage line, then download the osxkeychain helper. Google is your friend.
Finally, configure git to use it:
git config --global credential.helper osxkeychain
cat ~/.gitconfig
[user]
    name = Your Name
    email = you@example.com
[credential]
    helper = osxkeychain
[core]
    editor = emacs -nw
[color]
    ui = auto
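To see for yourself that git config only edits a text file, here is a minimal sketch that writes a few of the same settings into a scratch file rather than ~/.gitconfig. The --file option is standard git; the scratch file and the variable names cfg and editor are only there for illustration.

```shell
# Use a throwaway file so the real ~/.gitconfig is left alone.
cfg="$(mktemp)"

git config --file "$cfg" user.name "Your Name"
git config --file "$cfg" core.editor "emacs -nw"   # the quotes keep -nw attached to the editor
git config --file "$cfg" color.ui auto

# The result is plain text, in the same format as ~/.gitconfig.
cat "$cfg"

# Reading a value back goes through the same file.
editor="$(git config --file "$cfg" --get core.editor)"
echo "core.editor is: $editor"
```

Dropping --file and adding --global gives you the commands used in this section.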
Once your setup is configured, creating a git repository is as simple as
git init
This will create a .git directory that contains the entire git repository, and all revisions. Of course, you haven't added any files or directories to it yet.
git add <file1>
git add <file2>
git add <dir1>
You can also add files using wildcards. Be careful when doing this, however, to ensure that you don't end up tracking files that you don't want in the repository. When you encounter a file or directory that you wish to exclude from the repository,
- simply don't add it to the repository (you will be reminded that it is untracked whenever you do a git status), or
- set up git to explicitly ignore it using .gitignore (see below).
Once you have added your files, run
git commit
to complete the process of importing them into git.
If you make a mistake and want to remove files from your git repository:
git rm --cached <filename>
This will not delete the file from your working directory, just the git repository. To understand this in more detail we need to introduce one of the main differences between Git and other revision control systems, the git index, the subject of the second tutorial.
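The whole cycle (init, add, commit, then untrack a file by mistake) can be tried safely in a scratch directory. This is only a sketch: the file names notes.txt and secret.txt are made up, and the name and email are set locally so the example does not depend on your global configuration.

```shell
# Work in a throwaway repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "notes" > notes.txt
echo "secret" > secret.txt
git add notes.txt secret.txt
git commit -q -m "Import files"

# Oops: secret.txt should not be tracked. Remove it from git only.
git rm -q --cached secret.txt
git commit -q -m "Stop tracking secret.txt"

# git no longer tracks secret.txt, but the file is still on disk.
tracked="$(git ls-files)"
echo "tracked: $tracked"
ls secret.txt
```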
It's quite common to have files in your working directory that you don't want under version control. Common examples are build artifacts, editor temporary files, sensitive configuration information such as passwords or keys, etc. You can avoid accidentally committing these, and also avoid the visual clutter in git status output, by adding them to a .gitignore file.
This file consists of multiple lines, each containing a pattern. One example might be:
# Editor files
*~
*.swp
# Build artifacts
*.[oa]
If added into the top-level directory, this will ignore the *~ and *.swp files that Vim leaves behind, as well as any *.o and *.a generated artifacts, anywhere in that project. .gitignore files can also specify more complex path constraints, and can exist in subdirectories to only ignore files below that portion of your tree.
While a .gitignore will work even if not tracked by git, it's best to check it in to your git repository like any other file, so that other checkouts will be able to make use of it.
Actually, let me say that more strongly:
You should always check in a top-level .gitignore as part of every one of your git projects.
For further information, please see man gitignore.
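If you are unsure whether a pattern does what you expect, git check-ignore will tell you. The sketch below recreates the example patterns in a scratch repository; the file names (main.c, main.o, libfoo.a and so on) are invented for the demonstration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Same patterns as the example in this section, one per line.
printf '%s\n' '# Editor files' '*~' '*.swp' '# Build artifacts' '*.[oa]' > .gitignore

touch main.c main.o libfoo.a notes.txt~ .main.c.swp

# Ignored files are left out of the untracked list.
untracked="$(git status --porcelain --untracked-files=all)"
echo "$untracked"

# check-ignore prints each given path that matches an ignore pattern.
ignored="$(git check-ignore main.c main.o libfoo.a)"
echo "$ignored"
```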
Change into your project directory
cd $SANDBOX/git-tutorial
mkdir helloworld
cd helloworld
Initialize the repo
git init
git status
git log
fatal: bad default revision 'HEAD'
This fails because there's nothing in that empty repository yet.
Now let's add something.
echo "hello world" > helloworld.txt
git status
Let's add the file now
git add helloworld.txt
git status now shows a change to be committed. The file is still pending a commit, so to get it into the repo we need to run:
git commit
You'll see your favourite editor pop up, and you are prompted for a commit message. Note that if you don't get your favourite editor, you can fix that with:
git config --global core.editor emacs # or vim or nano, or...
A good commit message consists of:
- First line as a capitalized, short (<50 char) summary
- A blank line
- A longer description of the change, wrapped at 72 chars
To actually commit, save and exit your editor.
If you want to abort your commit before exiting your editor either:
- delete the entire contents of the message; a blank commit message aborts the commit,
- cause your editor to exit with an error (in vim, :cq; YMMV in other editors).
I think we all know the problem. In the morning, we sit down to work on a piece of code, and by the end of the day, our working source tree contains code from several different bugfixes, a handful of features that we are working on, and an experimental patch we promised to test for someone else.
In the bad old days we would either:
- not commit anything until all our work-in-progress was done, or
- commit everything with a brilliant change log entry like "fixes".
In version control systems like CVS or SVN, files are all either committed or uncommitted. In git, there is an intermediate location, the index or staging area, where we can accumulate the set of patches that will actually get committed when we finally issue a git commit. In short,
- The index is the staging area for the next commit.
The most important command for understanding the git index is:
git status
This command will list files in each of 3 sections:
- changes to be committed; i.e. changes in the index (staging area)
- changed, but not updated; i.e. changes in the working directory
- untracked files; i.e. files that exist in the working directory, but that git knows nothing about (yet).
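The three sections are easiest to understand by putting one file in each. A minimal sketch in a scratch repository follows; the file names are hypothetical, and the compact --short output is used so that each state shows up as a two-character code.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "one" > staged.txt
echo "two" > modified.txt
git add staged.txt modified.txt
git commit -q -m "Add two files"

echo "change" >> staged.txt
git add staged.txt              # change is now in the index
echo "change" >> modified.txt   # change stays in the working directory
echo "new" > untracked.txt      # git knows nothing about this file

# In --short output the first column is the index, the second the
# working directory, and ?? marks untracked files.
status="$(git status --short)"
echo "$status"
```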
A graphical history browser such as gitk is also useful for seeing what is going on. To invoke it, change into any directory in your git repository and do a
gitk --all &
or (if you decide to use gitx instead):
gitx --all &
This tool will allow you to browse the complete graph of changes, their log messages, and the diffs themselves. In the tutorial part of this section, you'll see that there is a command line hack you can use instead (see the setup for git graph).
Recall that adding a file to the index means that this file will be staged for inclusion in the next git commit.
To add an untracked file to the index:
git add <filename>
If you have a number of unrelated changes in your working directory, and you wish to include only a subset of these changes in the next commit (or you want to practice good habits), do a:
git add --patch
or
git add -p
This command examines every diff block (hunk) separately, and lets you decide whether to move it to the staging area. If you encounter a hunk that has several changes wound into it, the s command will let you split it into smaller and smaller pieces, until the changes have been disentangled.
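Normally you answer the prompts by hand, but for a quick experiment the answers can be piped in. The following sketch, in a scratch repository with a made-up file numbers.txt, makes two edits that are far enough apart to fall into separate hunks, then stages only the first one.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

seq 1 12 > numbers.txt
git add numbers.txt
git commit -q -m "Add numbers"

# Two unrelated edits: the first and the last line.
sed -i.bak -e '1s/.*/first/' -e '12s/.*/last/' numbers.txt
rm numbers.txt.bak

# Answer y to the first hunk and n to the second.
printf 'y\nn\n' | git add -p > /dev/null

git diff --cached   # the hunk that was staged
git diff            # the hunk left in the working directory
```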
To do the opposite action of git add; i.e. examine every diff currently in the index, and selectively unstage it, do a:
git reset --patch
or
git reset -p
One caveat to the above is this: git reset requires that git already know about the file in question (it must have been committed at some point in the past). For a newly added file, this is not the case. To move a recently added file from the staging area back into the list of untracked files, do a:
git rm --cached <filename>
Note that unlike the regular git rm command, git rm --cached does not delete your file itself.
To undo all changes in the index; i.e. revert the index to the state of the last commit:
git reset HEAD
In git, HEAD really means the latest commit on the currently active branch.
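Note that git reset HEAD only touches the index; your edits stay in the working directory. A small sketch to convince yourself, using a scratch repository and a hypothetical file f.txt:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "v1" > f.txt
git add f.txt
git commit -q -m "Add f.txt"

echo "v2" > f.txt
git add f.txt          # stage the edit

git reset -q HEAD      # revert the index to the last commit

staged="$(git diff --cached --name-only)"   # nothing is staged any more
unstaged="$(git diff --name-only)"          # the edit is back to being unstaged
echo "staged: [$staged] unstaged: [$unstaged] content: $(cat f.txt)"
```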
By default, git diff shows you changes between your working copy of the code and the index. In particular, it does not show you what is about to be committed.
If you wish to see the changes between the index and HEAD (i.e. what will be committed when you next do a git commit):
git diff --cached
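The difference between the two commands is easy to see with one staged and one unstaged change. This is a sketch with invented file names a.txt and b.txt; --name-only is used just to keep the output short.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Add a and b"

echo "staged edit" >> a.txt
git add a.txt
echo "unstaged edit" >> b.txt

unstaged="$(git diff --name-only)"          # working directory vs index
staged="$(git diff --cached --name-only)"   # index vs HEAD
echo "git diff shows:          $unstaged"
echo "git diff --cached shows: $staged"
```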
One good habit to get into is to review all changes before committing. This is easy to do:
git commit -v
This command will pop up an editor window with the output of both a git status (prefixed by '#' characters) and a git diff --cached. Note that the diff output will also be stripped from the commit message (even though it does not appear to be commented out).
Making consistent use of git commit -v both ensures that no extraneous changes make it into a commit, and allows a useful (and complete) commit message to be composed.
If, after you have committed a change, you discover you have forgotten to add a file, or missed one of the relevant changes, it is simple to amend the commit to include the changes. Simply add the missed changes to the index and do a:
git commit --amend -v
You will be presented with the previous commit message (which you may then edit). git commit --amend actually replaces the previous commit with the new one. Warning: do not use --amend if you have already published your changes to another repository.
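That --amend replaces the commit, rather than adding a new one, shows up in the commit hash. The sketch below uses a scratch repository and the standard --no-edit flag to reuse the old message without opening an editor; the file names are made up.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "Add a and b"      # oops, b.txt was forgotten
before="$(git rev-parse HEAD)"

echo "b" > b.txt
git add b.txt
git commit -q --amend --no-edit     # fold b.txt into the previous commit
after="$(git rev-parse HEAD)"

count="$(git rev-list --count HEAD)"
echo "commits: $count"
echo "hash before: $before"
echo "hash after:  $after"
```

There is still only one commit, but its hash has changed, which is exactly why amending a published commit causes trouble for others.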
The git index allows a developer to organize his or her working directory changes into coherent commits. Commit messages should be detailed and to the point. Though they are technically freeform, it is a good idea to stick to the following style:
The first line is a capitalized, short (<50 char) summary (used by many functions in git)

A blank line between the summary and the detailed parts of the message
is important, as it separates the summary line from the rest of the
commit content. Then comes a more detailed explanation of the changes,
written in the present tense (i.e. "fix bug #4") as opposed to past
tense (i.e. "fixed bug #4"). Commit messages should be wrapped at 72
characters per line.

* Bulleted lists are useful for commit messages. If you use them,
  subsequent lines should be indented accordingly.
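As a rough illustration of why the blank line matters, the sketch below commits a message from a file and then asks git log for the summary and the body separately. The message text itself is invented, and --allow-empty is only used so that the example needs no file changes.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

cat > message.txt <<'EOF'
Fix off-by-one in loop bounds

Stop the loop one element earlier so the last index is not read
past the end of the array.

* Bulleted lists work too; continuation lines are indented
  to line up with the text.
EOF

git commit -q --allow-empty -F message.txt

# %s is the summary line, %b is everything after the blank line.
subject="$(git log -1 --format=%s)"
body_first="$(git log -1 --format=%b | head -n 1)"
echo "summary: $subject"
echo "body starts with: $body_first"
```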
If you have gitk installed, run
gitk --all
We can simulate the output of a gitk at the command line:
git log --graph --oneline --decorate --all
git log --graph --format='%h - (%cr) %s - %cn' --abbrev-commit --date=relative --all
git log --graph -C -M --date-order --date=short --decorate --all
echo '
[alias]
graph-short = log --graph --oneline --decorate --all
graph-long = log --graph -C -M --date-order --date=short --decorate --all
graph = "!git graph-short"
' >> ~/.gitconfig
git graph
git graph-long
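If you would rather not append to ~/.gitconfig by hand, the same aliases can be set with git config. The sketch below sets them for a single scratch repository only (no --global), which is an assumption made to keep the example self-contained.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

# Same aliases as above, stored in this repository's .git/config.
git config alias.graph-short "log --graph --oneline --decorate --all"
git config alias.graph-long "log --graph -C -M --date-order --date=short --decorate --all"
git config alias.graph '!git graph-short'

git commit -q --allow-empty -m "First commit"

out="$(git graph)"
echo "$out"
```

Add --global to each git config call to make the aliases available everywhere.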
Next, create a new file and commit it to the repository.
emacs README
git add README
git commit -v
Let's make some unrelated changes in our tree.
echo 'my file' >> helloworld.txt
echo 'Read me' >> README
git diff
git add -p # Choose one of the patches to Stage. Don't stage the other.
git commit -v
git add -p # Stage the remaining patch
git reset -p # choose the diffs to unstage
git add -p # choose the diffs to add that you missed
git commit -v
Suppose we made a mistake in our commit message. Then
git graph # or git log if you don't have git graph set up
git commit --amend -v # fix the commit message
git graph
echo 'new change' >> helloworld.txt #make a new change
git diff # See the change
git add -p # Add the change. always use -p!!
git commit --amend -v # Notice that amend adds the new changes to the last commit.
git status
Next, create a new file
touch hello.h
git add hello.h
git commit -m "add header" hello.h #-m adds "add header" as the commit message
git status
git rm hello.h
ls -la hello.h
git status
git commit -m "bye bye"
Let's do it all again, but attempt to change our mind at various stages:
echo "TODO" > INSTALL
git add INSTALL
git status
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
Go ahead and try it
git reset HEAD INSTALL
git status
git add INSTALL
git status
git rm INSTALL
Git refuses to do this and suggests --cached or -f instead. Plain git rm also deletes the file from your working directory, and since this file isn't in the repo yet, you would have lost data. git rm --cached preserves the file (it just removes it from your staging area):
git rm --cached INSTALL
git status
ls -la INSTALL #see, it's still here
git add INSTALL
git commit -m "initial version"
git status
git rm INSTALL
git status
git reset HEAD INSTALL
git status
git checkout INSTALL
git status
ls -la INSTALL
git rm INSTALL
git commit -m "bored now"
git status
ls -la INSTALL
In git, a branch is simply a named pointer to a node (a commit) in the change graph. Recall, you can examine this graph using the graphical gitk or gitx utility:
gitx --all &
or the command line instructions from Tutorial 2. Your current position is indicated by the HEAD tag in the output.
If there are branches already created, you can change the branch you are currently working against (change your position in the change graph) by issuing a git checkout:
git checkout <branch>
If you want to selectively bring changes from another branch into your working directory, do a:
git checkout -p <branch>
This steps through the differences between that branch and your working directory, and lets you choose which hunks to apply. Note that it does not switch branches; HEAD stays where it is.
It is possible to create a new branch (from the current HEAD) using git checkout:
git checkout -b <newbranch>
This is equivalent to
git branch <newbranch>
git checkout <newbranch>
A merge takes a different branch, and attempts to merge it into the current branch:
git checkout <destination_branch>
git merge <branch_to_be_merged_into_destination>
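Here is a minimal sketch of a branch followed by a merge, in a scratch repository. The branch name mydev matches the tutorial below; the initial branch name is looked up rather than assumed, since it may be master or main depending on your git setup.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "base" > README
git add README
git commit -q -m "Initial import"
main="$(git symbolic-ref --short HEAD)"   # name of the branch we started on

git checkout -q -b mydev                  # create and switch in one step
echo "docs" >> README
git commit -q -am "update README docs"

git checkout -q "$main"                   # destination branch
git merge -q mydev                        # bring mydev into it

git log --graph --oneline --decorate --all
```

Because the starting branch did not move while mydev was being worked on, git can simply move its pointer forward to the mydev commit, so both branches end up at the same node.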
Sometimes a merge will result in a conflict (where both branches modified the same line). This isn't hard to deal with, and can be handled either manually, or with a specialized tool. By default, merge conflicts show up in your source files like this:
Here are the lines that are either unchanged from the common
ancestor, or cleanly resolved because only one side changed.
<<<<<<< your:sample.txt
Conflict resolution is hard;
=======
Conflict resolution is easy.
>>>>>>> their:sample.txt
And here is another line that is cleanly resolved or unmodified.
git mergetool
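To see a conflict without waiting for one to happen by accident, the sketch below makes two branches change the same line of a hypothetical sample.txt, then resolves the conflict by hand. The branch name theirs-branch is made up; --no-edit accepts the merge message git has prepared.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "Conflict resolution is unknown." > sample.txt
git add sample.txt
git commit -q -m "Add sample"
main="$(git symbolic-ref --short HEAD)"

git checkout -q -b theirs-branch
echo "Conflict resolution is easy." > sample.txt
git commit -q -am "Say it is easy"

git checkout -q "$main"
echo "Conflict resolution is hard;" > sample.txt
git commit -q -am "Say it is hard"

# Both branches changed the same line, so the merge stops.
if git merge theirs-branch > /dev/null 2>&1; then merged="clean"; else merged="conflict"; fi
markers="$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' sample.txt)"
cat sample.txt

# Resolve manually: keep one version, stage it, and conclude the merge.
echo "Conflict resolution is easy." > sample.txt
git add sample.txt
git commit -q --no-edit

parents="$(git log -1 --format=%P | wc -w)"
echo "merge result: $merged, merge commit has $parents parents"
```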
Create a new branch
git branch mydev
git graph #or gitx or gitk
emacs README # make a change
git add -p README
git commit -m "update README docs"
git graph
WE ARE GOING TO MAKE A DELIBERATE MISTAKE HERE
emacs helloworld.txt # make another change
git add -p
git commit -v
Aborting commit due to empty commit message.
Now switch to the correct branch and try committing.
git checkout mydev
git commit -v
git graph # note where HEAD is. Also note, why we called it 'graph'
gitx
gitk --all
git checkout master
git graph # HEAD has moved to the master branch
git checkout -b bugfix_012
emacs -nw README # make a non-conflicting change
git add -p
git commit -v # always checking your commits is a good habit!
git graph
Our next goal is to merge the bugfix and mydev branches back into master. This should be easy, as the changes we made were nonconflicting (i.e. they did not overlap).
When merging it is important to know where you are:
git branch # the current branch will be highlighted with a *
git branch -v # gives extra info
Merging takes another branch name, and merges it into the current branch. We will start by merging mydev into bugfix_012:
git merge mydev
gitx
git graph
Let's merge everything back into master.
git checkout master
git merge bugfix_012
git graph
gitx
git branch # on master
git graph
git branch -d mydev # deletes the mydev branch
git branch # notice mydev is gone
git graph # the commits still exist, but they no longer have branch tags
Merge conflicts happen when the same line is modified in two different branches.
git checkout master # already here!
git branch
git checkout -b conflicting
emacs README # made a change
git add -p
git commit -v
git graph
git checkout master # go back to master before branching again
git branch
git checkout -b conflicting2
emacs README # made an incompatible change
git add -p
git commit -m "conflict 2"
git graph
git merge conflicting
emacs -nw README # choose what line you want to keep manually
git commit -v
git status # it tells you how to proceed
git diff # see the 3-way diff
git add -p README
git commit -v # fails. We need to just use add
git add README
git commit -v # note the commit message is already there
git graph
Clean up our branches, then move to master
git branch -d conflicting
git graph
git checkout master
git merge conflicting2
git branch -d conflicting2
git graph
First, some terminology:
- Remote branch: A branch living in a different repository entirely
- Remote-tracking branch: a copy of the remote branch that lives in your personal repo.
- Local-tracking branch: Paired with a remote-tracking branch, this is the integration branch used for merging changes between the remote branch and your local changes.
- Topic or Development branch: A local, non-tracking branch.
For this, we will need two repos:
mkdir theirs mine
Let's create a remote repo (theirs) with two branches
cd theirs
git init
echo "their repo" > README.md
git add README.md
git commit -m "Initial import" README.md # add readme on master branch
git checkout -b dev # create a development branch
echo "Public Domain" >> LICENSE
git add LICENSE
git commit -m "initial import" LICENSE
git checkout master
Verify the branches were created:
git branch
dev
* master
And check that everything is checked in:
git status
On branch master
nothing to commit, working directory clean
Now let's clone it:
cd ../mine
git clone ../theirs
ls -la
Notice git has thoughtfully created the directory, too. Change to it and do a status:
cd theirs
git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
Notice the new output, indicating that this is a local tracking branch. Its paired remote-tracking branch has been given the name origin/master. Remember, this is a local copy of the remote repo. Now is a good time to explore the more verbose modes of branch:
git branch
* master
git branch -v
* master 09d4cc7 Initial import
git branch -vv
* master 09d4cc7 [origin/master] Initial import
git branch -a
* master
remotes/origin/HEAD -> origin/master
remotes/origin/dev
remotes/origin/master
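The listing shows origin/dev as a remote-tracking branch, but there is no local dev yet. The sketch below rebuilds the two repos in a scratch directory and creates a local tracking branch for dev. Treat it as one possible way to do this; the --track flag is spelled out for clarity, although git usually sets up tracking on its own when you branch from a remote-tracking branch.

```shell
work="$(mktemp -d)"
cd "$work"

# The "remote" repo, with a second branch called dev.
git init -q theirs
cd theirs
git config user.name "Your Name"
git config user.email "you@example.com"
echo "their repo" > README.md
git add README.md
git commit -q -m "Initial import"
main="$(git symbolic-ref --short HEAD)"
git checkout -q -b dev
echo "Public Domain" > LICENSE
git add LICENSE
git commit -q -m "initial import"
git checkout -q "$main"

# Our clone of it.
cd "$work"
git clone -q theirs mine
cd mine

remote_branches="$(git branch -r)"
echo "$remote_branches"

# Create a local dev branch paired with origin/dev.
git checkout -q -b dev --track origin/dev
upstream="$(git rev-parse --abbrev-ref 'dev@{upstream}')"
git branch -vv
```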
- git remote
- git push
- git fetch
- Some stuff that should go earlier
- skipped git rebase
- exporting from git for releases
- explanation of fast-forward in Tutorial 3
A warning:
- Never do a rebase if you have pushed your checkout anywhere, or if anyone has ever pulled from it.
ls -la
total 64
drwxr-xr-x 11 you staff 374 16 Nov 10:02 .
drwxr-xr-x 4 you staff 136 14 Nov 20:40 ..
drwxr-xr-x 16 you staff 544 16 Nov 10:02 .git
drwxr-xr-x 3 you staff 102 16 Nov 09:23 .ipynb_checkpoints
-rw-r--r-- 1 you staff 218 16 Nov 10:02 README
-rw-r--r-- 1 you staff 96 15 Nov 00:00 README.orig
-rw-r--r-- 1 you staff 45 14 Nov 23:47 helloworld.txt
Say we are in the following situation. We made a commit a while ago with the comment "make 2 changes. Split later". We want to travel back in time and do that commit properly; i.e. break it up into 2 parts.
Here's the commit graph
git graph | head
* 44aec4e (HEAD, master) superfluous commit
* 480a6d0 make 2 changes. Split later
* 88f7df3 make this longer
*   9a79581 Merge branch 'conflicting' into conflicting2
|\
| * 5ed1423 Updated the first line of the README file.
* | 79a51f4 conflict 2
|/
* bbe847b (bugfix_012) Merge branch 'mydev' into bugfix_012
The commit we want to split is 480a6d0:
git diff -r 88f7df3 -r 480a6d0
diff --git a/README b/README
index d4d6f80..205f34c 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-Change first line
+Change first line Again
 Changing the first line again
 helloRead me
 making a non-conflicting change with the other branch
@@ -14,4 +14,4 @@ fds
 fds
 fsd
 sdf
-last line
+Change last line Again
We are basically going to rewind to that point in time, split it into two, and then re-roll everything after.
git rebase -i HEAD~~
This will pop open an emacs window (or vi, depending on your editor variable). Notice that it lists the last 2 commits, but oldest first (the reverse of git log):
pick 480a6d0 make 2 changes. Split later
pick 44aec4e superfluous commit

# Rebase 88f7df3..44aec4e onto 88f7df3 (2 command(s))
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
Change the first word pick to edit (or just e) and save
edit 480a6d0 make 2 changes. Split later
pick 44aec4e superfluous commit

# Rebase 88f7df3..44aec4e onto 88f7df3 (2 command(s))
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
Now you will be in a mystical magical place: the point in history you want to change.
git rebase -i HEAD~~
Stopped at 480a6d08e11b26741f4f675a4679eb39f48a5e55... make 2 changes. Split later
You can amend the commit now, with
git commit --amend
Once you are satisfied with your changes, run
git rebase --continue
git graph |head
* 44aec4e (master) superfluous commit
* 480a6d0 (HEAD) make 2 changes. Split later
* 88f7df3 make this longer
* 9a79581 Merge branch 'conflicting' into conflicting2
|\
| * 5ed1423 Updated the first line of the README file.
* | 79a51f4 conflict 2
|/
* bbe847b (bugfix_012) Merge branch 'mydev' into bugfix_012
|\
git reset HEAD~
Unstaged changes after reset:
M README
git diff
diff --git a/README b/README
index d4d6f80..205f34c 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-Change first line
+Change first line Again
Changing the first line again
helloRead me
making a non-conflicting change with the other branch
@@ -14,4 +14,4 @@ fds
fds
fsd
sdf
-last line
+Change last line Again
git add -p
diff --git a/README b/README
index d4d6f80..205f34c 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-Change first line
+Change first line Again
 Changing the first line again
 helloRead me
 making a non-conflicting change with the other branch
Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]?
Say y to this change, and n to the next:
@@ -14,4 +14,4 @@ fds
 fds
 fsd
 sdf
-last line
+Change last line Again
Stage this hunk [y,n,q,a,d,/,K,g,e,?]?
Now do a commit:
git commit -m "first change. the one I want to do a pull request on"
[detached HEAD 90c3922] first change. the one I want to do a pull request on
1 file changed, 1 insertion(+), 1 deletion(-)
and commit the second part.
git add -p # get in the habit
git commit -m "the second commit"
[detached HEAD 539a693] the second commit
 1 file changed, 1 insertion(+), 1 deletion(-)
and redo everything after that point
git rebase --continue
Successfully rebased and updated refs/heads/master.
git graph |head
* ed7bc94 (HEAD, master) superfluous commit
* 539a693 the second commit
* 90c3922 first change. the one I want to do a pull request on
* 88f7df3 make this longer
*   9a79581 Merge branch 'conflicting' into conflicting2
|\
| * 5ed1423 Updated the first line of the README file.
* | 79a51f4 conflict 2
|/
* bbe847b (bugfix_012) Merge branch 'mydev' into bugfix_012
Now we have successfully rewritten history. Push the result to your personal GitHub repo, and create the pull request for just the diff you want.
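If you want to rehearse the whole procedure before trying it on real history, the sketch below replays it in a scratch repository without any interactive editing. Two things are assumptions made for the sake of automation: the commit to split touches two separate files (a.txt and b.txt) so that plain git add can stand in for git add -p, and GIT_SEQUENCE_EDITOR is set to a sed command that changes the first pick to edit, which is what you would otherwise do by hand in the editor.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"

echo "base" > a.txt
echo "base" > b.txt
git add a.txt b.txt
git commit -q -m "Base"

echo "change" >> a.txt
echo "change" >> b.txt
git commit -q -am "make 2 changes. Split later"

echo "extra" > c.txt
git add c.txt
git commit -q -m "superfluous commit"

# Mark the older of the last two commits as "edit" and start the rebase.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD~2 > /dev/null 2>&1

# We are now stopped at the commit to split. Undo it, keeping the changes.
git reset -q HEAD~

# Commit the two changes separately.
git add a.txt
git commit -q -m "first change"
git add b.txt
git commit -q -m "the second commit"

# Replay everything that came after.
GIT_EDITOR=true git rebase --continue > /dev/null 2>&1

subjects="$(git log --format=%s)"
echo "$subjects"
```

As with the real thing, only do this on commits that have not been pushed anywhere.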