
Part 1: Intro to Git

Elana Hashman edited this page Mar 6, 2016 · 3 revisions

Git is a software project that provides a distributed version control system for software development. GitHub is a platform and website for sharing and hosting public projects that use git for version control. Hosting a project on GitHub allows other people to easily contribute to your project. GitHub is not the only website that offers this; other such websites include BitBucket and GitLab.

Basically, git and GitHub make it easy for many people to contribute to a software project (or any other kind of project).

To explain with an example, the source code for the WiCS website is a public git repository. The repository contains all of the code and content we've written for the website, and all the history of every change we've ever made to it. The repository you are in right now is just a copy of the code there - minus the history of changes. We'll refer to the WiCS website as an example for the rest of the workshop.

Repositories

In order to host a project on GitHub, you'll need a repository or "repo" for it to live in. Think of a repo as the folder that you keep your project in; it contains all the project files as well as any documentation.

Repos are either public or private. Anyone can see the code in a public repo, whereas a private repo is visible only to you and the people you grant access to. For example, many people host the code that runs their personal sites in a public repo because they want potential employers to be able to look at the code they've written. However, if you ever host a class project on GitHub, such as your CS246 final project, host it in a private repo to avoid academic offences. GitHub offers students 5 private repositories for two years; alternatively, you can use the UWaterloo GitLab for unlimited private repositories for your personal, academic, and non-commercial use.

Remote and Local Repositories

GitHub saves a copy of your project on the GitHub servers. We'll refer to this as a "remote" copy of your repository. However, you cannot modify this copy directly.

You will have to make a copy of this code on your own computer, so that you can view and edit all the files. This will become your "local" copy of your repository.

Committing and the Commit History

Committing is how you track changes to content in a git repo. It is similar to "saving" the changes you have made to one or more files in your project. Every time you make a commit, Git generates an ID (known as the "SHA" or "hash") for that commit so that you can identify it later. This is the "version control" bit of the "distributed version control system."
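To make the idea of a "hash" more concrete, here is a minimal Python sketch of how git derives an ID from content. It assumes git's default SHA-1 object format and shows the ID of a single file's content (what git calls a "blob"); commit IDs are computed in a similar way from the commit's own data. The function name `git_blob_id` is made up for this illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the 40-character ID git gives to a file's content (a "blob")."""
    # Git hashes a short header ("blob <size in bytes>" plus a NUL byte)
    # followed by the raw file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Changing even one character produces a completely different ID.
    print(git_blob_id(b"hello world\n"))
    print(git_blob_id(b"hello world!\n"))
```

Because the ID depends only on the content, the same content always gets the same ID, and any change to the content gets a new one.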

All commits are saved in your commit history; this is one of the many differences between a git repository and a regular directory.

Having a commit history is super useful because:

  • It gives you the ability to return to the state of your (local or remote) repository at any point in the commit history
  • It helps you identify if commits break things
  • It shows you how the repository evolves with each change
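The benefits listed above can be pictured with a toy model. This is only a sketch, not how git is implemented: each commit stores a snapshot of the files plus a link to the commit before it. The names `Commit`, `make_commit`, and `history` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Commit:
    message: str
    files: Dict[str, str]          # snapshot: file name -> file content
    parent: Optional["Commit"]     # the previous commit, or None for the first


def make_commit(parent: Optional[Commit], message: str, files: Dict[str, str]) -> Commit:
    """Record a new snapshot on top of `parent`."""
    return Commit(message, dict(files), parent)


def history(tip: Optional[Commit]) -> List[str]:
    """List commit messages from the newest commit back to the first."""
    messages = []
    node = tip
    while node is not None:
        messages.append(node.message)
        node = node.parent
    return messages


if __name__ == "__main__":
    first = make_commit(None, "Add home page", {"index.html": "<h1>WiCS</h1>"})
    second = make_commit(first, "Add events page",
                         {"index.html": "<h1>WiCS</h1>", "events.html": "<ul></ul>"})
    third = make_commit(second, "Oops, empty the home page",
                        {"index.html": "", "events.html": "<ul></ul>"})

    print(history(third))                    # how the repository evolved
    print(third.parent.files["index.html"])  # the state before the bad commit
```

Walking the chain backwards shows how the project evolved, and every earlier snapshot is still there to return to if a later commit breaks something.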

You can poke around the commit history of this repo by going to the Code tab > Commits, or explore the much more interesting commit history of the WiCS website repository.

The internals of how git handles this are too complex for the scope of this workshop. However, you can learn more about it here. There are many blog posts you can find on the subject.

Branches

You will notice when you navigate to the code tab for any repository, you can view the branches for that repository. Our repo has 1 branch: master. Every git repository begins with a master branch. You can think of this as the "main" branch of your repo. In the case of WiCS, we use the master branch to track the actual code that is running the website.

When you make a "branch off of master", you are creating a new and separate commit path that starts out with all the commits the master branch has at the time of creation. Any changes on this branch will not affect your master branch unless you send them back into master by completing a merge.
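One way to picture this is to treat a branch as a named pointer to a commit. The sketch below is a toy model under that assumption, not git's real implementation; `ToyRepo` and its methods are invented for illustration, and it only handles the simplest kind of merge (a "fast-forward", where master has no new commits of its own).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Commit:
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    def __init__(self) -> None:
        # Each branch name points at the newest commit on that branch.
        self.branches: Dict[str, Optional[Commit]] = {"master": None}
        self.current = "master"

    def commit(self, message: str) -> Commit:
        """Add a commit to the current branch only."""
        new = Commit(message, self.branches[self.current])
        self.branches[self.current] = new
        return new

    def branch(self, name: str) -> None:
        """Create a branch at the current commit and switch to it."""
        self.branches[name] = self.branches[self.current]
        self.current = name

    def checkout(self, name: str) -> None:
        self.current = name

    def log(self, name: str) -> List[str]:
        messages, node = [], self.branches[name]
        while node is not None:
            messages.append(node.message)
            node = node.parent
        return messages

    def fast_forward(self, source: str) -> None:
        """Merge `source` into the current branch, simplest case only."""
        target_tip = self.branches[self.current]
        node = self.branches[source]
        while node is not None and node is not target_tip:
            node = node.parent
        if node is not target_tip:
            raise ValueError("branches have diverged; a real merge is needed")
        self.branches[self.current] = self.branches[source]


if __name__ == "__main__":
    repo = ToyRepo()
    repo.commit("Add home page")
    repo.branch("new-feature")
    repo.commit("Add events page")
    print(repo.log("master"))       # master is untouched by the new commit
    repo.checkout("master")
    repo.fast_forward("new-feature")
    print(repo.log("master"))       # master now includes the feature work
```

The new branch starts with everything master had, work on it stays off master, and the merge is the step that brings the changes back.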

Branches can be used for:

  • Developing a feature without affecting any of your working code
  • Testing new features
  • Saving the state of your repo at a particular time

Forks

Imagine how difficult it would be if several people were working directly on the same branch at the same time -- what a nightmare!

Forks are a way for multiple people to contribute to one git project at the same time without having to coordinate their work all the time. This is the "distributed" part of the "distributed version control system." Having a "fork" of a repo means that you made a copy of a public repository that you treat as your own. To contribute back to the original repository, a developer can request that their personal or "downstream" changes be included in the original repository. Maintainers of the original or "upstream" repository can choose to accept the changes from the developer's remote fork.

Forks are a concept that exists outside of git itself. In GitHub, developers submit pull requests when they want to merge changes on their fork into the upstream repository's master branch. In GitLab, they are called merge requests. In many older, established software projects (such as the Linux kernel), patches are sent to a mailing list. These are all ways of doing the same thing: asking the maintainers to accept your changes.

All of the people who contribute to the WiCS website have a fork of the site. You can see my personal fork of the website here. As you will notice, my fork is (hopefully) up to date with the WiCS website repo, but it doesn't have to be. The two master branches should have the same content and the same commit history until a change is made to either one. If I make a change to my fork, I make a pull request to merge my changes into the WiCS website. If someone else makes a change to the WiCS website, I update my fork to match.
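The back-and-forth between a fork and the original repository can be sketched in a few lines of Python. This is a deliberately simplified model: each repository is just a list of commit messages, and the function names (`fork`, `merge_pull_request`, `sync_fork`) are invented here and do not correspond to real git or GitHub commands.

```python
from typing import List


def fork(upstream: List[str]) -> List[str]:
    """Make your own copy of the upstream history."""
    return list(upstream)


def new_commits(source: List[str], target: List[str]) -> List[str]:
    """Commits that `source` has and `target` lacks.

    Assumes target's history is the start of source's history.
    """
    if source[:len(target)] != target:
        raise ValueError("histories have diverged; a real merge is needed")
    return source[len(target):]


def merge_pull_request(upstream: List[str], my_fork: List[str]) -> None:
    """A maintainer accepts the fork's new commits into upstream."""
    upstream.extend(new_commits(my_fork, upstream))


def sync_fork(my_fork: List[str], upstream: List[str]) -> None:
    """Update the fork with changes other people made upstream."""
    my_fork.extend(new_commits(upstream, my_fork))


if __name__ == "__main__":
    upstream = ["Initial site", "Add events page"]
    mine = fork(upstream)

    mine.append("Fix typo on home page")    # a change on my fork
    merge_pull_request(upstream, mine)      # my pull request is accepted

    upstream.append("Add mentorship page")  # someone else's change lands
    sync_fork(mine, upstream)               # I update my fork to match

    print(upstream == mine)
```

In this model the fork and upstream only differ between the moment a change is made on one side and the moment it is merged or synced on the other.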

You will be forking this repo, making changes, and submitting pull requests to it.

To get started with installation, click the following links based on the kind of machine you have: