“A journey of a thousand miles must begin with a single step.” - Lao Tzu
Maybe you already know all about Github – in that case, you should probably stop reading now – this article attempts to do nothing more than define what Github is and provide a basic introduction to its features, installation and use. If you are like I was, however, you have probably heard of it and have some vague idea of what it is, but are not comfortable with it. Perhaps you’d rather not deal with it. Maybe you think it sounds too complicated or might expose you as some sort of coding Neanderthal to the elite coding world, should you attempt to get involved. To be honest, I actually thought those very things myself.
Ok, so what is Github? The least complicated answer that is even close to adequate is that Github is a web site providing a nexus for a distributed version control system called Git. However, it is more than that and I’ll touch on what that means a bit later on in the article. Before I do that, I would like to try to make the “least complicated answer” I gave above a little less complicated.
First let’s go over a little background on Git itself. Git is a source code version control system that was created in its initial form in 2005 by a group of Linux devotees headed by Linus Torvalds. Unlike other version control systems, for example SVN, Git does not store a base file and then subsequent diffed changes on each check-in. Instead it was designed to store “snapshots” of a project over time (although it can save space by referencing unchanged files in previous snapshots).
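To make the snapshot idea a bit more concrete, here is a minimal sketch using the command line Git client. It assumes Git is installed; the throwaway directory, the file names and the “Demo User” identity are all made up for illustration. The script makes two commits, changing only one of two files, and then looks up the stored object behind each file in each snapshot.

```shell
#!/bin/sh
# Toy demonstration: two commits, one file changed, one file left untouched.
set -e
demo=$(mktemp -d)                           # throwaway location for the demo
cd "$demo"
git init -q snapshots
cd snapshots
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "version 1" > changed.txt
echo "never edited" > stable.txt
git add changed.txt stable.txt
git commit -q -m "First snapshot"

echo "version 2" > changed.txt
git add changed.txt
git commit -q -m "Second snapshot"

# Each commit points at a complete listing of the project at that moment.
git ls-tree HEAD~1
git ls-tree HEAD

# Look up the object that each snapshot records for each file.
stable_before=$(git rev-parse HEAD~1:stable.txt)
stable_after=$(git rev-parse HEAD:stable.txt)
changed_before=$(git rev-parse HEAD~1:changed.txt)
changed_after=$(git rev-parse HEAD:changed.txt)
echo "stable.txt:  $stable_before -> $stable_after"
echo "changed.txt: $changed_before -> $changed_after"
```

Both snapshots list both files, but the identifiers printed for stable.txt should match while those for changed.txt should differ, which is the space saving mentioned above.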
Coders editing files belonging to a particular project in the centralized Git store have a version of that project (and access to all its history) in working directories on their local machines. The files that they edit must go through a staging phase before being committed within the local project in which they are being modified. This local commit is not the same thing as committing changes to the actual central store from which the local version of the project was obtained. An additional step must be taken to sync and commit changed versions of the files to the actual project repository in which the master source controlled copies of the files reside.
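The staging phase and the local commit can be watched with the command line client. The following is only a sketch, assuming Git is installed; the project name, file name and identity are hypothetical. It records what `git status --porcelain` reports at each stage and confirms that no remote repository is involved at any point.

```shell
#!/bin/sh
# Follow one new file from the working directory, through staging, to a local commit.
set -e
demo=$(mktemp -d)                           # throwaway location for the demo
cd "$demo"
git init -q project
cd project
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"

echo "first draft" > notes.txt
before_add=$(git status --porcelain)        # "?? notes.txt": present but not yet tracked

git add notes.txt
after_add=$(git status --porcelain)         # "A  notes.txt": staged for the next commit

git commit -q -m "Add notes"
after_commit=$(git status --porcelain)      # empty: nothing left to stage or commit

commits=$(git rev-list --count HEAD)        # number of commits in the local history
remotes=$(git remote)                       # empty: no central store has been contacted

echo "before add:   $before_add"
echo "after add:    $after_add"
echo "after commit: [$after_commit]"
echo "local commits: $commits, remotes: [$remotes]"
```

The commit exists only in the local repository at this point; getting it to a central store is the separate step described above.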
The differences between the way Git works and the way other source control systems work allow Git to be fast and make it convenient to use even if there is no current network connection. I found an excellent resource on the web (there are many others as well) that explains the basic ideas behind working with Git: http://git-scm.com/book/en/Getting-Started-Git-Basics.
Once you really wrap your head around what Git is, understanding Github is not as fearsome a task as it may have originally seemed. In passing, it should be noted that there are other web sites that host Git repositories, and I believe you can even install Git as a stand-alone system on a single machine. However, Github is far and away the most popular, and nearly all of the up and coming stuff is stored there. One of the nicest things about Github is that you don’t really have to get very involved in order to utilize the resources found there. You can go to the site and grab loads of open source software frameworks, libraries and code without even creating a Github account. You don’t have to jump right in; you can just stick a toe in the water. Nobody will complain. You will be just one more anonymous downloader to them.
So, how does one get software from Github in “ninja” mode?
- Go to the home page at: https://github.com/.
- Type something in the search box at the top of the page, for example: node.js.
- Hit enter and watch the results come pouring forth.
- Select a project repository from this result list.
- Click the button that allows you to download a recent stable version of the project as a compressed file (e.g. zip, tar).
I got 16,249 repository results from a search on “node.js” at the time I wrote this article. That’s 16,249 projects related to Node.js alone! Many of these are world class frameworks and code libraries that are in actual use at major corporations and other legitimate companies – and you can use them too. The majority of these desirable code bases have permissive licenses such as the MIT license or something similar – in essence, free and with almost no restrictions as to their use. (Note: I plan to do a blog post soon on open source licensing, as I think it is a somewhat confusing subject).
So that is one way to use Github – as a huge bundle of open source code repositories that you can access even without a Github account. However, there is much more to Github than that:
- It can be a personal version control and backup system for your projects.
- It can provide source control for an entire team of people working on the same project(s).
- You can participate in other people’s open source projects, adding to them and squashing bugs. Some of these projects are world famous. If you are competent, courageous and creative, you might just make a name for yourself.
- You can monitor activity on projects and communicate with other denizens of Github.
Participating in the ways listed above requires that you create and use a Github account. Free accounts are available and there are also various levels of paid accounts, as of this writing ranging from $7 to $200 per month. As they rise in cost, the number of private repositories you may own increases. That is the one thing any paid account gets you that a free account does not – the repositories in a free account may not be private, but must be open source. Whatever you store in a free, public account may be seen by anyone and people may use and access your code. They may even submit suggested changes and corrections. This is not a problem in the majority of cases and in fact this exemplifies the basic idea of open source. Free accounts do have an unlimited number of repositories and users, so that, too, is very nice. The steps to create a free account are easy to follow and the process can be initiated from the home page of Github.
Once you have created an account, the next step in being able to use Github for source control purposes is to get some manner of Git client onto your local machine – you need Git software to use Git, which, as previously stated, is the source control system that Github uses. Upon creation of an account, the Github site will provide a UI that acts as a sort of “Boot Camp” for beginning to use Github. Part of this Boot Camp is a UI piece that allows you to download a Git client. You will have a few choices. There is a command line Git, which is the original way to use the program, but there are alternatives. There is a GUI client available for both Windows and Mac OS that is arguably easier to use than the command line program. There is also an Eclipse plug-in available. The web site itself also provides a GUI for the portions of the process that involve fetching code from the Github site down to your local machine.
Disclaimer: In the discussions that follow regarding procedure for using Git and Github, I focus on the actions and concepts behind using this system for source control rather than on a full reference for command line syntax or the menu choices and clicks in a GUI. There is help within these programs and on other sites on the web for the specific command line syntax and GUI options necessary to conduct an action or procedure. Any approach other than this would result in a very long article and risk muddying the explanation. However, I may come back to this topic at some point and provide a few examples.
After you have a Git client on your machine you are able to embark on two-way transmission of code with Github. The process of doing so involves the use of repositories. We know that there are a large number of repositories already in existence that are owned by other people. As you might suspect, the process of working with this type of repository is a little different and a little more complicated than working with repositories that you own or that are owned by your team. We’ll first take a look at how to use owned repositories.
Working with owned repositories:
Let’s start with the easiest scenario to understand, the case in which there is already a repository created on Github and you have ownership access to it. To be able to work with the code in a Github repository one must first Clone it from the Github website. Doing so will create a local repository on the user’s local machine. All of the source code and history of the project will be pulled into this local repository. It is then possible to edit source files and add the modified files to a local staging area. Any files in staging are able to be committed, still only to the local repository. The set of committed files is then ready to be pushed to the Github server repository from which the source was originally obtained.
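For readers who want to see the cycle end to end, here is a minimal command line sketch, assuming Git is installed. So that it runs without a network connection or an account, a bare repository on local disk (central.git, a made-up name) stands in for the repository on Github; with a real project you would clone the address shown on the repository's Github page instead. All file names and the identity are hypothetical.

```shell
#!/bin/sh
# Clone, edit, stage, commit and push against a local stand-in for Github.
set -e
demo=$(mktemp -d)                           # throwaway location for the demo
cd "$demo"

# Build the stand-in for an existing repository that already holds one commit.
git init -q seed
git -C seed config user.name "Demo User"    # placeholder identity
git -C seed config user.email "demo@example.com"
echo "hello" > seed/README.txt
git -C seed add README.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed central.git

# 1. Clone: source and history are pulled into a new local repository.
git clone -q central.git work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"

# 2. Edit, 3. stage, 4. commit. Everything so far is local.
echo "a local change" >> README.txt
git add README.txt
git commit -q -m "Update README"
central_before=$(git -C "$demo/central.git" rev-list --count HEAD)

# 5. Push the committed work back to the repository it was cloned from.
git push -q origin HEAD
central_after=$(git -C "$demo/central.git" rev-list --count HEAD)

echo "commits in central repository: $central_before before push, $central_after after"
```

The count for the central repository only changes after the push, which is the distinction between a local commit and updating the server copy.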
If no Github repository yet exists for a project, the creation of one begins with creation of a repository on a local machine. Files are then created, moved to staging and committed within the local repository (similarly to the process described above for modifying cloned project files pulled down to a local repository). After that is done, the project may be pushed to a Github repository which is created as part of this process.
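A rough command line version of this second scenario might look like the following, assuming Git is installed. The empty bare repository hosted.git is a local stand-in for the new, empty repository on Github, and the project name, file name and identity are invented for the example.

```shell
#!/bin/sh
# Start a project locally, then push it to a new, empty hosted repository.
set -e
demo=$(mktemp -d)                           # throwaway location for the demo
cd "$demo"

# Local stand-in for the empty repository that Github would host.
git init -q --bare hosted.git

# Create the local repository, then create, stage and commit a file.
git init -q newproject
cd newproject
git config user.name "Demo User"            # placeholder identity
git config user.email "demo@example.com"
echo "My new project" > README.txt
git add README.txt
git commit -q -m "First commit"

# Tell the local repository where the hosted copy lives, then push.
git remote add origin "$demo/hosted.git"
git push -q origin HEAD

pushed=$(git -C "$demo/hosted.git" rev-list --count --all)
echo "commits now in hosted repository: $pushed"
```

With a real Github repository the only change should be the address given to `git remote add`, which Github displays for each repository.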
Working with repositories belonging to others:
This process has similarities to the workflow described above in which a repository is cloned and files are edited, staged, committed and pushed to a Github repository. However, there are two key differences:
- A copy of the original repository on Github must first be created on Github. This is done by Forking a project. Once a project has been forked, it may then be cloned down to a local machine.
- When files are eventually pushed back up to Github, they get pushed to the forked repository, not the repository from which the fork was copied. Whoever pushed the files must then submit a “pull request” asking for the changes to be “pulled” from the Github fork into the real source repository. The owner of the actual source repository will then review the changes and merge them into that repository or not, as the owner sees fit.
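The two differences above can be imitated on a single machine, with the caveat that forking and pull requests are features of the Github web site rather than Git commands. In this sketch, which assumes Git is installed, upstream.git stands in for the other person's repository and fork.git for your fork; both names, along with the file and identity, are hypothetical.

```shell
#!/bin/sh
# Imitate the fork workflow locally: changes land in the fork, not the original.
set -e
demo=$(mktemp -d)                           # throwaway location for the demo
cd "$demo"

# Stand-in for someone else's repository on Github, holding one commit.
git init -q seed
git -C seed config user.name "Project Owner"    # placeholder identity
git -C seed config user.email "owner@example.com"
echo "Some projcet" > seed/README.txt           # typo left in on purpose
git -C seed add README.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git

# Forking on Github makes a server-side copy; a bare clone plays that role here.
git clone -q --bare upstream.git fork.git

# Clone the fork (not the original) down to the local machine.
git clone -q fork.git work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"

# Edit, stage, commit, then push. The push goes to the fork.
echo "Some project" > README.txt
git add README.txt
git commit -q -m "Fix typo in README"
git push -q origin HEAD

fork_commits=$(git -C "$demo/fork.git" rev-list --count --all)
upstream_commits=$(git -C "$demo/upstream.git" rev-list --count --all)
echo "fork: $fork_commits commits, original: $upstream_commits commits"
# The original only changes if its owner accepts a pull request, which is
# opened and reviewed on the Github web site rather than from this script.
```

After the push the fork is one commit ahead of the original, and that difference is what a pull request asks the owner to review.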
In summary, Github functions as a repository for both private projects and world class open source software. However, there is much more to Github. It provides certain statistics, the ability to get notifications of project changes and the ability to communicate with repository owners and pull request submitters. There is even more than this to Github, but the discovery of that, as they say, will be left as an exercise for the reader.