Last week, I wrote about my new MSc student and that he was writing code to do a load of clever spectral processing. In that post I mentioned sharing his code via something called GitHub. Today, I thought I’d take a moment to explain what the hell a ‘Git’ is and why everyone should be interested in them. Also, there will be cake.
The basic idea behind ‘Git’ storage of things like code and text documents is version control for shared work. If, for example, I have written a document that I need to share with 5 people, then either I give it to all 5 at the same time and later try to marry up the changes, or I give it to each in turn. Both methods are time-consuming and inefficient ways of sharing that document.
Git is essentially a system where multiple readers can all edit the document at the same time, and submit version changes to a central location or repository. This repository can be cloud- or locally-based, but essentially you can see who made which changes – and can then revert them if required.
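If you like to see an idea spelled out, here is a toy sketch in Python of what ‘see who made which changes and revert them’ means. It is only an illustration under my own assumptions: the class name ToyRepository, the authors and the recipe lines are all made up, and real Git stores its history in a far more sophisticated way.

```python
# Toy model of a shared document with a history of versions.
# This is NOT how Git works internally; it only illustrates the idea of
# recording who changed what, and being able to undo a change.

class ToyRepository:
    def __init__(self, text, author):
        # Each history entry records who saved the version and what it said.
        self.history = [(author, text)]

    def commit(self, author, new_text):
        """Save a new version on top of the existing ones."""
        self.history.append((author, new_text))

    def current(self):
        """Return the text of the latest version."""
        return self.history[-1][1]

    def revert(self, author):
        """Undo the latest change by re-saving the version before it."""
        if len(self.history) < 2:
            raise ValueError("nothing to revert")
        previous_text = self.history[-2][1]
        self.commit(author, previous_text)


# Hypothetical example data: a two-line recipe and one dubious edit.
repo = ToyRepository("200g flour\n200g sugar", author="me")
repo.commit("alice", "200g flour\n200g sugar\n1 whole cod")
repo.revert("me")  # on reflection, no cod

# Print the full history: version number, who saved it, and the text.
for number, (who, text) in enumerate(repo.history, start=1):
    print(number, who, repr(text))
```

Note that the revert does not delete anything. The cod version stays in the history, so you can always see that someone suggested it.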
It sounds basic, but many programs just can’t manage this kind of versioning well, and companies offering ‘Git’ repository management are filling the void. The best known of these is GitHub, but there are alternatives like CodePlane and Bitbucket. Personally, until this week I had never looked further than GitHub, but some recent news stories have given me pause to reflect on possible alternatives.
Whichever of these providers you use, they all work along similar lines and use similar language – which I will now do my best to explain. Git is primarily used for coding projects, so when I first tried to use it I found the terminology and instructions more than a little perplexing, and it took me a long time to realise that behind it was actually something really useful.
For the purposes of helping to explain how Gits work, I am going to use some example code/text – which I have chosen to be a cake recipe. Now I made this cake for my son’s birthday (something I was very proud of) but many pointed out that it was a very basic recipe.
So using the power of Git, we are going to collectively improve the recipe. By Wednesday next week I’ll bake whatever final recipe we have created. The only rule for making changes is that whatever you suggest has to be actually food/edible and use ingredients I can reasonably source – I’ll go out of my way to find things, but be reasonable! But other than that, go nuts – if you make me bake a cake with Cod and Jaffa Cakes in, I will give it a go.
Okay, so a Git repository exists in two places – local and in the cloud. The provider of the Git (in this case GitHub) keeps the central copy and you have a local copy that you can edit – then ‘sync’ your local copy to the central copy.
The first thing you’ll need to do is create a free GitHub account on their website. If you also want to edit files locally (not required) then you will need to download and install their native app from the Set Up Git page. Ignore all the scary-looking command line things; none of that is important.
With your GitHub account, you can upload your own files and edit other people’s public ones.
I only briefly explained this earlier but a repository is kind of like a folder. Each repository can contain many files, but the idea is that they are all related. For project ‘CAKE’ I have set up a repository here;
This contains two files : a Readme file explaining a bit about the project, and the cake recipe file.
If you click on the recipe file, it will display the content and you’ll have some options at the top including ‘Edit’, which, unsurprisingly, allows you to edit the text file.
Obviously, this is just taking you through what my repository looks like. If you want to make your own, then just find the green button on the landing page – once you’ve logged in. The whole ethos behind Git is open source and sharing, so you’ll notice that when you make a repository they’ll want you to pay for the storage space.
Rather than create your own repository from scratch, another powerful feature of Git is Forking. This is where you essentially take a copy of the project and move it into your own repository. Now any changes you make will not affect the original repository. This allows you to work on a separate version perhaps for a different purpose (perhaps you want to modify the recipe to make it into a bacon sandwich).
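As a rough sketch of why a fork is safe to play with, the Python below treats a repository as nothing more than a dictionary of files. The file name recipe.txt and the ingredients are assumptions of mine for illustration, not the contents of the real repository.

```python
import copy

# Hypothetical stand-in for a repository: file name -> list of lines.
original = {"recipe.txt": ["200g flour", "200g sugar", "3 eggs"]}

# In this toy model, forking is just taking a fully independent copy.
fork = copy.deepcopy(original)

# Edit the fork only.
fork["recipe.txt"].append("6 rashers of bacon")

print(original["recipe.txt"])  # the original recipe is untouched
print(fork["recipe.txt"])      # the fork has the extra ingredient
```

The deep copy is the important part: a plain assignment would leave both names pointing at the same recipe, and the bacon would end up in the original as well.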
Although your fork now exists as a separate copy of the original, the two repositories remain linked, so that if at some later date you think your changes actually would be good in the original (your bacon sandwich turned out to be a bacon cake) you can submit a Pull Request.
A pull request is a simple solution for re-combining your ‘Forked’ copies. Instead of laboriously repeating all the changes in your copy in the original version, a pull request simply sends a request to amend the code at the relevant places and tries to ‘match up’ the two versions.
However, unlike editing the file directly, these requests have to be authorised before they are added to the code. This is a good way of preventing lots of small spurious edits, and puts a layer of control between your code and everyone who is working on it.
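To make the ‘match up, then authorise’ idea a little more concrete, here is a minimal Python sketch using the standard difflib module. The review function and its approved flag are hypothetical names of my own, and the sketch assumes the original has not changed since the fork was taken. Real pull requests cope with much messier situations.

```python
import difflib

# Hypothetical recipe lines: the original, and a fork with one addition.
original = ["200g flour", "200g sugar", "3 eggs"]
forked = ["200g flour", "200g sugar", "3 eggs", "6 rashers of bacon"]

# A pull request boils down to "here are the lines that differ".
proposed = list(
    difflib.unified_diff(
        original, forked, fromfile="original", tofile="fork", lineterm=""
    )
)
for line in proposed:
    print(line)


def review(original_lines, forked_lines, approved):
    """Return the recipe the original repository ends up with.

    Simplification: if the owner approves, the original simply takes on
    the fork's lines; if not, it stays exactly as it was.
    """
    return list(forked_lines) if approved else list(original_lines)


merged = review(original, forked, approved=True)
rejected = review(original, forked, approved=False)
print(merged)
print(rejected)
```

Lines in the printed diff that start with a plus sign are the additions being proposed. Nothing reaches the original until the owner says yes.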
I need cake
So, that should sort of get you started with Git repositories and GitHub. I’ve found it very useful for dealing with lots of multi-person input projects, like documents or code. I’ve even found it useful in tracking the versioning on files that aren’t directly editable in GitHub.
The easiest way to understand Git is to actually give it a go, so please, go sign up and edit my cake recipe – it really needs improving!
Update: THE CAKE WAS MADE! It was… very chocolaty! Go try the recipe for yourself 😀