\author{Jonathan Hodgson (Archie)}
A few people have asked me about Git recently.
Git has become the de facto standard for version control in most situations.
Microsoft recently moved to Git for version controlling Windows and Office.
\frametitle{What is Git}
A very versatile Version Control System
\item Keep track of source code (or other folders and files) and its history
\item Facilitate collaboration
\item Distributed
Git is still being developed.
Being distributed means you can work on repositories offline (unlike SVN).
It's useful even if you're working on things by yourself. This presentation is version controlled.
You can use it to find out when something broke. I won't be covering it today but there is a tool called git bisect that can take a unit test (or script) to analyse when something broke using a binary search.
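The binary search idea behind git bisect can be sketched in a few lines of Python. This is not how Git implements it; it is a minimal sketch that assumes a linear history, a known good first commit, a known bad last commit, and a test that stays failing once it starts failing. The names \mintinline{python}{first_bad}, \mintinline{python}{history} and \mintinline{python}{is_good} are made up for the example.

```python
def first_bad(commits, is_good):
    """Return the index of the first commit for which is_good() is False.

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical history of ten commits; the bug arrives in "c5".
history = [f"c{i}" for i in range(10)]
tested = []


def is_good(commit):
    # Stand-in for running a unit test or script against a checkout.
    tested.append(commit)
    return int(commit[1:]) < 5


culprit = history[first_bad(history, is_good)]
print(culprit, "found after testing", len(tested), "commits")
```

The point is that the number of checkouts you have to test grows with the logarithm of the history length rather than with the length itself.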
\frametitle{Obligatory XKCD Comic}
I have done this
Git has a reputation for being hard.
Its interface abstracts away a lot of the work, meaning its commands can feel like magic. When it works, this is fine, but unfortunately when things go wrong you can be left (like in this comic) with no idea how to proceed.
Its interface can be confusing: there are some commands that do a lot (checkout) and there are often multiple ways to achieve something.
I think that understanding a bit about how Git works under the hood will help demystify it.
Git's data model is actually quite simple (beautiful even). Understanding the basics of this can really help.
# Ubuntu / Debian / Kali
sudo apt install git
# CentOS / Fedora / Red Hat
sudo dnf install git
# Arch / Antergos / Manjaro
sudo pacman -S git
# Mac
brew install git
# Get the Version
git --version
\href{}{Git for Windows:}
Git is probably already installed if you are on a Linux system. However, if not, it will definitely be in your standard repositories.
There is a version of Git provided with Xcode, but it is old. Most of the stuff we cover today should still work but (for example) some things need to be run from the root directory in old versions of Git that don't in newer versions.
If you have the misfortune to be using Windows, I've heard good things about Git for Windows but have not used it personally. It includes Bash emulation.
Hopefully you have a version greater than 2.23.0 - if not, it's not the end of the world.
\frametitle{Setting It Up}
git config --global user.name "Jonathan Hodgson"
git config --global user.email ""
Hopefully you have Git installed. I will be running it on Linux although the commands should all be the same for Windows and Mac.
Note that I am not using my primary email address. The email address you provide here will be available to anyone with access to repositories you work on.
These settings are stored in \mintinline{bash}{~/.gitconfig}.
\frametitle{Setting It Up}
\textbf{Pick One}
# Set editor to vim
git config --global core.editor "vim"
# Set editor to nano
git config --global core.editor "nano"
# Set editor to VS Code
git config --global core.editor "code -w"
# Set editor to Sublime
git config --global core.editor "subl -w"
There are several times that Git will need to open a text editor. By default, it will use \mintinline{bash}{$VISUAL} or \mintinline{bash}{$EDITOR}. If neither is set, it will use vi.
Note that if you are using a GUI editor, you might have to set the wait flag. This makes it so the executable doesn't return until you close it.
\frametitle{Setting It Up}
\textbf{Pick One}
# No colour
git config --global color.ui never
# Auto colour
git config --global color.ui auto
# Force colour
git config --global color.ui always
# Override for a command
git -c color.ui=always status > ~/some-file
On Linux systems, this is set to auto by default. Might be different on a Mac. Generally auto is probably what you want. It will be coloured unless you are piping the output to a file or another process.
Take note of the incorrect spelling of colour.
You can override all configuration options on an individual command basis if you like.
\textbf{Blob} In Git, a file is called a blob.
\textbf{Tree} In Git, a directory is called a tree.
\textbf{Commit} A snapshot of your code
All of these are referenced by a hash and stored in the \mintinline{bash}{.git/objects/} directory.
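To make ``referenced by a hash'' a bit more concrete, here is a small Python sketch of how an object ID can be computed: Git hashes a short header (the object type and the size of the content) followed by the content itself. The function name \mintinline{python}{git_object_id} is made up for this example, and the sketch assumes the SHA-1 object format.

```python
import hashlib


def git_object_id(kind, body):
    """Hash '<kind> <size>' + NUL byte + body, as Git does for its objects."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


empty_blob = git_object_id("blob", b"")
hello = git_object_id("blob", b"hello world\n")

print(empty_blob)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(hello)
# The first two hex characters become a directory name under .git/objects.
print(".git/objects/" + hello[:2] + "/" + hello[2:])
```

Because the ID depends only on the type and the content, two files with identical content end up as the same blob, no matter what they are called.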
Most Git tutorials I have come across focus on memorizing commands. This way, the commands feel like magic and there is never really any understanding of what the commands do under the hood.
\frametitle{Naïve Approach}
.1 Project.
.2 draft.
.3 some.
.3 files.
.2 final-draft.
.3 some.
.3 files.
.2 final.
.3 some.
.3 files.
.2 real-final.
.3 some.
.3 files.
.2 actual-real-final.
.3 some.
.3 files.
I think, being honest, we have all done this. This sort of works, if you're working on something by yourself. Once you start collaborating on software, you are going to have a bad time.
However, this is a simple approach and not a million miles from what Git does internally.
I want this to be quite interactive so, first things first, let's get Git set up.
\frametitle{Model it}
\node[commit,minimum size=2cm] at (0,0) (commit1) {Draft};
\node[commit,minimum size=2cm] at (3,0) (commit2) {Final Draft};
\node[commit,minimum size=2cm] at (6,0) (commit3) {Final};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit3);
\node[draw,text width=1.8cm,anchor=north,align=center] at (0, -1.5) {\small \vdots\\[0.1cm] };
\node[draw,text width=1.8cm,anchor=north,align=center] at (3, -1.5) {\small \vdots\\[0.1cm] Draft };
\node[draw,text width=1.8cm,anchor=north,align=center] at (6, -1.5) {\small \vdots\\[0.1cm] Final Draft };
\item This is a simple representation of the folder structure we saw, although for simplicity, I'm only showing 3 revisions.
\item Notice that, so the computer knows the order, each ``snapshot'' includes a reference to the previous snapshot.
\node[commit] at (0,0) (commit1) {93e4d3d\ldots};
\node[commit] at (3,0) (commit2) {2557962\ldots};
\node[commit] at (6,0) (commit3) {0d68560\ldots};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit3);
\node[draw,text width=1.8cm,anchor=north,align=center] at (0, -1.5) {\small \vdots\\[0.1cm] };
\node[draw,text width=1.8cm,anchor=north,align=center] at (3, -1.5) {\small \vdots\\[0.1cm] 93e4d3d\ldots };
\node[draw,text width=1.8cm,anchor=north,align=center] at (6, -1.5) {\small \vdots\\[0.1cm] 2557962\ldots };
\item Rather than human readable names, Git references each snapshot (called a commit) by a cryptographic hash. It currently uses a hardened SHA-1, but there is an effort to move to SHA-256.
\item Similarly to the model above, each commit references the previous (except the first obviously)
\item The commit also includes meta information such as the committer, a timestamp and a message.
\item We will look at this in more detail a bit later.
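The chain of snapshots can be modelled with a toy example in Python. This is a deliberately simplified sketch: a real commit also records a tree, an author, a committer and timestamps, and the record format below is made up. It only shows that each commit's hash covers the hash of its parent, and that history is read by walking backwards from the newest commit.

```python
import hashlib


def make_commit(message, parent=None):
    """Build a simplified commit record and return (hash, text)."""
    lines = []
    if parent is not None:
        lines.append(f"parent {parent}")
    lines.append("")
    lines.append(message)
    text = "\n".join(lines)
    return hashlib.sha1(text.encode()).hexdigest(), text


store = {}  # stand-in for the object store: hash -> commit text
tip = None
for message in ["Draft", "Final draft", "Final"]:
    tip, text = make_commit(message, tip)
    store[tip] = text

# Walk back from the newest commit by following the parent lines.
history = []
cursor = tip
while cursor is not None:
    lines = store[cursor].splitlines()
    history.append(lines[-1])
    print(cursor[:7], lines[-1])
    cursor = lines[0].split()[1] if lines[0].startswith("parent ") else None
```

Changing anything in an old commit would change its hash, which would change the parent line (and so the hash) of every commit after it.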
\frametitle{Commits / Branches}
\node[commit] at (0,0) (commit1) {};
\node[commit] at (2,0) (commit2) {A};
\node[commit] at (4,0) (commit3) {B};
\node[commit] at (4,-2) (commit3b) {C};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit3);
\draw[arrow] (commit2) -- (commit3b);
The linear graph we just saw is an overly simplistic representation. In reality, Git represents history using a directed acyclic graph, which allows a parent to be shared by multiple commits. This is useful because it allows for branches. We will look at these a bit more later.
It is good practice to develop features on a separate branch. This allows for multiple people to work on a project as well as allowing things like bug-fixes to be deployed without having to worry about interference from a new feature.
\frametitle{Commits / Branches}
\node[commit] at (0,0) (commit1) {};
\node[commit] at (2,0) (commit2) {};
\node[commit] at (5,0) (commit4) {A};
\node[commit] at (8,0) (commit5) {C};
\node[commit] at (4,-2) (commit3b) {};
\node[commit] at (6,-2) (commit4b) {B};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit4);
\draw[arrow] (commit4) -- (commit5);
\draw[arrow] (commit2) -- (commit3b);
\draw[arrow] (commit3b) -- (commit4b);
\draw[arrow] (commit4b) -- (commit5);
Just as two commits can share a parent, the opposite is also true: here, we see that a commit can have multiple parents.
This is called a merge commit - because it merges two branches. In a lot of situations git is smart enough to auto-merge branches although at times human intervention is necessary.
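The graph on this slide can be written down as a plain mapping from each commit to its parents. The Python sketch below does that, with made-up names (\mintinline{python}{root}, \mintinline{python}{base}, \mintinline{python}{x}) for the unlabelled commits, and shows that a merge commit is simply a commit with more than one parent.

```python
# Each commit maps to the list of its parents (names are illustrative).
parents = {
    "root": [],
    "base": ["root"],
    "A": ["base"],    # top branch
    "x": ["base"],    # bottom branch
    "B": ["x"],
    "C": ["A", "B"],  # merge commit: two parents
}


def ancestors(commit):
    """Return every commit reachable by following parent links."""
    seen = set()
    stack = list(parents[commit])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


merges = [name for name, ps in parents.items() if len(ps) > 1]
shared = ancestors("A") & ancestors("B")

print("merge commits:", merges)
print("history shared by A and B:", sorted(shared))
```

The shared history is where the two branches diverged from, which is the starting point Git needs when it tries to auto-merge.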
By default, Git creates a branch called master when you create a repository.
\frametitle{Create a repository}
Do this in a live terminal. MAKE SURE YOU MAKE YOUR FONT BIGGER
Show that the \mintinline{bash}{.git} folder has been created and do a tree to show what is in it.
\frametitle{Git status}
Create repo and create a file called Make sure to mark it as executable.
Here we see the branch we are on (master), we are told that there are no commits yet, and we see that Git can see the file we've just made but isn't tracking it.
\frametitle{Staging Area}
# Add files / or directories
git add <file|directory> [<file|directory>...]
# Add everything not in gitignore
git add -A
The staging area is where you put things that you want to be committed.
It can often be useful to manually split changes up into different commits. You might be working on feature A and feature B simultaneously. It is good practice to have each feature as a separate commit so you could add feature A to the staging area, commit it, then do the same for feature B.
We will talk about \mintinline{bash}{.gitignore} in a bit.
\frametitle{Staging Area}
Here we can use git status to see what is in the staging area. Staged files are listed in the ``Changes to be committed'' section. By default, they will also be green if you have colour switched on.
git commit
\item First line should be concise summary around 50 chars
\item Body should be wrapped to around 70 chars
\item There should be an empty line separating summary from body
\item If contributing to a project, check per-project guidelines
\item Normally in or similar
\item Use the imperative: ``Fix bug'' and not ``Fixed bug'' or ``Fixes bug.''
First line is often shown by various tools
70 chars allows for good email etiquette: it leaves room within an 80 char hard wrap after a few reply indents.
Generally you will want to write in imperative as this is what automatic commits like merge do.
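The layout rules above are mechanical enough to check with a script. Here is a rough Python sketch of such a check. The function name \mintinline{python}{check_message} is made up, the limits are treated as hard numbers even though the guidance is only ``around'' 50 and 70, and it makes no attempt to detect whether the summary is in the imperative.

```python
def check_message(message, summary_limit=50, body_limit=70):
    """Return a list of layout problems found in a commit message."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]
    problems = []
    if len(lines[0]) > summary_limit:
        problems.append(f"summary is longer than {summary_limit} chars")
    if len(lines) > 1 and lines[1].strip():
        problems.append("no empty line between summary and body")
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_limit:
            problems.append(f"line {number} is longer than {body_limit} chars")
    return problems


good = (
    "Fix crash when config file is missing\n"
    "\n"
    "Fall back to the defaults instead of raising an error."
)
bad = "Fixed the bug\nwith no blank line"

print(check_message(good))
print(check_message(bad))
```

Something like this could be adapted into a commit hook, although per-project guidelines should take priority over these defaults.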
\frametitle{When should you commit?}
\framesubtitle{Commit early, commit often}
\item Every time you complete a small change or fix a bug
\item You don't normally want to commit broken code (intentionally at least)
\item In some instances you might want to auto-commit - but probably not too often.
\item Normally this works if changes can't break something. E.g. Password Manager
Unfortunately, this doesn't have one simple answer.
One example where auto-committing makes sense is your password manager.
\frametitle{Commit Messages}
In case you hadn't noticed, I quite like Randall Munroe.
I am bad for this, particularly on personal projects.
# Open editor for message
git commit
# Read message from file
git commit -F <file or - for stdin>
# Provide message directly
git commit -m "<message>"
Running git commit will open your editor.
I only really use -F if I am doing so from a script
# Diff between the staging area and the working tree (unstaged changes)
git diff
# Diff between 2 commits or references
git diff commit1..commit2
# Restrict the diff to a single file
git diff a/file
Diff is pretty smart. It will normally work for whatever combinations of commits, references (more on that later) or files.
These are hopefully quite easy to understand. Red lines mean a line was removed, green means a line was added.
Here we see a commit done with the -m flag. I generally only use -m if it is a trivial change like this and there is no need to have a body.
You can see that the log shows a list of the two commits we have made on this project.
The git log command has a lot of flags. We will see some of them later.
\frametitle{Under the hood}
I said earlier that we would be looking at how all this works at quite a low level. This is where that starts.
We can also use the cat-file command built into Git to do the same thing. We can see that commits, trees and blobs are all stored in the same way.
You will also see that you can often use a prefix of the first 4 or more characters of a hash. It is quite common to use the first 7 or 8.
Hopefully you will see from this that the inner workings of Git aren't that complicated.
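What we just did by hand can be imitated in Python. This is a sketch under simplifying assumptions: a dictionary stands in for the \mintinline{bash}{.git/objects} directory, packed objects are ignored, and the function names are made up. It stores an object as zlib-compressed bytes keyed by its hash, then reads it back using only a hash prefix.

```python
import hashlib
import zlib


def store_object(objects, kind, body):
    """Compress '<kind> <size>' + NUL + body and file it under its SHA-1."""
    raw = f"{kind} {len(body)}".encode() + b"\0" + body
    object_id = hashlib.sha1(raw).hexdigest()
    objects[object_id] = zlib.compress(raw)
    return object_id


def cat_file(objects, prefix):
    """Look an object up by a unique hash prefix and return (kind, body)."""
    matches = [oid for oid in objects if oid.startswith(prefix)]
    if len(matches) != 1:
        raise KeyError(f"{prefix!r} matches {len(matches)} objects")
    raw = zlib.decompress(objects[matches[0]])
    header, _, body = raw.partition(b"\0")
    kind, size = header.decode().split()
    if int(size) != len(body):
        raise ValueError("size in header does not match the content")
    return kind, body


objects = {}  # stand-in for .git/objects
blob_id = store_object(objects, "blob", b"echo hello\n")
kind, body = cat_file(objects, blob_id[:7])
print(kind, body)
```

The same two functions would work for commits and trees, because only the type in the header differs.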
\item We have just seen that commits are simply (compressed) text files, addressed by a hash.
\item References are a way of addressing them without remembering the hash.
\item Unlike the hashes, references can change - and they do change.
We've seen a couple of these (sort of)
master and HEAD
There are two references we can see here, master and HEAD.
\item References are stored in the \mintinline{bash}{.git/refs} folder
\item The \mintinline{bash}{heads} folder contains references to the heads (or tips) of all local branches
\item The HEAD reference is stored directly in the \mintinline{bash}{.git} folder.
\item It refers to the ``current'' commit. It is how Git knows where you are.
\item This normally refers to a branch's head commit.
\item In some situations it will refer to a commit directly.
Not sure why it is not in refs folder
If it refers directly to a commit, the repository is in what is called a ``detached HEAD'' state.
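Following a reference is only a matter of reading small text files. The Python sketch below builds a throwaway directory laid out like a \mintinline{bash}{.git} folder and resolves HEAD in both the normal and the detached case. The hash is made up, the function name \mintinline{python}{resolve_head} is my own, and the sketch ignores packed references, which real repositories may also use.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir):
    """Return (branch ref or None, commit hash) for the given .git folder."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]
        return ref, (git_dir / ref).read_text().strip()
    return None, head  # detached HEAD: the file holds a hash directly


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    commit = "93e4d3d" + "0" * 33  # made-up 40 character hash
    (git_dir / "refs" / "heads" / "master").write_text(commit + "\n")

    # Normal case: HEAD points at a branch, the branch points at a commit.
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    attached = resolve_head(git_dir)

    # Detached case: HEAD contains the commit hash itself.
    (git_dir / "HEAD").write_text(commit + "\n")
    detached = resolve_head(git_dir)

print(attached)
print(detached)
```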
This file tells git which files not to track.
This will not stop git tracking a file if it's already being tracked.
If you start tracking large binary files, git isn't going to be able to compress them. This will result in a massive repo and a headache for everyone. If at all possible, don't track large files, especially if they are going to be changed. Remember, git stores each version of each file. With text, this is fine as it can be compressed efficiently. If it's not text, it can't.
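A quick experiment with zlib (the compression library Git uses for its objects) illustrates the difference. This is only a rough sketch: the ``text'' is one line pair repeated many times, so it compresses far better than real source code would, and random bytes stand in for already-compressed binaries such as images or archives.

```python
import random
import zlib

size = 100_000

# Highly repetitive text, standing in for source code.
snippet = b"def add(a, b):\n    return a + b\n"
text = (snippet * (size // len(snippet) + 1))[:size]

# Random bytes, standing in for an already-compressed binary file.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(size))

text_ratio = len(zlib.compress(text)) / size
noise_ratio = len(zlib.compress(noise)) / size

print(f"text:  {text_ratio:.1%} of original size")
print(f"noise: {noise_ratio:.1%} of original size")
```

Every new version of a file like the second one costs roughly its full size again in the repository.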
You should probably also try to avoid including minified files as git won't be able to merge them automatically.
\item Allows multiple features to be developed in parallel without interference.
\item Allows multiple people to collaborate easily.
# List Branches
git branch # -v adds more info
# Create a branch called test
git branch test # or
cp .git/refs/heads/master .git/refs/heads/test
# Switch to new branch
git switch test # or
git checkout test
# Create and switch in one go
git switch -c test # or
git checkout -b test
Branches are represented in git as references in the heads folder.
They can be created by simply creating a file there.
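As a toy illustration in Python, here is the same idea in a throwaway directory laid out like the heads folder: ``creating a branch'' is just writing a second small file with the same commit hash in it. The hash is made up and nothing here touches a real repository.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    heads = Path(tmp) / ".git" / "refs" / "heads"
    heads.mkdir(parents=True)

    # An existing branch: a file containing a (made-up) commit hash.
    (heads / "master").write_text("2557962" + "0" * 33 + "\n")

    # "Create a branch" by copying that hash into a new file.
    (heads / "test").write_text((heads / "master").read_text())

    branches = {
        ref.name: ref.read_text().strip() for ref in sorted(heads.iterdir())
    }

for name, commit in branches.items():
    print(name, commit[:7])
```

Both names point at the same commit until a new commit moves one of them, which is why creating a branch is so cheap.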
The git checkout command does A LOT of stuff. It can be confusing, so its functionality has been split up into several smaller commands, such as git switch. If you have Git 2.23.0 or newer, you will be able to use them.
Be aware that a lot of tutorials etc. will use the checkout command. Version 2.23.0 was released in August 2019.
As we saw, there are numerous ways to create the branch.
What is interesting to note here is that both are still currently pointing at the same commit.
HEAD is pointing at test so any new commits will be on this branch.
Also take note of the git log command. \mintinline{bash}{--oneline} shows a short version and \mintinline{bash}{--all} shows all branches.
\frametitle{Differing Branches}
\frametitle{Differing Branches}
This shows what would be needed to take you from master to test.
Notice the \mintinline{bash}{--graph} flag, which adds the drawing to the left.
\frametitle{Simple Merge}
After working on separate branches, you will probably want to merge them eventually.
In this situation, Git was able to work everything out itself.
\frametitle{Tidy Up}
\frametitle{Git $\ne$ GitHub}
\item GitHub is one of many services that offer hosting for git remotes
\item It is owned by Microsoft now
\item It is not open source
\item You can't self host it
\item It is very popular
\frametitle{Useful supporting tools}
Bat is described as cat with wings.
It adds syntax highlighting to files. Useful even if you're not using Git
As this is a git talk, it shows lines that have changed since the last commit
\frametitle{Useful supporting tools}
\framesubtitle{RipGrep / Fd}
Alternatives to grep and find
Fd, in particular, is not a full replacement for find but does most of what you want
Both (by default) will respect your gitignore file.
\frametitle{Useful supporting tools}
This is a tool that can make your diff output look better.
\frametitle{Useful supporting tools}
\framesubtitle{Shell Integration}
Takes 2 forms. Prompt and completion
\frametitle{Useful supporting tools}
Password manager
\frametitle{Useful supporting tools}
\framesubtitle{BFG Repo Cleaner}
\item You'll need something like this when you realise you have just committed your ssh keys
\item Mistakes happen
For the time that you accidentally commit your ssh keys.
I accidentally committed a database for a WooCommerce site.
%\textbf{Staging area} Waiting area before a commit