It's useful even if you're working on things by yourself. This presentation is version controlled.
You can use it to find out when something broke. I won't be covering it today, but there is a tool called git bisect that takes a unit test (or script) and uses a binary search to find the commit where something broke.
Its interface abstracts away a lot of the work, meaning its commands can feel like magic. When it works, this is fine, but when things go wrong you can be left, like in this comic, with no idea how to proceed.
Its interface can be confusing: some commands do a lot (checkout) and there are often multiple ways to achieve the same thing.
I think that understanding a bit about how Git works under the hood will help demystify it.
Git's data model is actually quite simple (beautiful even). Understanding the basics of this can really help.
\href{https://gitforwindows.org/}{Git for Windows: https://gitforwindows.org/}
\note{%
Git is probably already installed if you are on a Linux system. If not, it will definitely be in your distribution's standard repositories.
There is a version of Git provided with Xcode, but it is old. Most of what we cover today should still work, but (for example) older versions of Git require some commands to be run from the repository's root directory, whereas newer versions do not.
If you have the misfortune to be using Windows, I've heard good things about Git for Windows but have not used it personally. It includes Bash emulation.
Hopefully you have Git installed. I will be running it on Linux although the commands should all be the same for Windows and Mac.
Note that I am not using my primary email address. The email address you provide here will be available to anyone with access to repositories you work on.
These settings are stored in \mintinline{bash}{~/.gitconfig}.
There are several times when Git will need to open a text editor. By default, it will use \mintinline{bash}{VISUAL} or \mintinline{bash}{EDITOR}. If neither is set, it will use vi.
On Linux systems, this is set to auto by default. Might be different on a Mac. Generally auto is probably what you want. It will be coloured unless you are piping the output to a file or another process.
Take note of the incorrect spelling of colour.
You can override all configuration options on an individual command basis if you like.
}
\end{frame}
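\begin{frame}[fragile]
\frametitle{Configuration}
\framesubtitle{A minimal sketch}
A rough sketch of the settings mentioned in the notes. The name, email address and editor are placeholders, and \mintinline{bash}{HOME} is pointed at a scratch directory so your real configuration is left alone.
\begin{minted}{bash}
# Throwaway HOME so the real ~/.gitconfig is not modified.
export HOME="$(mktemp -d)"
git config --global user.name "A N Other"          # placeholder
git config --global user.email "ano@example.com"   # placeholder
git config --global core.editor nano               # placeholder
git config --global color.ui auto                  # spelling: color
cat ~/.gitconfig                    # where the settings are stored
# Override one option for a single command with -c.
git -c color.ui=never config --get color.ui
\end{minted}
\end{frame}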
\begin{frame}
\frametitle{Terminology}
\framesubtitle{Objects}
\textbf{Blob} In Git, a file is called a blob.
\textbf{Tree} In Git, a directory is called a tree.
\textbf{Commit} A snapshot of your code.
All of these are referenced by a hash and stored in the \mintinline{bash}{.git/objects/} directory.
\note{%
Most Git tutorials I have come across focus on memorizing commands. This way, the commands feel like magic and there is never really any understanding of what the commands do under the hood.
}
\end{frame}
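\begin{frame}[fragile]
\frametitle{Terminology}
\framesubtitle{Objects: a quick look}
A minimal sketch, assuming a scratch repository and a hypothetical file \mintinline{bash}{hello.txt}, of how the three object types can be inspected.
\begin{minted}{bash}
cd "$(mktemp -d)" && git init -q
echo "hello" > hello.txt
git add hello.txt
# Placeholder identity, passed per command so no config changes.
git -c user.name="A N Other" -c user.email="ano@example.com" \
    commit -q -m "Add hello.txt"
git cat-file -t HEAD             # commit
git cat-file -t 'HEAD^{tree}'    # tree
git cat-file -t HEAD:hello.txt   # blob
git cat-file -p HEAD:hello.txt   # prints the file contents
ls .git/objects                  # where the objects are stored
\end{minted}
\end{frame}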
\begin{frame}
\frametitle{Naïve Approach}
\dirtree{%
.1 Project.
.2 draft.
.3 some.
.3 files.
.2 final-draft.
.3 some.
.3 files.
.2 final.
.3 some.
.3 files.
.2 real-final.
.3 some.
.3 files.
.2 actual-real-final.
.3 some.
.3 files.
}
\note{%
I think, being honest, we have all done this. This sort of works, if you're working on something by yourself. Once you start collaborating on software, you are going to have a bad time.
However, this is a simple approach and not a million miles from what Git does internally.
I want this to be quite interactive so, first things first, let's get Git set up.
}
\end{frame}
\begin{frame}
\frametitle{Model it}
\begin{center}
\begin{tikzpicture}
%\draw (-1.5,-1.5) rectangle (7.5,1.5);
%\node at (-2.5,0) {master};
\node[commit,minimum size=2cm] at (0,0) (commit1) {Draft};
\node[commit,minimum size=2cm] at (3,0) (commit2) {Final Draft};
\node[commit,minimum size=2cm] at (6,0) (commit3) {Final};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit3);
\node[draw,text width=1.8cm,anchor=north,align=center] at (0, -1.5) {\small \vdots\\[0.1cm] };
\node[draw,text width=1.8cm,anchor=north,align=center] at (3, -1.5) {\small \vdots\\[0.1cm] Draft };
\node[draw,text width=1.8cm,anchor=north,align=center] at (6, -1.5) {\small \vdots\\[0.1cm] Final Draft };
\end{tikzpicture}
\end{center}
\note{%
\begin{itemize}
\item This is a simple representation of the folder structure we saw, although for simplicity, I'm only showing 3 revisions.
\node[draw,text width=1.8cm,anchor=north,align=center] at (0, -1.5) {\small \vdots\\[0.1cm] };
\node[draw,text width=1.8cm,anchor=north,align=center] at (3, -1.5) {\small \vdots\\[0.1cm] 93e4d3d\ldots };
\node[draw,text width=1.8cm,anchor=north,align=center] at (6, -1.5) {\small \vdots\\[0.1cm] 2557962\ldots };
\end{tikzpicture}
\end{center}
\note{%
\begin{itemize}
\item Rather than human-readable names, Git references each snapshot (called a commit) by a cryptographic hash. It currently uses a hardened SHA-1, but there is an effort to move to SHA-256.
\item Similarly to the model above, each commit references the previous one (except the first, obviously).
\item The commit also includes metadata such as the committer, a timestamp and a message.
\item We will look at this in more detail a bit later.
\end{itemize}
}
\end{frame}
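\begin{frame}[fragile]
\frametitle{Commits}
\framesubtitle{Inspecting a commit}
A small sketch, assuming a scratch repository with two empty commits, of how a commit records its parent and metadata. The \mintinline{bash}{commit} helper and the identity are made up for illustration.
\begin{minted}{bash}
cd "$(mktemp -d)" && git init -q
commit() {  # hypothetical helper: empty commit, placeholder identity
  git -c user.name="A N Other" -c user.email="ano@example.com" \
      commit -q --allow-empty -m "$1"
}
commit "Draft"
commit "Final draft"
git cat-file -p HEAD      # tree, parent, author, committer, message
git rev-parse 'HEAD^'     # the hash shown on the parent line
git cat-file -p 'HEAD^'   # the first commit has no parent line
\end{minted}
\end{frame}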
\begin{frame}
\frametitle{Commits / Branches}
\begin{center}
\begin{tikzpicture}
%\draw (-1.5,-1.5) rectangle (7.5,1.5);
%\node at (-2.5,0) {master};
\node[commit] at (0,0) (commit1) {};
\node[commit] at (2,0) (commit2) {A};
\node[commit] at (4,0) (commit3) {B};
\node[commit] at (4,-2) (commit3b) {C};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit3);
\draw[arrow] (commit2) -- (commit3b);
\end{tikzpicture}
\end{center}
\note{%
The linear graph we just saw is an overly simplistic representation. In reality, Git represents history using a directed acyclic graph, which allows a parent to be shared by multiple commits. This is useful because it allows for branches. We will look at these a bit more later.
It is good practice to develop features on a separate branch. This allows multiple people to work on a project and lets things like bug fixes be deployed without having to worry about interference from a new feature.
}
\end{frame}
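\begin{frame}[fragile]
\frametitle{Commits / Branches}
\framesubtitle{Two commits sharing a parent}
A rough sketch of the graph on the previous slide, assuming a scratch repository. The branch name \mintinline{bash}{feature} and the \mintinline{bash}{commit} helper are made up for illustration.
\begin{minted}{bash}
cd "$(mktemp -d)" && git init -q
commit() {  # hypothetical helper: empty commit, placeholder identity
  git -c user.name="A N Other" -c user.email="ano@example.com" \
      commit -q --allow-empty -m "$1"
}
base="$(git symbolic-ref --short HEAD)"  # default branch name
commit "A"
git branch feature         # new reference pointing at A
commit "B"                 # B is added to the default branch
git checkout -q feature
commit "C"                 # C shares its parent A with B
git log --graph --oneline --all
\end{minted}
\end{frame}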
\begin{frame}
\frametitle{Commits / Branches}
\begin{center}
\begin{tikzpicture}
%\draw (-1.5,-1.5) rectangle (7.5,1.5);
%\node at (-2.5,0) {master};
\node[commit] at (0,0) (commit1) {};
\node[commit] at (2,0) (commit2) {};
\node[commit] at (5,0) (commit4) {A};
\node[commit] at (8,0) (commit5) {C};
\node[commit] at (4,-2) (commit3b) {};
\node[commit] at (6,-2) (commit4b) {B};
\draw[arrow] (commit1) -- (commit2);
\draw[arrow] (commit2) -- (commit4);
\draw[arrow] (commit4) -- (commit5);
\draw[arrow] (commit2) -- (commit3b);
\draw[arrow] (commit3b) -- (commit4b);
\draw[arrow] (commit4b) -- (commit5);
\end{tikzpicture}
\end{center}
\note{%
As well as two commits being able to share a parent, the opposite is also true: here, we see that a commit can have multiple parents.
This is called a merge commit because it merges two branches. In many situations Git is smart enough to merge branches automatically, although at times human intervention is necessary.
By default, Git creates a branch called master when you create a repository.
Here we see the branch we are on (master), we are told that there are no commits yet, and we see that Git can see the file we've just made but isn't tracking it.
The staging area is where you put things that you want to be committed.
It can often be useful to manually split changes up into different commits. You might be working on feature A and feature B simultaneously. It is good practice to have each feature as a separate commit so you could add feature A to the staging area, commit it, then do the same for feature B.
We will talk about \mintinline{bash}{.gitignore} in a bit.
Here we can use git status to see what is in the staging area. Staged files are listed in the ``Changes to be committed'' section. By default, they will also be green if you have colour switched on.
\item References are stored in the \mintinline{bash}{.git/refs} folder
\item The \mintinline{bash}{heads} folder contains references to the heads (or tips) of all local branches
\end{itemize}
\note{%
}
\end{frame}
\begin{frame}
\frametitle{References}
\framesubtitle{HEAD}
\begin{itemize}
\item The HEAD reference is stored directly in the \mintinline{bash}{.git} folder.
\item It refers to the ``current'' commit. It is how Git knows where you are.
\item This normally refers to a branch's head commit.
\item In some situations it will refer to a commit directly.
\end{itemize}
\note{%
I'm not sure why it is not in the refs folder.
If it refers directly to a commit, the repository is in what is called a ``detached HEAD'' state.
}
\end{frame}
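\begin{frame}[fragile]
\frametitle{References}
\framesubtitle{HEAD: a quick look}
A minimal sketch, assuming a scratch repository with a single empty commit and Git's default file-based reference storage, of what HEAD contains in each state.
\begin{minted}{bash}
cd "$(mktemp -d)" && git init -q
# Placeholder identity, passed per command so no config changes.
git -c user.name="A N Other" -c user.email="ano@example.com" \
    commit -q --allow-empty -m "First commit"
ls .git/refs/heads          # one file per local branch
cat .git/HEAD               # ref: refs/heads/<branch name>
git symbolic-ref HEAD       # the same reference, via Git
git checkout -q --detach    # point HEAD straight at the commit
cat .git/HEAD               # now a bare commit hash
\end{minted}
\end{frame}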
\begin{frame}[fragile]
\frametitle{.gitignore}
This file tells git which files not to track.
\begin{minted}{bash}
*.log
*.doc
*.pem
*.docx
*.jpg
*.jpeg
*.pdf
*.png
.DS_Store
*.min.css
*.min.js
dist/
\end{minted}
\note{%
This will not stop git tracking a file if it's already being tracked.
If you start tracking large binary files, git isn't going to be able to compress them. This will result in a massive repo and a headache for everyone. If at all possible, don't track large files, especially if they are going to be changed. Remember, git stores each version of each file. With text, this is fine as it can be compressed efficiently. If it's not text, it can't.
You should probably also try to avoid including minified files as git won't be able to merge them automatically.
}
\end{frame}
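\begin{frame}[fragile]
\frametitle{.gitignore}
\framesubtitle{Checking what is ignored}
A small sketch, assuming a scratch repository and made-up file names, of how to check a pattern and how to stop tracking a file that is already tracked.
\begin{minted}{bash}
cd "$(mktemp -d)" && git init -q
printf '*.log\ndist/\n' > .gitignore
touch debug.log notes.txt
git check-ignore -v debug.log   # shows the matching pattern
git status --short              # debug.log is not listed
# Force-add the file to mimic one that is already tracked.
git add -f debug.log
# Stop tracking it; the file itself stays on disk.
git rm -q --cached debug.log
\end{minted}
\end{frame}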
\begin{frame}[fragile]
\frametitle{Branches}
\begin{itemize}
\item Allows multiple features to be developed in parallel without interference.
\item Allows multiple people to collaborate easily.
Branches are represented in git as references in the heads folder.
They can be created by simply creating a file there.
The git checkout command does A LOT of stuff. This can be confusing, so its functionality has been split up into several smaller commands (git switch and git restore). If you have Git 2.23.0 or newer, you will be able to use them.
Be aware that a lot of tutorials etc. will use the checkout command. Version 2.23.0 was released in August 2019.
At times, git won't be able to merge automatically.
There are around a million different tools for dealing with merge conflicts, but I think they overcomplicate what is actually quite a simple process.
Here you can see that the bit(s) git couldn't work out are delimited by \mintinline{bash}{<<<<<<<} and \mintinline{bash}{>>>>>>>} and separated by \mintinline{bash}{=======}.