Monday, 4 May, 2020 UTC


Summary

Branches are the bread and butter of a software developer using a Version Control System (VCS) of any kind.
Today we explore how they work in Git.
In the previous part of this series, we learned that a commit is a full snapshot of the project state. In Git, a branch is a pointer to a particular snapshot. We can think of it as a marker at the tip of a line of commits.
This is in contrast to some other Version Control Systems, where creating a branch means copying the source code. Because a Git branch is only a pointer, branches are lightweight and effortless to create.
The branch is a pointer
One of the best ways to learn is to do it with examples. Below, we inspect the express-typescript repository that is a part of the TypeScript Express series.
First, let’s look at the latest commit in the repository using git log:
commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)
Author: Marcin Wanago <[email protected]>
Date: Sun Apr 26 19:11:32 2020 +0200
chore(): update @types/mongoose
This long string identifying the above commit is a hash. In the previous part of this series, we learned that it acts as an identifier generated based on the contents of the commit.
Using the --points-at argument, we can get a list of branches that point to a specific commit.
git branch --points-at 5b5bb249e4990e672a96bbe4800a6e36d9a60962
* master
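To try this without cloning express-typescript, here is a minimal sketch that uses a throwaway repository. The temporary directory, the author identity, and the feature branch name are assumptions made for illustration and do not come from the original repository.

```shell
set -e
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

# Creating a branch only records a new pointer to the current commit.
git branch feature

# Both branches point at the same commit, so both are listed.
head_hash=$(git rev-parse HEAD)
branches=$(git branch --points-at "$head_hash" --format='%(refname:short)')
echo "$branches"
```

No files are copied when the feature branch is created, which is why both names resolve to the same hash.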
It might prove difficult to find a Git repository without the master branch. Although it is not mandatory to have one, the git init command creates it and it is considered a standard.
Every time we make a commit, the branch pointer moves forward. To see which commit a branch points to, we can use git show-branch:
git show-branch --sha1-name postgres
[4fcd357] test(Authentication): test if there is a token in the registration response
The above output contains a shortened version of the hash. This is not unusual for Git, and other commands can do it too.
A good example is git rev-parse with the --short argument. By default, it produces a hash that is at least seven characters long. If that is not unique, it returns more characters.
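The relation between the full and the shortened hash can be checked in a scratch repository. This is a sketch under the assumption of a fresh temporary repository with a placeholder author identity; the hashes it prints will differ from the ones shown in this article.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

# The full hash and its abbreviated form identify the same commit.
full=$(git rev-parse master)
short=$(git rev-parse --short master)
echo "full:  $full"
echo "short: $short"
```

The abbreviated value is always a prefix of the full hash, so either form can be passed to commands that expect a commit.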
The HEAD
Above, in the output of the git log command, we can see a list of branches:
commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)
Simplifying, HEAD is a pointer to the commit that our repository is currently checked out on. In most cases, this means that HEAD points to the same commit as the branch we are currently using.
If we make a commit, HEAD now points to it.
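A small experiment can make this visible. The sketch below assumes a throwaway repository with a placeholder author identity, records where HEAD points, makes one more commit, and compares.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

before=$(git rev-parse HEAD)
git commit -q --allow-empty -m "second commit"
after=$(git rev-parse HEAD)

# HEAD is attached to a branch, and that branch moved together with it.
branch=$(git symbolic-ref --short HEAD)
master_hash=$(git rev-parse master)
echo "HEAD is attached to: $branch"
echo "HEAD moved from $before to $after"
```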

A detached HEAD

Although HEAD usually points to the current branch, this is not always the case.
When we use the git checkout command, we specify which revision of our repository we want to work with. A typical way to use it is with a branch name:
git checkout postgres
The above causes HEAD to point to the last commit in the postgres branch. However, when using git checkout, we can also provide the hash of a specific commit:
git checkout 3238d85
HEAD is now at 3238d85 feat(Posts): create a relation between the Post and the User
The above puts us in a detached HEAD state. It happens when we check out a specific commit instead of a branch. If we make some changes now and commit them, they don’t belong to any branch!
When we make changes with a detached HEAD, we can still create a new branch containing the new code. To do so, we can use the git checkout -b new-branch command.
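The whole detached HEAD scenario can be replayed safely in a scratch repository. This is a sketch that assumes a temporary repository with a placeholder author identity; new-branch is the same illustrative name used in the text.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"
first=$(git rev-parse HEAD)
git commit -q --allow-empty -m "second commit"

# Checking out a hash instead of a branch name detaches HEAD.
git checkout -q "$first"
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "HEAD is $state"

# This commit does not belong to any branch yet.
git commit -q --allow-empty -m "commit made while detached"
orphan=$(git rev-parse HEAD)

# Creating a branch here gives the commit a name to hold on to.
git checkout -q -b new-branch
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"
```

git symbolic-ref -q HEAD exits with a non-zero status when HEAD is not a symbolic reference, which is what the if statement relies on to detect the detached state.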

Git reset

Understanding HEAD can come in handy. An example is deleting an unpushed commit. Let’s take a closer look at the code provided in this StackOverflow answer:
git reset --soft HEAD~1
git reset --hard HEAD~1
The job of git reset is to reset HEAD to a specified state. With --soft, the changes from the removed commit stay staged; with --hard, they are discarded from the working directory too.
By providing HEAD~1, we point to the parent of the last commit. By doing so, we remove the last commit that we’ve made.
We could also delete more than just one commit, for example, by typing HEAD~4 and removing four commits.
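The two variants of the command behave differently with respect to our files, which a scratch repository can show. The sketch below assumes a temporary repository, a placeholder author identity, and a hypothetical notes.txt file.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# --soft drops the commit but leaves its changes staged.
git reset -q --soft HEAD~1
staged=$(git diff --cached --name-only)
echo "still staged after --soft: $staged"

# Commit again, then drop the commit together with its changes.
git commit -q -m "add notes again"
git reset -q --hard HEAD~1
if [ -e notes.txt ]; then file_state=present; else file_state=gone; fi
echo "notes.txt after --hard: $file_state"
```

Since --hard also throws away the working directory changes, it is worth double-checking what HEAD~1 refers to before running it on real work.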
We can check what HEAD points to by looking at the HEAD file in the .git directory. If we are using a Unix-like system, we can do this with the cat command:
cat ./.git/HEAD
ref: refs/heads/master
The branch is a type of a reference
In the HEAD file above, we can see a path: refs/heads/master. It leads to a file located in the .git/refs/heads directory that contains the hash of the commit that the master branch points to.
cat ./.git/refs/heads/master
5b5bb249e4990e672a96bbe4800a6e36d9a60962
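The same two files can be inspected in a fresh repository. This sketch assumes a temporary repository with a placeholder author identity, and it assumes that Git stores refs as loose files, which is the case right after a commit unless a different ref storage backend has been configured.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

# HEAD stores the name of a ref, and the ref file stores a commit hash.
head_file=$(cat .git/HEAD)
ref_file=$(cat .git/refs/heads/master)
echo "$head_file"
echo "$ref_file"

# The hash in the file is the one Git reports for the branch.
resolved=$(git rev-parse master)
```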
In Git, references point to a specific commit. The .git/refs/heads directory contains all of our local branches.
If you want to know what a local branch is, check out the first part of this series.
ls ./.git/refs/heads/
master postgres
Based on the above, we can determine that a branch is a type of Git reference. Other types of references are tags and remotes.
When we make a commit to the master branch, Git moves the master pointer so that it refers to the new commit. To do so, it has to update the .git/refs/heads/master file.
We can also update refs ourselves using the git update-ref command.
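As a minimal sketch of updating a ref by hand, the following creates a branch with git update-ref instead of git branch. The temporary repository, the author identity, and the experiment branch name are assumptions made for illustration.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"

# Writing a ref under refs/heads is all it takes to create a branch.
hash=$(git rev-parse HEAD)
git update-ref refs/heads/experiment "$hash"

# Git now lists and resolves it like any other branch.
listed=$(git branch --list experiment --format='%(refname:short)')
exp_hash=$(git rev-parse experiment)
echo "$listed -> $exp_hash"
```

Going through git update-ref rather than editing files under .git directly lets Git validate the value and keep its own bookkeeping consistent.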

Packed-refs

As our repository grows, storing each ref in a separate file might not prove to be very performant. Because of that, Git periodically packs refs into a single file.
By doing so, Git moves all branches and tags into a single packed-refs file. If you ever wonder why your .git/refs directory looks empty, this might have taken place.
We can trigger the above behavior using the git gc utility. Let’s do so on the express-typescript repository.
git gc
cat ./.git/packed-refs
# pack-refs with: peeled fully-peeled sorted
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/heads/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/heads/postgres
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/remotes/origin/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/remotes/origin/postgres
The more branches we have locally, the more of them end up in the packed-refs file.
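Packing can also be observed in a scratch repository. The sketch below calls git pack-refs --all directly instead of git gc, which is a swapped-in command chosen to show only the ref-packing step. The temporary repository and the author identity are assumptions, and the example assumes the default file-based ref storage.

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "first commit"
git branch postgres

# Pack every ref into the single .git/packed-refs file.
git pack-refs --all
cat .git/packed-refs

# The loose files are normally pruned once their refs are packed.
ls .git/refs/heads

# Branches still resolve as before, now through packed-refs.
master_hash=$(git rev-parse master)
postgres_hash=$(git rev-parse postgres)
```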
Summary
It turns out that branches are references stored in the .git/refs directory. Another important aspect connected to them is the HEAD file. It points us to the commit that is currently checked out, which usually means that HEAD refers to the current branch. We call it a detached HEAD if it points to a specific commit instead of a branch. Knowing all of the above might help us avoid issues and resolve them if we bump into any.
The post Getting geeky with Git #3. The branch is a reference appeared first on Marcin Wanago Blog - JavaScript, both frontend and backend.