JSFeeds: anago.io - Getting geeky with Git #3. The branch is a reference

Monday, 4 May, 2020 UTC

Summary

Branches are the bread and butter of a software developer using a Version Control System (VCS) of any kind.
Today we explore how they work in Git.

In the previous part of this series, we learned that a commit is a full snapshot of the project state. In Git, a branch is a pointer to a particular snapshot. We can think of it as a marker at the tip of a chain of commits.

The above is in contrast to some other Version Control Systems, where creating a branch means creating a copy of our source code. Because a branch in Git is only a pointer, branches are lightweight and effortless to create.

The branch is a pointer

One of the best ways to learn is through examples. Below, we inspect the express-typescript repository, which is a part of the TypeScript Express series.

First, let’s look at the latest commit in the repository using

git log

commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)
Author: Marcin Wanago <[email protected]>
Date: Sun Apr 26 19:11:32 2020 +0200

chore(): update @types/mongoose

The long string identifying the above commit is a hash. In the previous part of this series, we learned that it acts as an identifier generated based on the contents of the commit.

Using the --points-at argument, we can get a list of branches that point to a specific commit.

git branch --points-at 5b5bb249e4990e672a96bbe4800a6e36d9a60962

* master
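To try this without cloning express-typescript, here is a minimal sketch on a throwaway repository. The temporary directory, the identity, and the feature branch are made up for illustration, and the --format option assumes a reasonably recent Git.

```shell
set -eu

# Throwaway repository; path, identity and branch names are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master, whatever the local default is
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"

# Creating a branch only adds another pointer to the current commit.
git branch feature

# Both branches point at the same commit, so both are listed.
tip=$(git rev-parse HEAD)
branches=$(git branch --points-at "$tip" --format='%(refname:short)')
echo "$branches"
```

Since no files are copied when the feature branch is created, both names simply resolve to the same hash.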

It is difficult to find a Git repository without a master branch. Although it is not mandatory to have one, the git init command creates it by default, and it is considered a standard.

Every time we make a commit, the branch pointer moves forward. To see which commit a branch points to, we can use git show-branch:

git show-branch --sha1-name postgres

[4fcd357] test(Authentication): test if there is a token in the registration response

The above output contains a shortened version of the hash. This is not unusual in Git, and other commands can abbreviate hashes too.

A good example is git rev-parse with the --short argument. By default, it produces a hash that is at least seven characters long. If that prefix is not unique, it returns more characters.
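The relation between the full and the abbreviated hash can be sketched on a scratch repository. The repository and identity below are assumptions made for the example, not part of express-typescript.

```shell
set -eu

# Scratch repository with a single empty commit (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"

full=$(git rev-parse HEAD)            # full hash of the current commit
short=$(git rev-parse --short HEAD)   # abbreviated, still unique in this repository
echo "full:  $full"
echo "short: $short"

# The abbreviated hash resolves back to the same commit.
resolved=$(git rev-parse "$short")
echo "resolved: $resolved"
```

The short form is a prefix of the full hash, which is why Git accepts it wherever a commit is expected, as long as it stays unambiguous.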

The HEAD

In the output of the git log command above, we can see a list of references next to the commit hash:

commit 5b5bb249e4990e672a96bbe4800a6e36d9a60962 (HEAD -> master, origin/master, origin/HEAD)

To simplify, HEAD is a pointer to the commit that our repository is currently checked out on. In most cases, it means that HEAD points to the same commit as the branch that we currently use.

If we make a commit, HEAD now points to it.
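We can observe this with a short sketch, again on a throwaway repository whose path and identity are assumptions for the example. It records where HEAD points before and after a commit.

```shell
set -eu

# Throwaway repository (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"
before=$(git rev-parse HEAD)

# A new commit moves the current branch, and HEAD follows it.
git commit -q --allow-empty -m "second commit"
after=$(git rev-parse HEAD)
branch_tip=$(git rev-parse master)
parent=$(git rev-parse HEAD~1)

echo "HEAD before: $before"
echo "HEAD after:  $after"
echo "master:      $branch_tip"
echo "parent:      $parent"
```

After the second commit, HEAD and master resolve to the same new hash, and the previous commit becomes its parent.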

A detached head

Although HEAD usually points to the current branch, it is not always the case.

When we use the git checkout command, we specify which revision of our repository we want to work with. A typical way to use it is with a branch name:

git checkout postgres

The above causes the HEAD to point to the last commit in the postgres branch. However, when using git checkout, we can also provide a hash of a specific commit:

git checkout 3238d85

HEAD is now at 3238d85 feat(Posts): create a relation between the Post and the User

The above puts us in a detached HEAD state. A detached state happens when we check out a specific commit instead of a branch. If we make some changes now and commit them, they don’t belong to any branch!

When we make changes while having a detached HEAD, we can still create a new branch containing the new code. To do so, we can use the git checkout -b new-branch command.
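The whole detached HEAD scenario can be walked through in a sketch like the one below. The repository, identity, and commit messages are made up for the example. It relies on git symbolic-ref -q HEAD, which exits with a non-zero status when HEAD is detached.

```shell
set -eu

# Throwaway repository with two commits (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"
git commit -q --allow-empty -m "second commit"

# Checking out a hash instead of a branch name detaches HEAD.
first=$(git rev-parse HEAD~1)
git checkout -q "$first"

# symbolic-ref fails quietly when HEAD is not attached to a branch.
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi
echo "HEAD is $state"

# This commit is not reachable from any branch yet.
git commit -q --allow-empty -m "work on a detached HEAD"
detached_commit=$(git rev-parse HEAD)

# Creating a branch here gives the commit a home.
git checkout -q -b new-branch
current=$(git symbolic-ref --short HEAD)
echo "now on $current"
```

Once new-branch exists, the commit made in the detached state is reachable again, and HEAD is attached to that branch.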

Git reset

Understanding the HEAD can come in handy. An example is deleting an unpushed commit. Let’s take a closer look at the code provided in this Stack Overflow answer:

git reset --soft HEAD~1

git reset --hard HEAD~1

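The job of git reset is to reset the HEAD to a specified state. The two commands above are alternatives: with --soft, the changes from the removed commit stay staged, while --hard discards them from the index and the working directory as well.

By providing HEAD~1, we point to the parent of the last commit. Resetting to it moves the current branch back by one commit, which removes the last commit that we’ve made from the branch.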

We could also delete more than just one commit, for example, by typing HEAD~4 and removing four commits.
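To see how the two variants differ, we can run them one after another on a scratch repository. The file name, its contents, and the commit messages are assumptions made for this sketch.

```shell
set -eu

# Scratch repository with three commits that each change file.txt (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "two" > file.txt
git commit -q -am "second commit"
echo "three" > file.txt
git commit -q -am "third commit"

# --soft drops the third commit but keeps its changes staged.
git reset -q --soft HEAD~1
staged=$(git diff --cached --name-only)
content_after_soft=$(cat file.txt)

# --hard drops the second commit and also discards staged and working tree changes.
git reset -q --hard HEAD~1
content_after_hard=$(cat file.txt)
remaining=$(git log --format=%s)

echo "staged after --soft:  $staged"
echo "content after --soft: $content_after_soft"
echo "content after --hard: $content_after_hard"
echo "commits left:         $remaining"
```

After the soft reset, file.txt still holds the content of the removed commit. After the hard reset, it is back to the content of the first commit, so --hard should be used with care.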

We can check what the HEAD points to by looking up the HEAD file in the .git directory. If we are using a Unix-like system, we can do this with the cat command:

cat ./.git/HEAD

ref: refs/heads/master

The branch is a type of a reference

In the HEAD file above, we can see a path: refs/heads/master. It leads to a file located in the .git/refs/heads directory that contains the hash of the commit that the master branch points to.

cat ./.git/refs/heads/master

5b5bb249e4990e672a96bbe4800a6e36d9a60962
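Instead of reading the files directly, we can ask Git for the same information. The sketch below compares both approaches on a throwaway repository. It assumes that refs are stored as plain files under .git/refs and have not been packed yet, which is the case right after a commit in a default setup.

```shell
set -eu

# Throwaway repository with one commit (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"

# Reading the files by hand (assumes loose, file-based refs).
head_file=$(cat .git/HEAD)
file_hash=$(cat .git/refs/heads/master)

# Asking Git for the same information.
head_target=$(git symbolic-ref HEAD)             # the ref that HEAD is attached to
branch_hash=$(git rev-parse refs/heads/master)   # the commit that the ref points to

echo "HEAD file:    $head_file"
echo "HEAD target:  $head_target"
echo "hash in file: $file_hash"
echo "hash via git: $branch_hash"
```

The Git commands keep working even when the ref is no longer stored as a separate file, so they are the safer choice in scripts.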

In Git, references point to a specific commit. The .git/refs/heads directory contains all of our local branches.

If you want to know what a local branch is, check out the first part of this series.

ls ./.git/refs/heads/

master postgres

Based on the above, we can determine that a branch is a type of Git reference. Other types of references are tags and remotes.

When we make a commit to the master branch, Git moves the master pointer. Now, it refers to the new commit. To do so, Git has to update the .git/refs/heads/master file.

We can also update refs ourselves using the git update-ref command.
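As a minimal sketch of moving a ref by hand, the example below creates a branch with git update-ref and then points it at another commit. The repository and the experiment branch name are made up for illustration.

```shell
set -eu

# Throwaway repository with two commits (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"
git commit -q --allow-empty -m "second commit"
first=$(git rev-parse HEAD~1)
second=$(git rev-parse HEAD)

# Create a branch by writing a ref that points at the first commit.
git update-ref refs/heads/experiment "$first"
created_at=$(git rev-parse experiment)

# Move it to the second commit. The third argument is the expected old value,
# so the update is refused if the ref has changed in the meantime.
git update-ref refs/heads/experiment "$second" "$first"
moved_to=$(git rev-parse experiment)

echo "experiment created at $created_at"
echo "experiment moved to   $moved_to"
```

This is what higher-level commands such as git commit and git branch do for us, which is why we rarely need update-ref in daily work.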

Packed-refs

As our repository grows significantly, keeping every ref in a separate file might not prove to be very performant. Because of that, Git periodically packs refs into a single file.

By doing so, Git moves all branches and tags into a single packed-refs file. If you ever wonder why your .git/refs directory looks empty, this might have taken place.

We can force the above behavior using the git gc utility. Let’s do so on the express-typescript repository.

git gc
cat ./.git/packed-refs

# pack-refs with: peeled fully-peeled sorted
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/heads/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/heads/postgres
5b5bb249e4990e672a96bbe4800a6e36d9a60962 refs/remotes/origin/master
4fcd357f55f9eee74c492ce687475679c0890e25 refs/remotes/origin/postgres

The more branches we have locally, the more of them end up in the packed-refs file.
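The packing step can also be reproduced in isolation. The sketch below uses git pack-refs --all, the narrower command that git gc runs for this purpose, instead of a full git gc. The repository and branch names are illustrative, and the example assumes the default file-based ref storage.

```shell
set -eu

# Throwaway repository with two branches (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first commit"
git branch postgres
tip=$(git rev-parse postgres)

# Before packing, each branch is a separate file.
ls .git/refs/heads

# Pack all refs into .git/packed-refs and remove the loose files.
git pack-refs --all
cat .git/packed-refs

# The branches still resolve as before, because Git also reads packed-refs.
after=$(git rev-parse postgres)
git for-each-ref --format='%(objectname) %(refname)'
```

Because the loose files are gone after packing, commands such as git for-each-ref or git rev-parse are more reliable than looking into .git/refs by hand.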

Summary

It turns out that branches are references stored in the .git/refs directory. Another important aspect connected to them is the HEAD file. It tells us what is currently checked out. Usually, HEAD refers to the current branch. We call it a detached HEAD if it points directly to a specific commit instead of a branch. Knowing all of the above might help us avoid issues and resolve them if we bump into any.

The post Getting geeky with Git #3. The branch is a reference appeared first on Marcin Wanago Blog - JavaScript, both frontend and backend.
