Up and running with Git
The first time I came across Git was when I discovered the Arch User Repository. I didn’t know much about it at the time, but since I only needed to install community-maintained packages, I quickly got the hang of it.
To install Spotify for example I’d type:
$ cd /usr/local/aur # to get into the parent folder where all my community packages are
$ git clone https://aur.archlinux.org/spotify.git # to download the files
$ cd spotify # to get into the freshly cloned folder
$ makepkg -si # to build and install the package
And when a more recent version of Spotify would be released, I’d update with:
$ cd /usr/local/aur/spotify # to get into the tracked folder
$ git pull # to download updated files
$ makepkg -si # to build and install the package
Pretty straightforward, isn’t it? But then I started writing code, and version control got a lot more…
First, let’s make sure we understand the difference between:
- Git, the version control system that tracks changes in files. You need to install it on your system. Alternatives to Git include Mercurial and Apache Subversion.
- GitHub, the hosting platform. You need to create an account as it is cloud-based. Alternatives to GitHub include GitLab and Bitbucket.
Let’s do this then! Let’s install Git and create an account on GitHub.
Once installed, run these commands to finalise the initial setup:
git config --global user.name "YourUsername"
git config --global user.email YourEmailAddress
git config --global init.defaultBranch main
The init.defaultBranch main line makes Git name the first branch main instead of master, avoiding an unnecessary reference to slavery. This setting requires Git 2.28 or later.
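To check that the setup went through, you can read the values back. Here is a minimal sketch: the throwaway HOME directory and the you@example.com address are assumptions I added so the example does not touch a real configuration.

```shell
set -eu

# Assumption: use a temporary HOME so this sketch never edits your real ~/.gitconfig.
HOME="$(mktemp -d)"
export HOME
unset XDG_CONFIG_HOME

# The three setup commands (placeholder values).
git config --global user.name "YourUsername"
git config --global user.email "you@example.com"
git config --global init.defaultBranch main

# Read back everything stored in the global configuration.
git config --global --list
```

On your own machine you would skip the HOME lines and only run the last command to inspect your settings.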
The username and email are going to be public (if you use GitHub services), but I’d still recommend using the email address you created your GitHub account with, as GitHub uses it to attribute your contributions and populate the contribution graph on your profile page.
(There are options to hide/anonymise your email address in the GitHub settings if you value your privacy.)
Creating a project with Git
Here is the typical workflow when you start a project (All the commands need to be run from your
Initialise the repository (the tracking filesystem) with git init [project_name].
Create a .gitignore file. It lists all the files that won’t be tracked. This is especially useful for Jupyter Notebook checkpoints or if you want to keep some files private, like a to-do list for your project. The easiest way to do this is with the gitignore.io command line alias.
Add the files you want to track with git add [file]. [file] can be replaced with . if all the files in the folder need to be tracked. (By all I mean “all but the ones listed in the .gitignore file”.)
Now that the files are staged, it is time to commit them with git commit -m "Changes made/Comments". Committing captures a snapshot of the files. If you were not using Git, it would be similar to saving the files in a different (local) directory named YYMMDD-project_name. If something goes wrong, you can always reopen that directory and start coding from there.
There you go, 4 steps later and you have a project with version control!
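Put together, the four steps could look like the sketch below. The project name, file names and identity are placeholders I made up, and everything happens in a temporary directory so it is safe to run as is.

```shell
set -eu
cd "$(mktemp -d)"

# Step 1: create the repository.
git init -q my_project
cd my_project
git symbolic-ref HEAD refs/heads/main   # name the first branch main, even on older Git
git config user.name "Demo User"         # placeholder identity, local to this repository
git config user.email "demo@example.com"

# Step 2: list the files that must stay out of version control.
printf 'todo.txt\n.ipynb_checkpoints/\n' > .gitignore

# Some hypothetical project files: one to track, one to keep private.
echo "print('hello')" > main.py
echo "write the README" > todo.txt

# Step 3: stage everything that is not ignored.
git add .

# Step 4: take the snapshot.
git commit -q -m "Initial commit"

# Show what ended up being tracked: todo.txt is not in the list.
git ls-files
```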
Now, if you want to share your wisdom, or collaborate with other people, the easiest way is to link this local repository of yours with the cloud version, so let’s head to GitHub!
Sharing a project on GitHub
Once you have committed the files to the project, you need to let Git know that this project will have a cloud version with git remote add origin [url_of_project].
To get the [url_of_project], you need to leave the terminal and log in to GitHub. Click on the + sign in the top right corner, then New repository. Fill in the form and give your repository a public name. There is no need to initialise it with a README.md or to create a .gitignore (as the project already has files).
Push the files to GitHub with git push -u origin main. Pushing is like committing to the cloud. To recycle the backup directory analogy, after transferring the files to the YYMMDD-project_name directory, pushing would be like uploading the directory to Google/Amazon/One Drive.
That’s it, your many project versions are safe and sound!
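If you want to rehearse this without touching GitHub, here is a small sketch where a local bare repository stands in for the cloud version. That stand-in is my assumption; with GitHub, the path would simply be replaced by your [url_of_project].

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

# Assumption: a local bare repository plays the role of the empty GitHub repository.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# A local project with one commit.
git init -q my_project
cd my_project
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# my_project" > README.md
git add .
git commit -q -m "Initial commit"

# Link the local repository to the remote one, then push.
git remote add origin "$work/remote.git"
git push -q -u origin main

# Check the link and the history stored on the remote side.
git remote -v
git -C "$work/remote.git" log --oneline main
```

The -u flag records origin/main as the upstream of main, which is why a plain git push is enough afterwards.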
Past this initial setup, the workflow is going to be:
- git add . to stage all the changes.
- git commit -m "comment" to commit the changes to your local repository.
- git push to push the commits from the local repository to the cloud.
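As an illustration, here is one round of that everyday loop. It is only a sketch: the first half recreates the initial setup with a local bare repository standing in for GitHub (an assumption), and the file name is made up.

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

# Initial setup, done once (a local bare repository stands in for GitHub).
git init -q --bare remote.git
git init -q my_project
cd my_project
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first draft" > notes.txt
git add .
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin main

# The everyday loop: edit, stage, commit, push.
echo "second draft" >> notes.txt
git add .
git commit -q -m "Extend notes"
git push -q

# Nothing left to commit, and main is level with origin/main.
git status --short --branch
```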
Using version control
There is a lot that can be said here as Git is incredibly powerful and designed to cope with a lot of scenarios, but I’ll try to cover a few common concepts. Before that, let’s have a look at how Git is structured:
Git is designed for collaborative software development: multiple people working on the same project at the same time. To allow the whole team to work without breaking the codebase, developers “branch” the codebase. They code on that branch and once the feature is working, that feature is added to the codebase.
- Create a branch: git branch [branch_name].
- List local and remote branches: git branch -av (git branch alone lists local branches only).
- Start working on a branch: git checkout [branch_name] (git checkout main, or master, to return to the main branch).
- Delete a branch: git branch -d [branch_name].
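Here is how these branch commands could fit together over the life of a small feature. This is a sketch with made-up branch and file names; the git merge step is my addition to show the feature landing on main before the branch is deleted.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > app.txt
git add .
git commit -q -m "Initial commit"

git branch feature          # create the branch
git branch                  # list local branches
git checkout -q feature     # start working on it

echo "new feature" >> app.txt
git commit -q -am "Add feature"

git checkout -q main        # return to the main branch
git merge -q feature        # add the finished feature to main
git branch -d feature       # delete the branch, now that it is merged
```

git branch -d refuses to delete a branch that has not been merged, which is a useful safety net.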
To understand the difference between all these concepts (and more), have a look at this brilliant blog post from Lydia Hallie.
In a nutshell:
- git fetch gets the latest changes without merging. Your working files aren’t updated, but the changes are available locally. To apply them, you then need to git merge. Merging applies the changes to the local files.
- git pull updates your local repository and files from the cloud. It is essentially git fetch followed by git merge FETCH_HEAD. In most cases, pulling is preferred to the fetch + merge approach.
- git pull --rebase is another way to update the local files. Instead of creating a merge commit, it replays your local commits on top of the fetched ones, which keeps the history linear.
- git push, as seen above, uploads your local commits to the cloud.
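To see the difference in practice, here is a sketch with two collaborators. Alice, Bob, the file names and the local bare repository standing in for GitHub are all assumptions made for the example.

```shell
set -eu
work="$(mktemp -d)"
cd "$work"

# Assumption: a local bare repository plays the role of GitHub.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# Alice creates the project and pushes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo "one" > file.txt
git add .
git commit -q -m "First commit"
git remote add origin "$work/remote.git"
git push -q -u origin main

# Bob clones the project.
git clone -q "$work/remote.git" "$work/bob"
git -C "$work/bob" config user.name "Bob"
git -C "$work/bob" config user.email "bob@example.com"

# Alice pushes a second commit.
echo "two" >> file.txt
git commit -q -am "Second commit"
git push -q

# Bob fetches: the commit is downloaded but his file is untouched.
cd "$work/bob"
git fetch -q
lines_after_fetch="$(wc -l < file.txt | tr -d ' ')"
git log --oneline main..origin/main      # the commit waiting to be merged

# Merging applies it to his files.
git merge -q origin/main
lines_after_merge="$(wc -l < file.txt | tr -d ' ')"

# Bob commits locally while Alice pushes again.
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's work"
cd "$work/alice"
echo "three" >> file.txt
git commit -q -am "Third commit"
git push -q

# Pull with rebase: Bob's commit is replayed on top of Alice's, with no merge commit.
cd "$work/bob"
git pull -q --rebase
git log --oneline
```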
Sometimes synchronizing can be a real nightmare, so here are a few useful commands designed to figure that mess out.
- git status tells you which files have been modified, staged or are still untracked.
- git log displays the complete commit history.
- git diff shows changes to files not staged yet.
- git diff --cached shows the changes to staged files.
- git diff [commit1] [commit2] shows the changes between 2 commits.
- git show [commit] shows the changes made in a commit; git show [commit]:[file] shows a file as it was at that commit.
- git blame [file] is useful to see who last changed each line of a file.
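A quick way to get a feel for these commands is to run them on a repository with one unstaged and one staged change. The sketch below sets that up in a temporary directory; the file names are hypothetical.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "line 1" > notes.txt
git add .
git commit -q -m "Add notes"

echo "line 2" >> notes.txt     # modified but not staged
echo "draft" > new.txt
git add new.txt                # staged but not committed

git status --short             # compact overview of both changes
git diff                       # unstaged changes: notes.txt
git diff --cached              # staged changes: new.txt
git log --oneline              # history, one line per commit
git show HEAD                  # what the last commit changed
git blame notes.txt            # who last touched each line
```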
- git reset [commit] takes you back to the desired [commit] and keeps the later changes in your working files (unstaged).
- git reset --hard [commit] resets to the desired [commit] and discards all the changes that happened after that commit.
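The two flavours of reset are easier to tell apart on a toy history. Here is a sketch with three made-up commits: it resets to the first one, first the gentle way and then the hard way.

```shell
set -eu
cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three commits, each overwriting the same file.
for v in v1 v2 v3; do
    echo "$v" > file.txt
    git add file.txt
    git commit -q -m "Save $v"
done
first="$(git rev-parse HEAD~2)"

# Plain reset: the history goes back to the first commit, the file keeps its latest content.
git reset -q "$first"
content_after_reset="$(cat file.txt)"
git status --short             # file.txt shows up as modified

# Hard reset: the file itself goes back to the first commit.
git reset -q --hard "$first"
content_after_hard="$(cat file.txt)"
git log --oneline
```

Since --hard throws away uncommitted work, it is worth running git status first to see what would be lost.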
Working with big files
GitHub blocks files larger than 100 MB, so you need to install Git Large File Storage (Git LFS) if you are working with big files.
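Before pushing, it can help to check whether any file is over the limit. The sketch below is one possible way to do it with find; the file names are made up, and truncate (from GNU coreutils) is only used to create an empty 101 MB stand-in file.

```shell
set -eu
cd "$(mktemp -d)"

# Hypothetical project content: one oversized file, one small file.
truncate -s 101M model.bin     # sparse file, takes no real disk space
echo "small" > notes.txt

# List files bigger than 100 MB, skipping Git's own folder.
big_files="$(find . -type f -size +100M ! -path './.git/*')"
echo "$big_files"
```

Any file listed here is a candidate for Git LFS.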
- Install Git LFS, then run git lfs install once to set it up.
- From the local repository (
git lfs track [big_file] then
git add .gitattributes (to start tracking
git add [big_file].
git commit -m "start tracking big_file".
git push origin main.
- After this, you can use Git as usual; Git LFS will deal with the details.
If you clone a repository with LFS files and Git LFS was not set up beforehand, you will need to run git lfs pull before being able to use the files, as the repository stores each of them as a small text pointer file listing the information necessary for LFS to retrieve the real file.
These few commands should be more than enough to get you started with Git/GitHub, at least for your personal projects.
Collaborating and dealing with merge conflicts is a lot more complex but the good thing about Git is it is a tool designed to create safety nets. You can (almost) always revert to a previous commit, so you can confidently google your way out of merge troubles!