Git — Overview, Configuration & Usage
Why Use Version Control?
Version Control System (VCS) (aka source control or revision control):
- History (track changes) of files (short & long-term undo)
- Backup and restore of a code base
- Collaborate (share changes among users)
- Synchronization (of distributed repositories)
- Sandboxing (develop in dedicated branches)
Version control also helps to:
- Prevent accidental deletion or loss of files
- Revert changes to files
- Review the history of files
- Compare different versions of files in depth
Version control systems differ in where the repository lives:
- Distributed (e.g. Git, Mercurial)
  - File history in a hidden repository folder inside the working copy
  - Checkouts and commits interact with the local repository folder
  - Different copies of the repository are synchronized by the version control software
  - Repositories are typically distributed across multiple public/private copies
- Centralized (e.g. CVS, Subversion)
  - Dedicated central server stores the files’ history and controls access
  - Local working copy is separate from the “master copy” on the server
  - Working copy only stores the current versions (history in the server repository)
  - Checkouts and commits require a connection to the server
Why Use Git?
- Free and open source distributed version control system (no central server)
- Fast, since most operations are performed locally
- Implicit backup, since multiple copies are stored in distributed locations
- All data is stored with cryptographic checksums (tamper-evident)
What is a…
Repository
A repository (a database of changes) is the data structure that stores files with history and metadata. Repositories include four kinds of objects:
- A blob (binary large object) is the content of a file
- A tree object is the equivalent of a directory (cf. Merkle tree)
- A commit object links tree objects together into a history
- A tag object is a container that contains a reference to another object
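The four object types can be inspected with `git cat-file -t`. A minimal sketch, assuming Git is installed and a scratch directory can be created; the file name, tag name and identity values are placeholders and not part of the original notes:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Add file"
git tag -a v0.1 -m "Annotated tag"          # annotated tags are stored as tag objects
commit_type=$(git cat-file -t HEAD)          # the commit itself
tree_type=$(git cat-file -t 'HEAD^{tree}')   # the directory snapshot of that commit
blob_type=$(git cat-file -t HEAD:file.txt)   # the file content
tag_type=$(git cat-file -t v0.1)             # the annotated tag
echo "$commit_type $tree_type $blob_type $tag_type"
```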
Reference
A reference (`ref`) is a named, mutable pointer to an object (usually a commit)
- Git knows different types of references:
  - `heads` refer to objects locally (branches)
  - `remotes` refer to objects which exist in a remote repository
  - `stash` refers to changes not yet committed
  - `tags` reference another object
- Objects are stored as a Directed Acyclic Graph (DAG)
Referring to objects:
- Use its full SHA-1 commit ID, e.g. `66f67970e73b5ad213d9bc69f7e6497b6bfc1b75`
- Use a truncated commit ID as long as it is unambiguous, e.g. `66f6797`
- You can refer to a branch or tag by name
- Append a `^` to get the (first) parent, `^2` for the second parent, etc.
- Append `:<path>` for a file or directory inside the commit’s tree
- Cf. `git help rev-parse`
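The notations above can be tried with `git rev-parse` and `git show`. A small sketch in a throwaway repository, assuming Git is installed; file name, contents and identity values are made up for illustration:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "one" > f.txt
git add f.txt
git commit -q -m "First"
echo "two" > f.txt
git commit -q -am "Second"
full=$(git rev-parse HEAD)               # full commit ID
short=$(git rev-parse --short HEAD)      # truncated, still unambiguous
resolved=$(git rev-parse "$short")       # truncated ID resolves to the full ID
parent=$(git rev-parse 'HEAD^')          # first parent of HEAD
old_content=$(git show 'HEAD^:f.txt')    # file content inside the parent's tree
echo "$short -> $full, parent $parent, old content: $old_content"
```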
Commit
Commit to add the latest changes to the repository:
- A commit includes…
  - ID of the previous commit(s)
  - Content, commit date & message
  - Author and committer name and email address
- The commit ID (SHA-1 hash) cryptographically certifies the integrity of the entire history of the repository up to that commit
- Commits are immutable (cannot be modified afterwards) …amending the latest commit creates a new commit that replaces it
- Child commits point to 1..N parent commits (typically 1 or 2)
`HEAD` revision is the active commit, the parent of the next commit
Git Ecosystem
Public repository hosters:
- codeberg.org …by Codeberg e.V. …Berlin, Germany …terms of use
- Commercial… (most popular)
  - github.com…
    - …by GitHub, Inc. …subsidiary of Microsoft (since 2018)
    - …free of charge …purchasable optional additional features, services
    - …terms of service
  - gitlab.com
    - …by GitLab Inc.
    - …free of charge …purchasable optional additional features, services
    - …terms of use
- Comparison of source-code-hosting facilities, Wikipedia
Tooling: GitButler, git-annex (see Footnotes)
Configuration
Customize the user configuration:
Path | Description |
---|---|
~/.gitconfig | user configuration file |
~/.gitignore_global | rules for ignoring files in every Git repository (when set as core.excludesFile) |
# configuration documentation
git help config
# dump configuration (Git 2.46 or newer, older versions use `git config --list`)
git config list
# show configuration scope
git config list --show-scope
# configuration local to a repository
git config list --local
User & Mail
Set username and mail address for all repositories (in `~/.gitconfig`):
git config --global user.name "Your Name"
git config --global user.email mail@example.com
Repository specific (in `.git/config`):
# from the working tree
git config user.name "Your Name"
git config user.email mail@example.com
Aliases
# list files
git ls-files -t --exclude-per-directory=.gitignore --exclude-from=.git/info/exclude
# show commits with a list of changed files
git log --pretty=format:"%C(yellow)%h%Cred%d %Creset%s%Cblue (%cn)" --decorate --numstat
# list commit messages one per line
git log --pretty=format:"%C(yellow dim)%h%Creset %C(white dim)%cr%Creset ─ %s %C(blue dim)(%cn)%Creset"
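Long commands like these are typically stored as aliases with `git config alias.<name>`. A minimal sketch, assuming Git is installed; the alias name `lg` is hypothetical, and the alias is written to the repository-local configuration of a scratch repository so that `~/.gitconfig` stays untouched:

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
# "lg" is a made-up alias name; without --global it is stored in .git/config
git config alias.lg "log --decorate --oneline --graph"
alias_value=$(git config --get alias.lg)
echo "git lg expands to: git $alias_value"
```

Adding `--global` would store the alias for all repositories instead.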
Local Repository
Version-controlled directory on your local machine
- …contains all the files, history, and configuration
- …access without any remote connection over a network
- …used for development work …edit files, stage, commit
- …local history allows inspecting previous versions & reverting changes
Path | Description |
---|---|
.git/ | Local Git repository directory |
.git/config | Configuration of the local repository |
init
Create, `init` (initialize) a new repository in `.git/`:
- Create an empty repository in the current working directory
- By default it will have one branch named master (configurable with init.defaultBranch)
# initialize a new repository
git init
Repositories used for clone, push and pull are usually bare repositories:
- …bare repositories have no working tree attached
- …conventional to give bare repositories the extension (suffix) `.git`
- …sync with a bare repository by pushing from another repository
# initialize a new repository without working tree
git init --bare /path/to/my-project.git
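The interplay of a bare repository and a working clone can be sketched as follows, assuming Git is installed; all paths live in a temporary directory and the identity values are placeholders:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
git init -q --bare "$work/project.git"                        # no working tree
git clone -q "$work/project.git" "$work/clone" 2>/dev/null    # hides the "empty repository" warning
cd "$work/clone"
echo "hello" > README
git add README
git commit -q -m "Add README"
git push -q origin HEAD                                        # sync by pushing into the bare repository
is_bare=$(git -C "$work/project.git" rev-parse --is-bare-repository)
commits=$(git -C "$work/project.git" rev-list --count --all)
echo "bare: $is_bare, commits in bare repository: $commits"
```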
Working Tree
Working directory …project folder that stores the `.git` directory
Working tree …files outside of the `.git` directory
- …area with all tracked and untracked files in a directory structure
- …the working tree is modified during development
- …changes to files not tracked by Git can be lost (no history to recover from)
# show state of files in the working tree
git status
Staging area (index) …snapshot of local changes
- …add changes to local files to the index
- …all changes in the index are included in the next commit
File States
File states include the following three:
- Modified - Changed file(s) in the working copy, not committed to the repository
- Staged - Modified file(s) marked in their current version to be committed to the repository
- Committed - Data is safely stored in your local repository
File states belong to one of the following three storage positions:
- The working copy (checkout) contains editable files (a copy of the repository data)
- The staging area (index) holds all marked changes ready to commit
- The Git repository `.git/` stores all files and metadata
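The three states can be observed with `git status --porcelain`, which prints a two-letter code per file (first column index, second column working tree). A sketch in a scratch repository, assuming Git is installed; `notes.txt` and the identity values are placeholders:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "v2" >> notes.txt
modified=$(git status --porcelain)    # modified in the working tree only
git add notes.txt
staged=$(git status --porcelain)      # change recorded in the index
git commit -q -m "Update notes"
committed=$(git status --porcelain)   # nothing left to report
printf 'modified: [%s]\nstaged: [%s]\ncommitted: [%s]\n' "$modified" "$staged" "$committed"
```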
add & rm
Use `add` to start tracking files with Git…
- …monitor for changes before staging
- …note that untracked files are not monitored
# start tracking a file
git add $path
# add all current changes in working tree into the index
git add .
Remove files from repository (working tree & index)
# …remove file from working tree and index
git rm $path
# …remove file from index only (no delete from working tree)
git rm --cached $path
# check which files are deleted when using a glob pattern
git rm --dry-run [a-c]/*
# remove untracked files (add -d to include untracked directories)
git clean -f
commit
The basic workflow:
- Modify a file in the working tree …check with `git status`
- Accept a change to the staging area by adding a file with `git add`
- Perform a `git commit` that permanently stores files in staging to the repository
git commit -m '<message>' # commit files in staging area
git commit -am '<message>' # stage and commit all changes to tracked files
git commit --amend # change last commit
Conventional Commits (see Footnotes) …specification for human & machine readable commit messages
# set the committer (environment) and author (option) name & email for a single commit
GIT_COMMITTER_NAME='<name>' GIT_COMMITTER_EMAIL='<mail>' git commit --author 'name <mail>'
log & grep
`git log` lists commits
- …search & filter commit history
- …output is highly customizable in content and representation
git log -1 ... # show last commit
git log -p # with changes inline
git log --decorate --oneline --graph # prettier graph-like structure
git log --stat # list changed files
Search on the current branch…
# search string in commit messages
git log --grep=$string
# search additions/deletions in commits
git log -S $string
# search the content of all commits for a string and show filename and line number
git grep -n $string $(git rev-list --all)
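The difference between searching commit messages (`--grep`) and searching changed content (`-S`) can be seen in a two-commit scratch repository. A minimal sketch, assuming Git is installed; messages, file content and identity values are invented for the example:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "alpha" > data.txt
git add data.txt
git commit -q -m "Add greeting"
echo "needle" >> data.txt
git commit -q -am "Update text"
by_message=$(git log --format=%s --grep="greeting")   # matches the commit message
by_content=$(git log --format=%s -S "needle")         # matches commits adding/removing the string
echo "message match: $by_message"
echo "content match: $by_content"
```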
.gitignore
Path | Description |
---|---|
.gitignore | List of files to ignore in the working tree |
git status --ignored # list ignored files
git clean -Xn # dry run, display the ignored files that would be removed
git clean -Xf # remove the previously displayed files
git check-ignore -v $file # check if file is ignored
Ignore changes to a tracked file in the working tree
git update-index --skip-worktree $file
- …merge/checkout cannot modify the file
- …Git will not erase or commit your skip-worktree changes
- …stage future changes by disabling with `--no-skip-worktree`
git update-index --no-skip-worktree $file
git add $file
git update-index --skip-worktree $file
checkout
Undo modifications to a file in the working tree…
# ...by reading it back from the index
git checkout -- path/to/file
Recover a deleted file…
# ...find the right commit ...check the history for the deleted file
git log -- path/to/file
# ...work with the last commit that still had the file
git checkout $hash -- path/to/file
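One way to find "the last commit that still had the file" is to look up the commit that deleted it and check the file out from its parent. A sketch under the assumption that the file was removed in a single commit; `keep.txt` and the identity values are placeholders:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "important" > keep.txt
git add keep.txt
git commit -q -m "Add keep.txt"
git rm -q keep.txt
git commit -q -m "Remove keep.txt"
# most recent commit that touched the path, here the deletion
deleting_commit=$(git rev-list -n 1 HEAD -- keep.txt)
# its first parent is the last commit that still had the file
git checkout -q "$deleting_commit^" -- keep.txt
restored=$(cat keep.txt)
echo "restored content: $restored"
```

The restored file is placed in the working tree and staged in the index, ready to be committed again.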
show & diff
`git show $commit` shows the message and changes of a commit (a combined diff for a merge commit)
`git diff $commit^ $commit` shows the difference between a (merge) commit’s first parent and the commit itself
git diff # show difference between working tree and index
git diff --cached # show difference between HEAD and index (staged changes)
git diff $commit # show difference between commit and the working tree
reset
Unstage changes to a file in the index (without touching the working tree)…
git reset path/to/file
git reset HEAD # discard staging area (all changes)
git reset --hard # discard non-committed changes
git reset --hard $hash # discard until specified commit
branch
Commits made on branch currently “checked out”…
- …`git status` shows checked out branch
- …files in `.git/refs/heads` (local), `.git/refs/remotes` (remote)
List branches…
# list branches in repository (* marks the current branch)
git branch
# list available remote branches
git branch -r
# list available local and remote branches
git branch -a
Create and check out a branch…
# create a new branch
git branch $name
# create new branch at commit (defaults to HEAD), and switch to it
git checkout -b $branch [$commit]
# switch to branch (update HEAD, index, and working tree)
git checkout $branch
# checkout remote branch
git checkout -b $branch $remote/$branch
Delete a local branch…
- …`-d` …only deletes the branch if fully merged in its upstream branch
- …`-D` …deletes the branch irrespective of its merged status
# delete branch
git branch -d $branch
# delete local copy of a remote branch
git branch -dr $remote/$branch
Delete remote branch…
git push $remote -d $branch # delete remote branch
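The safety check of `-d` can be demonstrated in a scratch repository. A minimal sketch, assuming Git is installed; the branch name `topic` and the identity values are placeholders. Since `topic` has no upstream here, `-d` checks against the current branch instead:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
git checkout -q -b topic
echo "work" >> file.txt
git commit -q -am "Topic work"
git checkout -q -                      # back to the initial branch
# -d refuses because the commit on topic is not merged
if git branch -d topic 2>/dev/null; then safe_delete=deleted; else safe_delete=refused; fi
git branch -D topic >/dev/null         # -D deletes regardless of merge status
remaining=$(git branch --list topic)
echo "git branch -d: $safe_delete, branches named topic left: [$remaining]"
```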
tag
# list tags of remote repository
git ls-remote --tags $remote
# fetch remote tags
git fetch --tags
# create new tag to commit (defaults to HEAD)
git tag $name [$commit]
# delete tag
git tag -d $name
# list local tags
git tag -l
# list tags matching a shell glob pattern
git tag -l $pattern
# list local tags with commit message
git tag -n1 -l
# create new annotated tag
git tag -a $name -m $message
# tag specific commit
git tag -a $name $commit
# push local tag to remote repository
git push $remote $tag
# push all local tags to remote repository
git push --tags $remote
# delete all local tags
git tag -l | xargs git tag -d
Signing
Use an SSH key to sign commits…
- …make sure to provide an SSH key for signing on the hosting service
- …GitHub requires a dedicated signing key (…alongside the authentication key …even if they are the same)
git config --global gpg.format ssh
# $ssh_key …path to the public key file
git config --global user.signingkey $ssh_key
echo "$(git config --get user.email) namespaces=\"git\" $(cat $ssh_key)" >> ~/.ssh/allowed_signers
git config --global gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
# sign commits & tags
git commit -S #…
git tag -s #…
# automatically sign commits & tags in a repository
git config commit.gpgsign true
git config tag.gpgsign true
git config list --local | grep sign
# alias to show a commit including its signature
git config --global alias.ss 'show --show-signature'
merge
Merge — Combine code changes from different branches
- Default merge behaviour is to perform a fast-forward when possible
- Commits without conflicts are simply absorbed into the branch
- Diverged branches require a merge commit (conflicts are resolved in it)
- Disable fast-forward with `--no-ff` to force every merge to produce a merge commit
# merge into current HEAD
git merge $branch
# merge without fast-forward and without committing (modifies index & working copy only)
git merge --no-commit --no-ff $branch
# examine the staged changes
git diff --cached
# undo the merge
git merge --abort
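The effect of `--no-ff` shows in the number of parents of the resulting commit. A sketch in a scratch repository, assuming Git is installed; branch names `feature1`/`feature2` and the identity values are placeholders:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
base=$(git symbolic-ref --short HEAD)    # name of the initial branch
echo "a" > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b feature1
echo "b" > b.txt; git add b.txt; git commit -q -m "B"
git checkout -q "$base"
git merge -q feature1                    # fast-forward, no merge commit
# rev-list --parents prints the commit followed by its parents
ff_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
git checkout -q -b feature2
echo "c" > c.txt; git add c.txt; git commit -q -m "C"
git checkout -q "$base"
git merge -q --no-ff -m "Merge feature2" feature2   # forced merge commit
noff_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents after fast-forward: $ff_parents, after --no-ff: $noff_parents"
```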
rebase
Do not rebase already pushed commits!
Rebase applies commits from the current branch onto the head of the specified branch
- “replaying” changes with new commits (hashes/timestamps)
- Merge resolution is absorbed into the new commit
# switch to the branch to be rebased ($topic)
git checkout $topic
# rebase the current branch onto $branch
git rebase $branch
Interactive rebase …with option `-i`
- …edit previous commits …including commit message
- …split single commit into multiple commits
- …squash multiple commits into a single commit
- …delete & reorder commits
# re-apply the last 3 commits
git rebase --interactive HEAD~3
# re-apply all commits following $commit
git rebase --interactive $commit
- Editor opens with a commit list …select a command per commit:
  - `pick` …change nothing
  - `edit` …modify commit
  - `squash` …combine multiple commits
  - …re-order commits (may require resolving conflicts)
- Split a commit by selecting it with `edit`…
  - …`git reset HEAD~1` to remove the currently edited commit
  - …changes stay in the working tree …insert individual commits
# …after edit to a commit …continue rebase
git rebase --continue
# rewind in case of a problem
git rebase --abort
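That a (non-interactive) rebase replays changes as new commits can be checked by comparing hashes before and after. A minimal sketch, assuming Git is installed and that the two branches touch different files so no conflict occurs; branch and file names and the identity values are placeholders:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
base=$(git symbolic-ref --short HEAD)    # name of the initial branch
echo "a" > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b topic
echo "t" > t.txt; git add t.txt; git commit -q -m "T"
old_tip=$(git rev-parse HEAD)
git checkout -q "$base"
echo "b" > b.txt; git add b.txt; git commit -q -m "B"
git checkout -q topic
git rebase -q "$base"                    # replay T on top of B
new_tip=$(git rev-parse HEAD)
new_parent=$(git rev-parse 'HEAD^')
base_tip=$(git rev-parse "$base")
echo "old tip $old_tip, new tip $new_tip, new parent $new_parent"
```

The changed hash is the reason already pushed commits should not be rebased: other clones still refer to the old commit.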
Remote Repository
Remote repositories are hosted on server infrastructure…
- …typically one of the cloud providers like GitHub, GitLab, Bitbucket
- …access requires a network connection …push & pull changes
- …used for collaboration with other developers/customers
clone
Git allows bidirectional synchronisation between any number of repositories:
- A Git repository can be configured with references to any number of remotes
- Supports many protocols: SSH, HTTPS, DAV, Git protocol, Rsync, and a path to a local repository
- Allows centralized and/or distributed development models
Copy, `clone` a repository from another location:
# clone a remote repository and create a working copy, optionally provide the target directory
git clone <url> [<path>]
The following syntax references remote URLs and local paths:
# remote
ssh://[user@]host.xz[:port]/path/to/repo.git/
git://host.xz[:port]/path/to/repo.git/
http[s]://host.xz[:port]/path/to/repo.git/
[user@]host.xz:/~[user]/path/to/repo.git/
# local
file:///path/to/repo.git/
/path/to/repo.git/
# clone a remote repository and checkout a specific branch
git clone -b <branch> <url> [<path>]
remote
Remote repositories are configured in `.git/config` (cf. `git help config`):
- Freshly cloned repositories have…
  - One reference to the `origin` remote repository (default source to pull/push)
  - An automatically created master branch that tracks `origin/master`
- Checkout of a local branch from a remote branch automatically creates a tracking branch
Modify references to other repositories:
git remote -v # list references to remote repos (including URLs)
git remote add <remote_name> <remote_url> # add a reference to a remote repository
git remote show <remote_name> # inspect a remote repository
git remote rename <old_name> <new_name> # rename a reference to a remote repository
git remote rm <remote_name> # delete a reference to a remote repository
fetch
Retrieve updates from a remote repository
- …without merging them into a local branch
- …downloads the latest changes (commits, branches, tags)
- …updates your local copy of the remote branches
Does not modify current working tree! (or branch)
git fetch
# only from a specific remote
git fetch $remote
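That `fetch` only moves the remote-tracking branch can be verified with two clones of a local bare repository. A sketch, assuming Git is installed; the directory names `alice`, `bob`, `origin.git` and the identity values are made up for the example:

```shell
set -eu
# placeholder identity so commits work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
work=$(mktemp -d)
git init -q "$work/alice"
cd "$work/alice"
echo "one" > f.txt; git add f.txt; git commit -q -m "one"
git clone -q --bare "$work/alice" "$work/origin.git"   # shared bare repository
git clone -q "$work/origin.git" "$work/bob"            # second working copy
git remote add origin "$work/origin.git"
echo "two" >> f.txt; git commit -q -am "two"
git push -q origin HEAD                                 # publish a new commit
cd "$work/bob"
before=$(git rev-parse HEAD)
git fetch -q origin                                     # download, do not merge
after=$(git rev-parse HEAD)
behind=$(git rev-list --count 'HEAD..@{upstream}')      # commits only on the remote-tracking branch
echo "local branch unchanged: $before = $after, behind upstream by $behind"
```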
pull
Synchronize changes from a remote repository …short for `fetch` and `merge`
git pull $remote $branch
# fetch from all remotes, then merge the upstream into the current branch
git pull --all
Preferably do not use plain `git pull`, instead…
# …short for `fetch` and `rebase`
git pull --rebase
# …in case of a merge conflict
git rebase --abort
# …use regular git pull and resolve merge conflicts
# set an alias for pull with rebase
git config --global alias.pr "pull --rebase"
push
- `push [<remote_name>] [<branch_name>]` copies local changes to a remote repository
- `push -u` track remote with current branch
stash
Local changes will not be overwritten by `git pull`…
- …`stash` stores a snapshot of your changes without committing
- Separated from the working directory, the staging area, or the repository
Basic workflow example:
git stash # stash the changes in working tree
git pull # pull commits from remote
git stash pop # apply changes on the current working tree
- `stash save [<message>]` saves changes and reverts the working directory (deprecated in favour of `stash push -m <message>`)
- `stash list` prints all saves…
  - `stash@{0}` number in the curly braces `{}` is the index
- `stash show -p <index>` shows files changed with a diff-style patch
- `stash apply <index>` applies the changes and leaves a copy in the stash
- `stash pop [<index>]` applies the changes and removes them from the stash
- `stash drop <index>` removes stashed changes without applying
- `stash clear` clears the entire stash
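The stash round trip can be followed with `git status --porcelain`. A minimal sketch in a scratch repository, assuming Git is installed; the file name and identity values are placeholders:

```shell
set -eu
# placeholder identity so commits and stashes work without a user configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "base" > f.txt
git add f.txt
git commit -q -m "Base"
echo "local edit" >> f.txt
git stash -q                                    # save the change, clean the working tree
clean=$(git status --porcelain)
stashes=$(( $(git stash list | wc -l) ))        # number of stash entries
git stash pop -q                                # re-apply and drop the entry
restored=$(git status --porcelain)
left=$(( $(git stash list | wc -l) ))
printf 'after stash: [%s] (%s entry)\nafter pop: [%s] (%s entries)\n' "$clean" "$stashes" "$restored" "$left"
```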
Multiple Repositories
git-repos helps to solve the following three use-cases:
- Maintains a list of Git remote repositories associated to a local directory tree.
- Indicate the local status for a list of repositories.
- Indicate the state of remotes for a list of local repositories.
>>> git repos status -v
Reading configuration from ~/.gitrepos
Git in ~/projects/dummy
?? path/to/new/file
Git in ~/projects/scripts
↑1 backup/master
↑3 github/master
M git-repos
Git in ~/projects/site
AM posts/git_repos.markdown
- `init` creates missing directories defined in the repository list
- `status` runs `git status -s`
  - …on all repositories and prints the output
  - …checks if the local repositories are ahead of their remotes with `git rev-list`
Repository list in `$PWD/.gitrepos`, `~/.gitrepos` or option `--config PATH` …with following format…
/path/to/the/repository
origin git://host.org/project.git
backup ssh://user@host.org/project.git
/path/to/another/repository
name ~/existing/repo
~/path/to/yet/another/repo
foobar ssh://user@host.org/foobar.git
relative/path/to/repository
deploy git://fqdn.com/name.git
- …each directory is followed by a list of remotes using the notation of
git remote add
- …first the name of the remote, second the URI to the remote repository.
Advanced Operations
Clean History
Create a new orphan branch…
- …first commit made on this new branch will have no parents
- …it will be the root of a new history
git checkout --orphan orphan
git add -A
git commit -am "Commit history removed"
Delete the original master branch and rename the orphaned branch:
git branch -D master
git branch -m master
Update the remote repository …option `-f` required…
git push -f origin master
Network Proxies
…protocols for client-server communication…
Protocol | Example Connection Address |
---|---|
https | https://example.com/repository.git |
ssh | git@example.com:repository.git |
git | git://example.com/repository.git |
…each protocol is proxied with a different method…
Proxy the HTTP protocol…
- …set environment variables `HTTPS_PROXY` and `HTTP_PROXY`
- …to use an available HTTP proxy server
Proxy an SSH connection with…
- …a custom `GIT_SSH_COMMAND`
- …using the SSH option `-J` to configure a jump host
GIT_SSH_COMMAND="ssh -J PROXY_FQDN" git ...
Proxy the `git://` protocol…
- …over an SSH connection…
- …using `netcat` on proxy node
# create a helper script with the proxy command
cat > gitproxy <<'EOF'
#!/bin/bash
exec ssh node.example.org nc "$@"
EOF
# make sure that it is executable
chmod +x gitproxy
# set an environment variable to use a proxy command
export GIT_PROXY_COMMAND=$PWD/gitproxy
Footnotes
1. GitButler https://docs.gitbutler.com https://github.com/gitbutlerapp/gitbutler
2. git-annex https://git-annex.branchable.com
3. Conventional Commits https://www.conventionalcommits.org