Advanced git / git masterclass materials and topic collection
Collected materials and topics from our open house session on advanced git / git masterclass on 14.01.25
October 08, 2025 - Samantha Wittke, Radovan Bast
TL;DR
The January 2025 CodeRefinery Open House on “Git masterclass” brought together educators and practitioners to share and map advanced Git teaching resources. This blog post compiles existing materials (from Carpentries, universities, CodeRefinery, etc.) and organizes them around key themes: recovering from mistakes, changing history, collaborative workflows, merging/rebasing, user experience, and project organization. It also highlights best practices (commit messages, templates, large files, automation) and links to tools, cheatsheets, and talks. The session served as a starting point, and we plan future discussions on potentially creating new lessons to fill gaps and support flexible teaching.
Background
On January 14, 2025, we held an Open House session on "Git masterclass". Educators and practitioners from various institutions and communities came to share resources and discuss topics usually not taught in basic git classes. This blog post provides an overview of the materials and topics discussed. Please refer to https://coderefinery.org/blog/open-house-git-masterclass/ for an overview of the OpenHouse session itself.
List of material we know about
- Advanced Git in Carpentries incubator (pre-alpha)
- Other Carpentries incubator projects on intermediate and advanced Git
- Heidelberg University material for intermediate/expert Git courses
- Collaborative version control with Git and GitHub by e-Science Center
- CodeRefinery:
- Introductory Git
- Collaborative Git
- (+ blogpost about why they don't start with git init anymore: https://coderefinery.org/blog/2024/04/19/git-lesson-rewrite/)
- Git branch design (old material, we haven't touched or used this in many years)
- Git cheatsheet
- Collaborating and sharing using GitHub without command line: this was used a few years ago and is a bit outdated now (since the "Introductory Git" lesson can now be done without command line), but there might be some good things in there
- KIT - Intermediate git, mostly CodeRefinery material, but with some additions
- zedif Jena - Collaborative Version Control with Git: An Advanced Workshop
- Basic GitHub but including pull requests and branching by Newcastle
- Met Office Resources - Custom Software Carpentry Lessons
Other resources on teaching git
A list of other resources that can be helpful when learning about or teaching git:
- Oh Shit, Git!?!
- Scott Chacon's FOSDEM 2024 talk on Git Tips and Tricks.
- "Every git command I use" by Julia Evans
- Interactive git cheatsheet
- cbeams blogpost about writing commit messages
- devhints git branch cheatsheet
- About Python packaging and code organization
Topic collection
During the open house session we collected topics we would find interesting to have as part of an advanced git or git masterclass course. We then started to link existing materials to the topics. The following is a preliminary collection of topics and their links to at least some of the materials. We encourage anyone who knows about more links of topics and materials to contribute to this blogpost by sending a pull request to https://github.com/coderefinery/coderefinery.org/.
Recovering from local mistakes
- Creating a mental model of commits and branches that lets learners understand how different commands move along the tree, and conveying enough of the principles and theory of how git actually works to explain, for example, how to undo things
- https://learngitbranching.js.org/
- Visualizing Git Concepts with D3
- Collaborate on a meaningful, still motivating example that conveys this mental model/understanding
- Recovering from making commits to the wrong branch
- https://coderefinery.github.io/git-intro/recovering/ (materials exist but we typically don't manage to teach this in our "normal" workshop due to time constraints)
- Undoing/partially doing add (command-line or GUI-assisted)
- https://kernelnewbies.org/FirstKernelPatch#CommittingChanges
- Editing a previous commit (log, splitting, reverting)
- Available material which covers multiple of the above:
- https://mmesiti.github.io/git-intermediate/git-states/
- http://www.ndpsoftware.com/git-cheatsheet.html#loc=workspace;
- https://firstaidgit.io/
- Using the reflog to recover from bad resets
- https://ohshitgit.com/#magic-time-machine
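As a minimal sketch of the reflog recovery mentioned above, the following script builds a throwaway repository, discards a commit with a hard reset, and then restores it. The file name notes.txt, the commit messages, and the identity settings are placeholders chosen for illustration and are not taken from any of the linked lessons.

```shell
set -e

# Throwaway repository so nothing real is touched
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Add first note"

echo "second" >> notes.txt
git commit -q -am "Add second note"
lost="$(git rev-parse HEAD)"

# Simulate the mistake: a hard reset drops the second commit from the branch
git reset -q --hard HEAD~1

# The reflog still records where HEAD pointed before the reset
git reflog -n 3

# HEAD@{1} is where HEAD was one move ago, i.e. just before the reset
git reset -q --hard 'HEAD@{1}'
recovered="$(git rev-parse HEAD)"
echo "recovered commit: $recovered"
```

In a real repository the interesting entry is rarely exactly one move back, so reading the reflog output before resetting is the safer habit.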
Changing history
- How to modify an open pull request/merge request - responding to code review with changes (rebase, squash, edit commits)
- Make additional commits so that reviewers can see changes
- CodeRefinery git collaborative lesson on code review
- How to remove something from the history
- Start with git revert and explain that it does not actually remove anything from history. Removing something from history completely can be done using BFG-repo-cleaner.
- Interactive rebase to squash/delete/re-order commits
- Gradually introduce rebase: amending/fixup first, then interactive rebase. Do not overwhelm learners with its full power; this links to creating clearer history while working.
- XKCD comic as motivation to git rebase. This repository implements the git history of that comic and fixes it: https://github.com/ssciwr/git-rebase-xkcd-example (Details in Heidelberg university material)
- Squashing commits into a logical unit using reset
- Creating clearer history while working
- Partial staging and committing
- Pulling with rebase
- Temporary work and stashing
- Small episode in Heidelberg university material
- Atomic commits, using git commit --amend and git commit --fixup, which provide a gentle introduction to git rebase -i (see above)
- History inspection
- Which branches and commits have you worked on? (Analysing/Debugging history - Meta Level)
- Searching through code changes with "pickaxe"
- Finding a commit that introduced a bug with git annotate or bisect
- Example repository with two bugs hidden in history (1 functional, 1 performance): https://github.com/ssciwr/git-bisect-example (Details in Uni Heidelberg material)
Collaborative workflows
- Importance of having a workflow: contributing changes back is key for reproducibility, yet many users fork a project, change it, and then just use their own version. Practicing a workflow also trains skills that transfer to industry and research software engineering.
- https://carpentries-incubator.github.io/advanced-git/07-branching-models/index.html
- https://carpentries-incubator.github.io/gitlab-novice/04-collaboration.html
- (not sure where this goes) https://carpentries-incubator.github.io/collaborative-git-and-github-lesson/aio.html
- Git project guidelines, e.g., no merge request without an issue, only a limited number of people merge to master/main, etc. A collection of the most important guidelines to familiarize participants with different options.
- Forking workflow
- https://carpentries-incubator.github.io/advanced-git/09-forking/index.html
- https://carpentries-incubator.github.io/git-novice-branch-pr/10-pull-requests/
- GitFlow, different variations, with or without a development branch, feature branches, hotfixes
- https://carpentries-incubator.github.io/advanced-git/08-gitflow/index.html
- Blog Post on why 'gitflow' may not be a good fit for your project
- Blog Post on alternatives to Gitflow
- Working with PR/MRs and code review
- Organizing commit history for keeping code reviewers' sanity
- Stacked Branches Workflow
Combining changes
- More advanced merge situations
- Cherry-picking to apply the change from a single commit (as an alternative to merging)
- Comparing rebasing with merging and understanding when to prefer one or the other
- Rebasing across conflicts
- First cover the two ways of updating branches, either git merge or git rebase, and explain why and when you might prefer one over the other. Both can result in conflicts though!
- Then go on to explain how to resolve conflicts.
- Some of this is covered in the Diverging branches chapter
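The two ways of updating a branch can be compared side by side in a scratch repository. The sketch below is only an illustration under simple assumptions: the branch names (feature, feature-merged, feature-rebased) and file names are made up, and the two branches touch different files so that neither operation runs into a conflict.

```shell
set -e

repo="$(mktemp -d)"
cd "$repo"
git init -q
# Name the first branch "main" regardless of the local Git default
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.org"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# A feature branch with one commit of its own
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Meanwhile main moves on
git checkout -q main
echo "main" > main.txt
git add main.txt
git commit -q -m "Main moves on"

# Way 1: merge main into a copy of the feature branch (adds a merge commit)
git checkout -q -b feature-merged feature
git merge -q --no-edit main
merged_merges="$(git rev-list --count --merges HEAD)"

# Way 2: rebase a copy of the feature branch onto main (linear history)
git checkout -q -b feature-rebased feature
git rebase -q main
rebased_merges="$(git rev-list --count --merges HEAD)"

git log --oneline --graph feature-merged feature-rebased
echo "merge commits after merge: $merged_merges, after rebase: $rebased_merges"
```

Both resulting branches contain the same files; what differs is the shape of the history, which is usually the point to discuss with learners before moving on to conflicts.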
User interfaces and experience
- Nice life hacks in .gitconfig:
- includeIf to have custom configuration based on directory path
- using aliases
- conflictstyle = diff3 under [merge] to not only see a conflict but also what the original code was when resolving
- defaultBranch = main under [init]
- pushDefault = origin under [remote]
- Aliases to avoid long complicated commands
- How to keep your email address private if you prefer not to reveal it
- Using GitHub/GitLab command line utilities (PRs from command lines)
- Collaborate between Windows users and *nix users (line ending issue)
- git hooks
- pre-commit: some blog posts on this can be found on Neil's website, and there is a chapter on this in FAIR for research software
- GitHub actions / GitLab CI/CD
- Git shell prompts
- Oh-my-zsh (zsh is the default shell on macOS) provides nice Git prompts via themes.
- how to customize fish shell for Git
- how to use GUI for line-based "git add -p" commands, difftool, mergetool
- difftastic
- https://carpentries-incubator.github.io/advanced-git/16-tools/index.html
- excellent tool for diffing that integrates easily with Git (see git configuration)
- delta - syntax-highlighting pager that can be customized
- meld as difftool/mergetool
- Signing commits can be a pain point, as GPG itself is cumbersome to use.
- Can sign with SSH keys these days.
- How to ignore some files globally for all your repos (these are typically ignore patterns that relate to individual settings rather than ignore paths that would be useful for all)
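Several of the configuration ideas in this section can be tried out without touching your real settings. The sketch below writes them to a temporary file via git config --file; to apply them for real you would use --global instead. The alias name lg, the ~/work/ directory, and all file paths are placeholders we made up for illustration. SSH signing needs a reasonably recent Git (2.34 or newer, as far as we know).

```shell
set -e

# Scratch config file; swap "--file "$cfg"" for "--global" to apply for real
cfg="$(mktemp)"

# Show the common ancestor in conflict markers, in addition to both sides
git config --file "$cfg" merge.conflictStyle diff3
# Branch name used by "git init" for new repositories
git config --file "$cfg" init.defaultBranch main
# Remote that "git push" uses by default
git config --file "$cfg" remote.pushDefault origin
# Hypothetical alias: "git lg" prints a compact history graph
git config --file "$cfg" alias.lg "log --oneline --graph --decorate"
# Load extra settings only for repositories below ~/work/ (placeholder paths)
git config --file "$cfg" 'includeIf.gitdir:~/work/.path' '~/.gitconfig-work'
# Personal ignore patterns that apply to every repository (placeholder path)
git config --file "$cfg" core.excludesFile '~/.gitignore_global'
# Sign with an SSH key instead of GPG (placeholder key path)
git config --file "$cfg" gpg.format ssh
git config --file "$cfg" user.signingKey '~/.ssh/id_ed25519.pub'

# Inspect the resulting file in .gitconfig syntax
cat "$cfg"
```

Printing the file at the end shows the same settings in the section-based .gitconfig syntax used in the list above, which may help learners connect the two notations.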
Organizing projects
Suggestion: this topic belongs on a 'RSE/Data Manager track' of git training, separate from a 'researcher/user track' that focuses more on daily usage, recipes etc - 'history' & 'recovering' sections.
- Nesting repositories with git submodules
- Tracking large files with git-lfs / git-annex / dvc
- Partial cloning for huge repositories
- Making & distributing template repositories
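For the partial cloning topic, a rough local sketch could look as follows. It creates a small source repository containing a stand-in for a large file, allows filtered clones from it, and then clones it without downloading any file contents. Directory and file names are invented for this example, and a real hosting service has to support clone filters on its side for this to work.

```shell
set -e

work="$(mktemp -d)"
src="$work/source"
mkdir "$src"
cd "$src"
git init -q
git config user.name "Example User"
git config user.email "user@example.org"

echo "small text file" > README.txt
# Stand-in for a large file: 1 MiB of zero bytes
head -c 1048576 /dev/zero > big.bin
git add README.txt big.bin
git commit -q -m "Add files"

# Allow clients to request filtered (partial) clones from this repository
git config uploadpack.allowFilter true

cd "$work"
# Partial clone: fetch commits and trees, but no file contents (blobs) yet
git clone -q --no-checkout --filter=blob:none "file://$src" partial

# Objects that are known but not downloaded are listed with a leading "?"
missing="$(git -C partial rev-list --objects --missing=print HEAD | grep -c '^?')"
echo "blobs not yet downloaded: $missing"
```

The --no-checkout flag keeps the example from fetching the blobs right away; a normal checkout would download the contents needed for the current commit on demand.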
Best practices:
- How to write a 'proper' git commit message
- What to do when you have hundreds or thousands of repositories in various states of completion/abandonment, e.g. using GitHub's labels and/or other features
- Naming conventions
These best practices could be served by e.g. blog posts disseminated via community forums, but they probably don't need to be part of a formal course.
Resources:
- For keeping track of multiple repositories/places, DataLad (based on git-annex) could be a solution
- CleanCode as setting a base for the (code in the) repository
Acknowledgement
This collection and overview was developed collaboratively during a CodeRefinery OpenHouse session. Thanks go out to all participants: Radovan Bast, Lukas C. Bossert, Richard Darst, Nishka Dasgupta, Jonathan Hartman, Marc-André Hermanns, Diana Iusan, Dominic Kempf, Christian Knüpfer, Michele Mesiti, Iva Momcheva, Joe Marsh Rossney, Neil Shepard, Jannetta Steyn, Dimitrios Theodorakis.
Outlook
In this Open House session, we looked back: we shared experiences and links to already existing materials. We also made a preliminary identification of topics of interest to the community based on previous experiences.
For the next session, we aim to look into the future: What lessons are missing, where do we put them, and how do we collaborate on something useful for the community to mix and match as needed? If you want to be part of this discussion, please contact support@coderefinery.org.