Clean Commits & Version-Control Hygiene — Practice Tasks
12 hands-on git exercises, easy to hard. Every task: a messy scenario, an instruction, and a collapsible solution with the exact commands, the cleaned result, and the reasoning. These drills assume git ≥ 2.30. Run them in a throwaway repo (`git init scratch && cd scratch`); most are destructive on purpose.
Table of Contents
- Task 1 — Rewrite a "WIP" message into a proper commit (Easy)
- Task 2 — Turn "fix stuff" into Conventional Commits (Easy)
- Task 3 — Write a .gitignore for Go / Java / Python (Easy)
- Task 4 — Trust version control: delete commented-out code (Easy)
- Task 5 — Squash a noisy local branch with autosquash (Medium)
- Task 6 — Recover lost work with reflog (Medium)
- Task 7 — Undo a pushed mistake with revert, not reset (Medium)
- Task 8 — Split a kitchen-sink commit into atomic commits (Hard)
- Task 9 — Find a regression with git bisect (Hard)
- Task 10 — Remove a secret committed by accident (Hard)
- Task 11 — Kill merge noise: rebase a long-lived branch onto main (Hard)
- Task 12 — Convert a long-lived branch workflow to trunk-based (Hard)
- Related Topics
How to Use
Work top to bottom — difficulty rises and later tasks lean on muscle memory from earlier ones. For each task:
- Read the scenario and try it first in a scratch repo. Reproduce the mess before you fix it; rewriting history you cannot recreate teaches nothing.
- Decide on a plan before you type a destructive command. Every history rewrite is a small surgery: know your escape hatch (`git reflog`, `ORIG_HEAD`, a backup branch) before you cut.
- Only then open the solution and compare. Matching commands matter less than matching judgment: did you rewrite shared history, or only local? Did you preserve the why?
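A decision rule that governs almost every task here: rewrite private (unpushed) history freely; never rewrite shared history. On anything already pushed, add a new commit (`revert`) instead of rewriting (`reset`).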
Task 1 — Rewrite a "WIP" message into a proper commit (Easy)
Scenario: You are about to open a pull request. Your last local commit carries nothing but a "WIP" message.
The diff adds retry logic to the payment HTTP client. The message tells a reviewer — and your future self — nothing.
Instruction: Rewrite the message of the most recent commit into a proper subject + body that captures what changed and, more importantly, why. The commit is local and unpushed.
Solution
Because the commit is local and unpushed, amending is safe: run `git commit --amend`. Your editor opens. A "WIP" message becomes:
Add exponential backoff to payment HTTP client

The payment gateway intermittently returns 503 under load, and a single
failed request was surfacing as a hard checkout error to the user.

Retry idempotent POSTs up to 3 times with backoff (200ms, 400ms, 800ms)
plus jitter, so transient gateway hiccups self-heal instead of failing
the purchase. Non-idempotent paths are unchanged.

Refs: PAY-1421
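The amend step can be tried end to end in a throwaway repo. This is a minimal sketch, not the only way to do it: the file name `client.py` and the committer identity are placeholders, and `-m` is passed twice so the script runs without an editor (interactively you would just run `git commit --amend`).

```shell
# Throwaway repo; file name and identity are placeholders for the demo.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

echo "retry logic" > client.py
git add client.py
git commit -q -m "WIP"

# Rewrite only the message of the latest commit. Each -m becomes one
# paragraph: the first is the subject, the second is the body.
git commit -q --amend \
  -m "Add exponential backoff to payment HTTP client" \
  -m "The payment gateway intermittently returns 503 under load. Refs: PAY-1421"

subject=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$subject ($commit_count commit on the branch)"
```

The commit count stays at one: amending replaces the tip commit instead of stacking a second one on top, which is exactly why it is reserved for unpushed work.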
Task 2 — Turn "fix stuff" into Conventional Commits (Easy)
Scenario: A teammate's branch has three commits, all titled variations of the same non-message:
Your team has adopted Conventional Commits to drive automated changelogs and semantic-version bumps. These three commits would produce an empty, useless changelog.
Instruction: Given what each commit actually did, rewrite all three messages in Conventional Commits format. The commits are on an unpushed local branch.
- `b2c3d4e` corrects a null-pointer crash when a cart is empty.
- `c3d4e5f` adds a new `/health` endpoint.
- `e4f5a6b` changes the default page size from 20 to 50, a behavior change API consumers must be told about.
Solution
Rebase interactively (for example `git rebase -i HEAD~3`, since the branch holds exactly these three commits) and `reword` each commit. In the todo list, change `pick` to `reword` on all three. Git opens an editor for each, oldest first. The rewritten messages:
fix(cart): prevent NPE when checking out an empty cart

CartService.total() dereferenced a null line-item list when the cart
had never had an item added. Guard with an early return of zero.

feat(ops): add /health readiness endpoint

Returns 200 with DB and cache connectivity so the load balancer can
drop unhealthy instances instead of routing traffic into them.
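To see how these messages drive version bumps, here is a small sketch of the mapping the Conventional Commits spec describes (`fix` → patch, `feat` → minor, `!` or a `BREAKING CHANGE:` footer → major). The function name `bump_for` is hypothetical, and the third subject line (including its `api` scope) is an illustrative wording for the page-size change, not the author's original message. Real release tools may be configured differently.

```shell
# Map a Conventional Commits message to the SemVer bump it implies.
# $1 is the full message (subject, optionally followed by body and footers).
bump_for() {
  msg=$1
  subject=$(printf '%s\n' "$msg" | head -n 1)
  # A "!" before the colon, or a BREAKING CHANGE footer, marks a breaking change.
  if printf '%s\n' "$subject" | grep -Eq '^[a-z]+(\([^)]*\))?!: ' ||
     printf '%s\n' "$msg" | grep -Eq '^BREAKING[ -]CHANGE: '; then
    echo major
  elif printf '%s\n' "$subject" | grep -Eq '^feat(\([^)]*\))?: '; then
    echo minor
  elif printf '%s\n' "$subject" | grep -Eq '^fix(\([^)]*\))?: '; then
    echo patch
  else
    echo none
  fi
}

b1=$(bump_for "fix(cart): prevent NPE when checking out an empty cart")
b2=$(bump_for "feat(ops): add /health readiness endpoint")
# Hypothetical wording for the third commit (the page-size change):
b3=$(bump_for "feat(api)!: raise default page size from 20 to 50")
echo "$b1 $b2 $b3"
```

The third commit is the one that matters most: without the `!` (or a `BREAKING CHANGE:` footer) it would be released as an ordinary minor change and API consumers would never be warned.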
Task 3 — Write a .gitignore for Go / Java / Python (Easy)
Scenario: A polyglot service repo (a Go binary, a Java module, Python tooling) has no .gitignore. git status is drowning in build artifacts, IDE files, and a committed .env:
$ git status --short
?? bin/server
?? target/payment-1.0.jar
?? __pycache__/
?? .idea/
?? .env
?? coverage.out
Instruction: Write a .gitignore that excludes build output, dependency caches, IDE metadata, and secrets for all three ecosystems. Then handle the file that is already tracked but should not be.
Solution
Create `.gitignore`:
# --- Secrets & local config (never commit) ---
.env
.env.*
*.local
# --- Go ---
bin/
*.exe
*.test
# coverage.out, cpu.out
*.out
# if you don't vendor; remove the next line if you do
vendor/
# --- Java / Maven / Gradle ---
target/
build/
*.jar
*.war
.gradle/
# --- Python ---
__pycache__/
*.py[cod]
.venv/
venv/
*.egg-info/
.pytest_cache/
.mypy_cache/
# --- IDE / editor ---
.idea/
.vscode/
*.iml
.DS_Store
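Ignoring a path does not untrack a file that is already committed. The sketch below reproduces that situation in a scratch repo and removes `.env` from the index with `git rm --cached`, leaving the file on disk. The `.env` content and identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

# Reproduce the mess: .env was committed before any .gitignore existed.
echo "API_KEY=placeholder" > .env
git add .env
git commit -q -m "chore: initial commit"

# Adding the ignore rule alone changes nothing for a tracked file;
# the index entry has to be removed explicitly.
printf '.env\n.env.*\n' > .gitignore
git rm -q --cached .env      # drops it from the index, keeps it on disk
git add .gitignore
git commit -q -m "chore: add .gitignore and stop tracking .env"

tracked=$(git ls-files .env)            # empty once the file is untracked
ignored_by=$(git check-ignore -v .env)  # names the rule that matches
echo "tracked: [$tracked]  rule: $ignored_by"
```

The earlier commit still contains the file, so anything secret in it has to be treated as leaked (Task 10 covers that case).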
Task 4 — Trust version control: delete commented-out code (Easy)
Scenario: A pull request you are reviewing carries this:
def price_with_tax(subtotal: Decimal, rate: Decimal) -> Decimal:
    # Old flat-rate logic, keeping just in case:
    # tax = subtotal * Decimal("0.07")
    # return subtotal + tax
    tax = subtotal * rate
    return (subtotal + tax).quantize(Decimal("0.01"))
The author left the old implementation commented out "in case we need to roll back."
Instruction: Explain why this belongs in version control, not in the source. Show the diff you would request and the command that makes the "just in case" argument moot.
Solution
Request the dead comment be deleted:
def price_with_tax(subtotal: Decimal, rate: Decimal) -> Decimal:
    tax = subtotal * rate
    return (subtotal + tax).quantize(Decimal("0.01"))
The old logic is not lost: version control already holds it, which makes the "just in case" argument moot.
# Find the commits that added or removed the old flat-rate line:
git log -S '0.07' --oneline -- pricing.py
# a9f0e1d feat(pricing): make tax rate configurable
# See exactly what that commit changed:
git show a9f0e1d -- pricing.py
# Recover the old version of the whole file if you ever truly need it:
git show a9f0e1d~1:pricing.py
Task 5 — Squash a noisy local branch with autosquash (Medium)
Scenario: Your feature branch feat/login-throttle has the shape every honest branch has mid-development — a clean first commit followed by a trail of corrections:
$ git log --oneline main..HEAD
d5e6f7a fix typo in error message
c4d5e6f address review comment: lower the limit
b3c4d5e oops forgot the test
a2b3c4d feat(auth): throttle failed login attempts
The bottom commit is the real change; the top three are fixups to it. You want to push one clean commit.
Instruction: Collapse the three fixup commits into the base commit, keeping the base commit's message, without hand-editing a rebase todo list. Use the fixup/autosquash workflow.
Solution
**The disciplined way:** mark fixups *as you make them*. When you fix the test, instead of writing a fresh commit, attach it to the commit it belongs to:
git add login_test.py
git commit --fixup a2b3c4d # creates "fixup! feat(auth): throttle failed login attempts"
When the branch is ready, `git rebase -i --autosquash main` folds every `fixup!` commit into its target.
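The whole fixup/autosquash loop can be rehearsed non-interactively. In this sketch the file contents and identity are placeholders, and `GIT_SEQUENCE_EDITOR=true` is used only so the script accepts the generated todo list without opening an editor; at a terminal you would review the list yourself.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"
git commit -q --allow-empty -m "chore: initial commit"

git switch -q -c feat/login-throttle
echo "limit = 10" > throttle.py
git add throttle.py
git commit -q -m "feat(auth): throttle failed login attempts"
base=$(git rev-parse HEAD)

# Each correction is recorded as a fixup of the commit it belongs to.
echo "assert limit" > login_test.py
git add login_test.py
git commit -q --fixup "$base"
echo "limit = 5" > throttle.py
git commit -q -a --fixup "$base"

# --autosquash moves each fixup! commit under its target and marks it "fixup".
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

commits=$(git rev-list --count main..HEAD)
subject=$(git log -1 --format=%s)
echo "$commits commit: $subject"
```

The base commit's message survives untouched because `fixup` discards the message of the commit being folded in.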
Task 6 — Recover lost work with reflog (Medium)
Scenario: You just ran a hard reset to "clean up," then realized you nuked two commits of real work:
$ git reset --hard HEAD~2
HEAD is now at a2b3c4d feat(auth): throttle failed login attempts
$ git log --oneline -1
a2b3c4d feat(auth): throttle failed login attempts
# ...the two commits on top are gone from the log. Panic.
There is no remote copy. git log shows no trace of them.
Instruction: Recover the two lost commits. Explain why they were never actually lost.
Solution
The reflog records every move `HEAD` has made, including the ones that vanish from `git log`:
$ git reflog
a2b3c4d HEAD@{0}: reset: moving to HEAD~2
9f8e7d6 HEAD@{1}: commit: feat(auth): add account lockout after N failures
6c5b4a3 HEAD@{2}: commit: feat(auth): record failed-attempt timestamps
a2b3c4d HEAD@{3}: commit: feat(auth): throttle failed login attempts
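In the listing above the lost tip is `9f8e7d6`, at `HEAD@{1}`. A hard reset only moves the branch pointer; the commits stay in the object database and stay reachable through the reflog until it expires. A minimal sketch of the recovery, using empty commits as stand-ins for the real work:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

# Empty commits stand in for the three commits from the scenario.
for msg in "feat(auth): throttle failed login attempts" \
           "feat(auth): record failed-attempt timestamps" \
           "feat(auth): add account lockout after N failures"; do
  git commit -q --allow-empty -m "$msg"
done
lost_tip=$(git rev-parse HEAD)

git reset -q --hard HEAD~2     # the "cleanup" that hides two commits
git reflog -3                  # the old tip is still listed as HEAD@{1}

# Move the branch back to where HEAD pointed before the reset.
git reset -q --hard "HEAD@{1}"
recovered_tip=$(git rev-parse HEAD)
```

Immediately after the reset, `git reset --hard ORIG_HEAD` would do the same job; the reflog is the more general tool because it still works several operations later.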
Task 7 — Undo a pushed mistake with revert, not reset (Medium)
Scenario: You pushed a commit to the shared main that breaks production — it ships a config change pointing at the wrong database:
$ git log --oneline -2 origin/main
b4d2f1a chore(config): point at new analytics DB <-- bad, already pushed & deployed
c3e1a0b feat(reports): add weekly export
Three teammates have already pulled main.
Instruction: Get production back to safety without rewriting shared history. Then contrast with the wrong fix.
Solution
Create a *new* commit that is the inverse of the bad one with `git revert b4d2f1a`, then push it forward with an ordinary `git push origin main` (no force needed). Git opens an editor with a pre-filled message; refine it:
Revert "chore(config): point at new analytics DB"

This points production at an unprovisioned DB, causing connection
failures on every report query. Reverting to restore service; will
re-land once the new DB is migrated and load-tested.

This reverts commit b4d2f1a.
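The contrast with `reset` is easiest to see by checking what history looks like afterwards. In this sketch `config.ini` and its contents are placeholders; `--no-edit` keeps the default message so the script needs no editor.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

echo "db=reports" > config.ini
git add config.ini
git commit -q -m "feat(reports): add weekly export"

echo "db=analytics-new" > config.ini
git commit -q -a -m "chore(config): point at new analytics DB"
bad=$(git rev-parse HEAD)

# Add an inverse commit on top; nothing already published is rewritten.
git revert --no-edit "$bad" > /dev/null

restored=$(cat config.ini)
total=$(git rev-list --count HEAD)
echo "config: $restored, commits: $total"
# On a real repo the next step is a normal push: git push origin main
```

The bad commit is still an ancestor of `HEAD`, so teammates who already pulled it can fast-forward. A `git reset --hard` followed by a force-push would instead remove it from the branch and leave their clones diverged.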
Task 8 — Split a kitchen-sink commit into atomic commits (Hard)
Scenario: You committed everything at once at the end of the day. The single commit mixes three unrelated changes:
$ git show --stat HEAD
8a7b6c5 various changes
src/auth.py | 40 ++++++++++++++ # a real feature: 2FA
src/auth.py | 6 +++--- # also: renamed a variable (refactor)
src/utils.py | 80 ++++++-------- # gofmt-style reformat, no logic change
README.md | 4 ++++ # doc update for the feature
This is unreviewable and unbisectable: a reviewer can't approve the 2FA feature without also signing off on an 80-line reformat, and a future bisect that lands here can't tell which of the three changes broke a test.
Instruction: Split this one commit into three atomic commits — feat (2FA + its doc), refactor (the rename), style (the reformat) — each independently reviewable and revertable. The commit is local and unpushed.
Solution
Rebase interactively (`git rebase -i HEAD~1`) and mark the commit `edit` so the rebase pauses right after applying it. At that point nothing is staged; the commit is simply checked out. Now **un-commit but keep the changes** in the working tree with `git reset HEAD~`. Then stage and commit each logical change separately, using `git add -p` to pick **individual hunks** even within the same file:
# Commit 1: the feature (and only the feature's lines in auth.py + the doc)
git add -p src/auth.py # answer y/n per hunk: stage the 2FA hunks, skip the rename hunk
git add README.md
git commit -m "feat(auth): add TOTP-based two-factor authentication

Adds opt-in 2FA using time-based one-time passwords. Documented in
README under 'Security'. Refs: AUTH-88"
# Commit 2: the refactor (the variable rename, no behavior change)
git add -p src/auth.py # stage the remaining rename hunk
git commit -m "refactor(auth): rename 'tok' to 'session_token' for clarity"
# Commit 3: the mechanical reformat, isolated so reviewers can skim it
git add src/utils.py
git commit -m "style(utils): apply formatter (no logic change)"
# Finish the paused rebase
git rebase --continue
Task 9 — Find a regression with git bisect (Hard)
Scenario: A test that passed last release now fails. Somewhere in the last ~200 commits, someone broke test_checkout_applies_discount. Reading 200 diffs by hand is hopeless.
Instruction: Use git bisect to find the exact commit that introduced the regression. Show both the manual and automated forms, and explain why atomic commits make the result actionable.
Solution
You know a **good** commit (last release tag `v2.3.0`, where the test passed) and a **bad** commit (`HEAD`, where it fails). Bisect binary-searches between them: about 8 steps for 200 commits (log2 of 200 is roughly 7.6) instead of 200.
**Manual bisect:** run `git bisect start`, then `git bisect bad` and `git bisect good v2.3.0`. Git checks out the midpoint commit. Run the test, then tell git the verdict:
# ...build & run the failing test at the checked-out commit...
pytest tests/test_checkout.py::test_checkout_applies_discount
git bisect good # if the test passed here
# or
git bisect bad # if it failed here
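Repeat until git reports the culprit:
7d6e5f4a is the first bad commit
commit 7d6e5f4a
refactor(pricing): inline discount calculation into Cart.total
git bisect reset # ALWAYS run this to return to your original HEAD
**Automated bisect:** hand git a command whose exit code is the verdict (0 = good, 1 to 127 except 125 = bad, 125 = skip this commit):
git bisect start HEAD v2.3.0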
git bisect run pytest -x tests/test_checkout.py::test_checkout_applies_discount
# git walks the whole range unattended and prints the first bad commit.
git bisect reset
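To rehearse the automated form without a real test suite, the sketch below builds a 16-commit history in which commit 11 introduces the regression, then lets `git bisect run` find it. The file `pricing.txt` and the `grep` check are stand-ins for real code and a real test; only the tag name `v2.3.0` comes from the scenario.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

# Build 16 commits; from change 11 onward the "discount" is broken.
i=1
while [ "$i" -le 16 ]; do
  if [ "$i" -lt 11 ]; then state=ok; else state=broken; fi
  echo "discount=$state (change $i)" > pricing.txt
  git add pricing.txt
  git commit -q -m "change $i"
  if [ "$i" -eq 1 ]; then git tag v2.3.0; fi
  if [ "$i" -eq 11 ]; then culprit=$(git rev-parse HEAD); fi
  i=$((i + 1))
done

# Bad commit first, then good. grep exits 0 when the file is healthy and 1
# when it is not, which is the verdict format bisect run expects.
git bisect start HEAD v2.3.0 > /dev/null
git bisect run grep -q "discount=ok" pricing.txt > /dev/null

# Once the search ends, refs/bisect/bad points at the first bad commit.
found=$(git rev-parse refs/bisect/bad)
found_subject=$(git log -1 --format=%s "$found")
git bisect reset > /dev/null 2>&1
echo "first bad commit: $found_subject"
```

The result is only as useful as the commit it names. If the culprit is an atomic commit, the diff to read is small and a `git revert` of it is safe; if it is a kitchen-sink commit like the one in Task 8, bisect has narrowed the search to a haystack.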
Task 10 — Remove a secret committed by accident (Hard)
Scenario: Three commits ago, an AWS secret key was committed in config/settings.py and then "removed" in a later commit. But it still lives in history:
$ git log -S 'AKIA' --oneline
9c8b7a6 chore: remove hardcoded key # "removed" it...
4d3c2b1 feat: add S3 upload # ...but it's still in this commit forever
$ git show 4d3c2b1:config/settings.py | grep AKIA
AWS_SECRET_KEY = "AKIAIOSFODNN7EXAMPLE..." # exposed to anyone who clones
The branch is already pushed to a shared remote.
Instruction: Scrub the secret from all of history and lay out the full incident response. Identify the single most important step.
Solution
**Step 0 (the most important step, do it FIRST): rotate the secret.** The moment a secret hits a remote, treat it as compromised. Anyone who cloned or forked has it; scrubbing history does *not* un-leak it.
# In the AWS console / CLI: deactivate and delete the exposed key, issue a new one.
aws iam delete-access-key --access-key-id AKIAIOSFODNN7EXAMPLE
# Update your secret store / .env (and make sure .env is gitignored — Task 3).
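**Step 1: scrub the secret from history.** The commands below use `git filter-repo`, a separate tool that is not part of core git:
# Put the literal secret (or a regex) in a replacements file: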
echo 'AKIAIOSFODNN7EXAMPLE==>REMOVED' > expressions.txt
git filter-repo --replace-text expressions.txt
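# To strip the file entirely instead of just the value:
# git filter-repo --path config/settings.py --invert-paths
**Step 2: force-push the rewritten history and coordinate a re-clone.** Every existing clone still holds the old commits, so collaborators should re-clone rather than pull.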
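A quick way to convince yourself (and to verify the scrub afterwards) is to search every reachable commit, not just the working tree. This sketch uses only core git and reproduces the two commits from the scenario; the replacement line in `settings.py` is a placeholder. Run the same `git grep` after the rewrite and the count should drop to zero.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"

mkdir config
echo 'AWS_SECRET_KEY = "AKIAIOSFODNN7EXAMPLE"' > config/settings.py
git add config/settings.py
git commit -q -m "feat: add S3 upload"

# The "fix" that only changes the tip of the branch.
echo 'AWS_SECRET_KEY = os.environ["AWS_SECRET_KEY"]' > config/settings.py
git commit -q -a -m "chore: remove hardcoded key"

# The working tree looks clean...
if grep -q AKIA config/settings.py; then in_tree=yes; else in_tree=no; fi

# ...but searching every commit reachable from any ref still finds the key.
leaky_commits=$(git grep -l AKIA $(git rev-list --all) -- config/settings.py \
  | wc -l | tr -d ' ')
echo "in working tree: $in_tree, commits still exposing it: $leaky_commits"
```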
Task 11 — Kill merge noise: rebase a long-lived branch onto main (Hard)
Scenario: A feature branch has been open for two weeks. To "stay current" the author repeatedly merged main into it. The history is a thicket of merge commits:
$ git log --oneline --graph feat/billing
* a9b8c7d Merge branch 'main' into feat/billing
|\
| * 7f6e5d4 (main) fix(api): patch rate limiter
* | 6e5d4c3 feat(billing): add proration
* 5d4c3b2 Merge branch 'main' into feat/billing
|\
| * 4c3b2a1 chore: bump deps
* | 3b2a190 feat(billing): add invoice model
* 2a19087 feat(billing): scaffold billing module
The three real commits (scaffold, invoice model, proration) are buried under "Merge branch 'main'" noise. Reviewers see a tangle instead of a story.
Instruction: Replay the feature's three real commits cleanly on top of the current main, discarding the merge noise, so the branch reads as a linear sequence. The branch is pushed but solo — no one else builds on it.
Solution
Rebase the branch onto the latest `main` (for example `git fetch origin && git rebase origin/main` while on `feat/billing`). Rebase replays *your* commits and drops the merge commits entirely. If conflicts surface (they will, where `main` and your work touched the same lines), resolve each, then:
# edit the conflicted files...
git add <resolved-files>
git rebase --continue # or: git rebase --abort to bail out safely
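The effect on history can be checked by counting merge commits before and after. This sketch rebuilds a small version of the scenario (three feature commits interleaved with two merges from `main`); file names are placeholders and the changes are chosen so that no conflicts arise.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q -b main
git config user.name "Scratch User"
git config user.email "scratch@example.com"
git commit -q --allow-empty -m "chore: initial commit"

git switch -q -c feat/billing
echo scaffold > billing.py && git add billing.py
git commit -q -m "feat(billing): scaffold billing module"

git switch -q main
echo deps > deps.txt && git add deps.txt
git commit -q -m "chore: bump deps"
git switch -q feat/billing
git merge -q --no-edit main                  # merge noise, round 1
echo invoice >> billing.py
git commit -q -a -m "feat(billing): add invoice model"

git switch -q main
echo limiter > api.py && git add api.py
git commit -q -m "fix(api): patch rate limiter"
git switch -q feat/billing
git merge -q --no-edit main                  # merge noise, round 2
echo proration >> billing.py
git commit -q -a -m "feat(billing): add proration"

merges_before=$(git rev-list --count --merges main..HEAD)
git rebase -q main                           # replay only the real commits
merges_after=$(git rev-list --count --merges main..HEAD)
commits_after=$(git rev-list --count main..HEAD)
echo "merges: $merges_before -> $merges_after, commits: $commits_after"
```

Because the branch was already pushed, the rewritten version has to be published with `git push --force-with-lease`, which refuses to overwrite the remote if someone else pushed in the meantime. That is acceptable here only because the branch is solo.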
Task 12 — Convert a long-lived branch workflow to trunk-based (Hard)
Scenario: A team runs long-lived feature branches: each feature gets a branch that lives for weeks, drifts far from main, and produces an agonizing "big bang" merge with hundreds of conflicts at the end. The new-search-ranking branch is now three weeks old, 60 commits deep, and conflicts with half the codebase.
Instruction: Lay out how to convert this team to trunk-based development with short-lived branches and feature flags — and how to land the giant in-flight branch without another big-bang merge.
Solution
**The target workflow:** everyone integrates into `main` (the trunk) at least daily via tiny, short-lived branches (hours to ~2 days), each merged through a fast PR. Unfinished work hides behind a **feature flag**, not behind an unmerged branch.
Self-Assessment
Score yourself honestly. Aim for "could do it under pressure, on a shared repo, without breaking a teammate."
- Task 1–2: I can rewrite a vague message into a subject + body that explains why, and I know the Conventional Commits types and what each bumps in SemVer.
- Task 3: I can write a multi-language `.gitignore`, and I know why `git rm --cached` (not plain `rm`) un-tracks an already-committed file.
- Task 4: I delete commented-out code on sight, because I trust `git log -S` and `git show <commit>:<path>` to retrieve it.
- Task 5: I commit ugly locally but push clean: `git commit --fixup` + `git rebase -i --autosquash`.
- Task 6: `git reset --hard` doesn't scare me, because `git reflog` and `ORIG_HEAD` are my undo history.
- Task 7: I know the golden rule cold: rewrite private history freely, never rewrite shared history. Use `revert` (adds), not `reset` (rewrites), on anything pushed.
- Task 8: I can split a mixed commit with `git rebase -i` (`edit`) + `git reset` + `git add -p`, one logical change per commit.
- Task 9: I can find a regression with `git bisect run`, and I can articulate why atomic commits make its output actionable.
- Task 10: On a leaked secret I rotate first, then `git filter-repo`/BFG, then force-push and coordinate a re-clone.
- Task 11: I can rebase a noisy branch onto `main` and push it safely with `--force-with-lease`, and I know why that's only OK on a solo branch.
- Task 12: I can argue for trunk-based development with feature flags and explain how to land a giant branch in slices without a big-bang merge.
If three or more are unchecked, re-do those tasks in a scratch repo before your next real PR.
Related Topics
- Chapter README — the positive rules behind these drills.
- junior.md · find-bug.md · optimize.md — the rest of this topic's file set.
- Code Reviews — the etiquette of reviewing the clean history these tasks produce.
- Refactoring — the discipline of behavior-preserving change that atomic commits (Task 8) make safely reviewable.
Next: practice on a real repo. The fastest way to internalize this is to take your own messiest branch and run Tasks 5, 8, and 11 on a copy of it.