@MangaD
Created April 23, 2026 13:01
Unified Diff Format


CC0

Disclaimer: ChatGPT generated document.

The unified diff format is one of those deceptively simple tools that quietly powers a huge portion of modern software development. Whether you’re reviewing a pull request, debugging a regression, or applying a patch from a mailing list, you’re almost certainly interacting with unified diffs—often without thinking about it.

Let’s unpack what it is, how it works, and the ecosystem of tools built around it.


What Is the Unified Diff Format?

A unified diff is a textual representation of the differences between two files (or sets of files). It’s designed to be:

  • Human-readable
  • Compact
  • Patchable (i.e., it can be applied automatically)

It originated as an improvement over older formats like context diff and normal diff, combining readability with machine applicability.

You’ll commonly see unified diffs in tools like:

  • Git
  • GNU diffutils
  • Subversion
  • Code review platforms (GitHub, GitLab, etc.)

Basic Structure of a Unified Diff

A unified diff is composed of file headers and hunks.

1. File Header

--- a/file.txt
+++ b/file.txt
  • --- → original file
  • +++ → modified file

The a/ and b/ prefixes are a Git convention. Plain diff -u prints the file names as given on the command line, usually followed by a timestamp.


2. Hunk Header

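@@ -1,4 +1,4 @@

This line describes where the change occurs:

  • -1,4 → original file: the hunk starts at line 1 and spans 4 lines
  • +1,4 → new file: the hunk starts at line 1 and spans 4 lines

The original-side count covers context lines plus removed lines; the new-side count covers context lines plus added lines. When a count is omitted, it defaults to 1.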
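To make the header fields concrete, here is a minimal Python sketch that parses a hunk header with a regular expression. The function name parse_hunk_header and the sample headers are illustrative assumptions, not part of any tool; the only format rule it relies on is that an omitted count means 1.

```python
import re

# Matches "@@ -start[,count] +start[,count] @@", optionally followed by text.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count) for a hunk header."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    # An omitted count means the range covers exactly one line.
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


print(parse_hunk_header("@@ -10,7 +10,8 @@"))        # (10, 7, 10, 8)
print(parse_hunk_header("@@ -3 +3 @@ int main() {"))  # (3, 1, 3, 1)
```

Any text after the closing @@ (Git often puts a function name there) is ignored by this sketch.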

3. Hunk Body

 line 1
 line 2
-line 3
+line three
 line 4

Each line is prefixed with:

  • " " (space): unchanged
  • "-": removed from original
  • "+": added in new version
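The prefixes also determine the line counts that belong in the hunk header. The following Python sketch (count_hunk_lines is a hypothetical helper name) derives both counts from a hunk body: context lines count toward both sides, removed lines toward the original, added lines toward the new file.

```python
def count_hunk_lines(body_lines):
    """Return (old_count, new_count) implied by the lines of a hunk body."""
    old_count = new_count = 0
    for line in body_lines:
        prefix = line[:1]
        if prefix in (" ", ""):
            # Context line. An empty line is treated as empty context here,
            # an assumption for diffs whose trailing spaces were stripped.
            old_count += 1
            new_count += 1
        elif prefix == "-":
            old_count += 1
        elif prefix == "+":
            new_count += 1
        elif prefix == "\\":
            # "\ No newline at end of file" marker: not a content line.
            continue
        else:
            raise ValueError(f"unexpected hunk line: {line!r}")
    return old_count, new_count


body = [" line 1", " line 2", "-line 3", "+line three", " line 4"]
print(count_hunk_lines(body))  # (4, 4)
```

For the body shown above, both sides span 4 lines: three context lines plus one removed or one added line.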

Example: Full Unified Diff

--- a/example.cpp
+++ b/example.cpp
@@ -3,4 +3,4 @@
 int main() {
-    std::cout << "Hello world\n";
+    std::cout << "Hello, world!\n";
     return 0;
 }

This tells you:

  • One line was modified
  • The rest of the file remains unchanged
  • Context lines help you understand where the change occurs
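A diff like this can also be generated programmatically. The sketch below uses Python's standard difflib.unified_diff; the full contents of example.cpp are an assumption made for illustration (an include line and a blank line before main), so the hunk it prints starts at line 1 and carries the default three lines of context.

```python
import difflib

# Assumed full contents of the original file (illustrative only).
old_text = r'''#include <iostream>

int main() {
    std::cout << "Hello world\n";
    return 0;
}
'''
new_text = old_text.replace("Hello world", "Hello, world!")

diff = list(
    difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/example.cpp",
        tofile="b/example.cpp",
    )
)
print("".join(diff), end="")
```

The output has the same three parts described earlier: the ---/+++ file header, one @@ hunk header, and a body with one removed and one added line.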

Why Unified Diff Matters

1. Code Review

Unified diffs are the backbone of modern code review. Tools built on top of Git display diffs to show:

  • What changed
  • Where it changed
  • How it changed

2. Patch Distribution

Before GitHub existed, developers would send patches via email:

diff -u old.c new.c > fix.patch

Then someone else would apply it:

patch < fix.patch

This workflow is still widely used in projects like the Linux kernel.

3. Version Control Internals

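Version control systems rely on diff machinery, though not always on the unified format itself. Git stores full snapshots of file contents rather than diffs (packfiles add binary delta compression on top) and computes unified diffs on demand for:

  • Displaying changes (git diff, git show, git log -p)
  • Exchanging patches (git format-patch, git apply)
  • Reviewing merges, which use the same line-comparison algorithms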

Key Tools in the Ecosystem

1. diff (GNU diffutils)

GNU diffutils provides the canonical diff tool.

Generate unified diff:

diff -u file1.txt file2.txt

Options:

  • -u → unified format
  • -r → recursive (directories)
  • -N → treat missing files as empty

2. patch

The counterpart to diff.

Apply a patch:

patch < changes.diff

It:

  • Matches context lines
  • Applies additions/removals
  • Handles minor mismatches gracefully
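The core of what patch does can be sketched in a few lines of Python. This is a strict, simplified illustration (apply_hunk is a hypothetical name): it applies a single hunk at exactly the stated line and refuses any mismatch, whereas the real patch tool also searches nearby lines and tolerates small context differences.

```python
def apply_hunk(lines, old_start, hunk_body):
    """Apply one hunk to a list of lines; old_start is 1-based."""
    index = max(old_start - 1, 0)
    result = lines[:index]
    for entry in hunk_body:
        prefix, text = entry[:1], entry[1:]
        if prefix in (" ", "-"):
            # Context and removed lines must match the original exactly.
            if index >= len(lines) or lines[index] != text:
                raise ValueError(f"hunk does not apply at line {index + 1}")
            if prefix == " ":
                result.append(text)  # keep context, drop removed lines
            index += 1
        elif prefix == "+":
            result.append(text)
        else:
            raise ValueError(f"unexpected hunk line: {entry!r}")
    # Everything after the hunk is copied through unchanged.
    return result + lines[index:]


original = ["line 1", "line 2", "line 3", "line 4"]
hunk = [" line 1", " line 2", "-line 3", "+line three", " line 4"]
print(apply_hunk(original, 1, hunk))
# ['line 1', 'line 2', 'line three', 'line 4']
```

Checking the context before writing anything is what lets a patch fail loudly instead of silently editing the wrong place.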

3. Git

Git builds heavily on unified diffs.

Common commands:

git diff
git show
git log -p

Customization:

git diff --unified=10

This sets the context to 10 lines on each side of a change (the default is 3), which helps when a change is hard to read in isolation.


4. colordiff

A wrapper around diff that colorizes its output (added, removed, and header lines). It does not highlight the syntax of the file contents.

colordiff -u file1 file2

5. diffstat

Summarizes changes:

diffstat patch.diff

Output example:

 file.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
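The counting behind such a summary is straightforward. The Python sketch below (a hypothetical diffstat function, not the real tool's implementation) tallies insertions and deletions per file. It tracks the line counts from each hunk header so that a removed line whose content starts with "--" is not mistaken for a file header.

```python
import re

HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def diffstat(diff_lines):
    """Return {file name: [insertions, deletions]} for a unified diff."""
    stats = {}
    current = None
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_lines:
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                stats[current][0] += 1
                new_left -= 1
            elif line.startswith("-"):
                stats[current][1] += 1
                old_left -= 1
            elif not line.startswith("\\"):  # context line
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            # Use the new-side name, dropping any tab-separated timestamp.
            current = line[4:].split("\t")[0].strip()
            stats.setdefault(current, [0, 0])
            continue
        match = HUNK.match(line)
        if match and current is not None:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
    return stats


sample = [
    "--- a/file.cpp",
    "+++ b/file.cpp",
    "@@ -1,3 +1,3 @@",
    " int x = 0;",
    "-int y = 1;",
    "+int y = 2;",
    " int z = 3;",
]
print(diffstat(sample))  # {'b/file.cpp': [1, 1]}
```

Unlike the real diffstat, this sketch keeps the a/ or b/ prefix in the file name and draws no histogram.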

6. GUI & Advanced Diff Tools

  • Meld
  • KDiff3
  • Beyond Compare

These compare files or directories directly and show the differences side by side, rather than rendering a unified diff as text.


Advanced Concepts

Context Lines

Unified diff includes surrounding unchanged lines to provide context. This is crucial because:

  • It allows patch to apply changes even if line numbers shift slightly
  • It improves readability

You can control this:

diff -U3 file1 file2   # 3 lines of context (also the default for -u)
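The effect of the context setting on the hunk header can be seen with Python's difflib.unified_diff, whose n parameter plays the same role as -U. The ten-line input below is made up for illustration; only line 5 changes.

```python
import difflib

old = [f"line {i}\n" for i in range(1, 11)]
new = old.copy()
new[4] = "line five\n"  # change line 5 only

headers = {}
for context in (0, 1, 3):
    diff = list(
        difflib.unified_diff(old, new, "a/file.txt", "b/file.txt", n=context)
    )
    # diff[0] and diff[1] are the file header; diff[2] is the hunk header.
    headers[context] = diff[2].strip()
    print(f"n={context}: {headers[context]} ({len(diff) - 3} body lines)")

# n=0: @@ -5 +5 @@ (2 body lines)
# n=1: @@ -4,3 +4,3 @@ (4 body lines)
# n=3: @@ -2,7 +2,7 @@ (8 body lines)
```

More context makes the hunk start earlier and span more lines, which gives patch more material to match against at the cost of a larger diff.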

Fuzz Factor in patch

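When applying patches, patch can tolerate small mismatches:

  • Shifted lines: the hunk is found at a different line number than the header states (reported as an offset)
  • Minor differences in context: a few leading and trailing context lines may be ignored when matching

The second mechanism is called fuzz. GNU patch defaults to a fuzz factor of 2, adjustable with -F. It makes patches more robust, but a hunk applied with fuzz can land in the wrong place, so the result is worth reviewing.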
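A rough Python sketch of this matching strategy follows. It is an approximation under stated assumptions, not GNU patch's actual algorithm: locate_hunk, max_fuzz, and max_offset are hypothetical names, and the search order (exact context at nearby offsets first, then progressively dropping outer context lines) is a simplification.

```python
def locate_hunk(lines, hunk_body, old_start, max_fuzz=2, max_offset=50):
    """Return (index, fuzz) where the hunk's old side matches, else None."""
    # The old side of a hunk is its context lines plus its removed lines.
    old_side = [(entry[:1], entry[1:]) for entry in hunk_body
                if entry[:1] in (" ", "-")]
    texts = [text for _, text in old_side]
    prefixes = [prefix for prefix, _ in old_side]

    # Count the context lines at the very start and end of the hunk.
    leading = next((i for i, p in enumerate(prefixes) if p != " "),
                   len(prefixes))
    trailing = next((i for i, p in enumerate(reversed(prefixes)) if p != " "),
                    len(prefixes))

    expected = old_start - 1  # hunk headers are 1-based
    for fuzz in range(max_fuzz + 1):
        # Fuzz N ignores up to N context lines at each end of the hunk.
        skip_head = min(fuzz, leading)
        skip_tail = min(fuzz, trailing)
        needle = texts[skip_head:len(texts) - skip_tail]
        if not needle:
            break
        # Try the expected position first, then increasingly distant offsets.
        for distance in range(max_offset + 1):
            offsets = (0,) if distance == 0 else (distance, -distance)
            for offset in offsets:
                start = expected + skip_head + offset
                if start < 0 or start + len(needle) > len(lines):
                    continue
                if lines[start:start + len(needle)] == needle:
                    return start - skip_head, fuzz
    return None


hunk = [" line 1", " line 2", "-line 3", "+line three", " line 4", " line 5"]
shifted = ["intro", "line 1", "line 2", "line 3", "line 4", "line 5"]
edited = ["LINE 1", "line 2", "line 3", "line 4", "line 5"]
print(locate_hunk(shifted, hunk, 1))  # (1, 0): offset of one line, no fuzz
print(locate_hunk(edited, hunk, 1))   # (0, 1): matched after dropping outer context
```

In the second case the first context line no longer matches, so the hunk is only found once the outermost context lines are ignored. That is exactly the situation where a human should double-check the result.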


Binary Files

Unified diff is primarily for text files.

For binaries:

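  • GNU diff and Git normally report only that the files differ (a line of the form "Binary files X and Y differ")
  • Git can also emit an encoded binary patch with git diff --binary, which git apply understands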

Rename & Copy Detection (Git)

Git extends unified diff with metadata:

rename from old.cpp
rename to new.cpp

These are not part of the original unified diff spec but are widely used.


Combined Diffs (Merge Conflicts)

Git can produce combined diffs for merges:

git diff --cc

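These show changes relative to every parent at once, with one prefix column per parent and hunk headers that start with @@@ for a two-parent merge. They are harder to read than ordinary diffs but useful when resolving or reviewing merges.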


Limitations

Despite its usefulness, unified diff has constraints:

  • Not ideal for binary data
  • Can be ambiguous if context is insufficient
  • Not structurally aware (e.g., doesn’t understand C++ ASTs)

That’s why modern tools sometimes layer semantic analysis on top.


Practical Workflow Example

1. Create a patch

git diff > feature.patch

2. Share it

  • Email
  • Upload
  • Attach to issue tracker

3. Apply it

git apply feature.patch

or

patch -p1 < feature.patch

Final Thoughts

The unified diff format is a perfect example of Unix philosophy in action:

Simple, composable, text-based, and incredibly powerful.

Even in a world of sophisticated IDEs and AI-assisted development, unified diffs remain:

  • The lingua franca of code changes
  • A bridge between humans and machines
  • A core abstraction in version control

If you’re working deeply with systems programming, distributed systems, or tooling, mastering diffs—and even generating them programmatically—can give you a real edge.

