Diff and Merge

You often need to compare versions of files or directory trees. Diff tools do exactly that, and they are a fundamental part of any software developer's toolkit. Diff is also important to content authors of all kinds: OpenOffice includes a version comparison tool, for example, and word processors commonly highlight and track changes.

Unix Diff

The Unix diff command is helpful for examining differences between text files. It can also compare entire directory trees very quickly. When a visual tool is desired, KDiff3 is a powerful choice.

Sometimes you want to compare files at the level of their 'digital fingerprint' (e.g. an MD5 checksum). See File Integrity. With rsync, the --checksum option compares files by checksum, and adding --dry-run gives you a report of the differences without actually changing anything.
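As a rough illustration of the fingerprint idea, the sketch below compares files by MD5 checksum rather than line by line. The scratch files and the fingerprint and compare_fingerprints helpers are hypothetical names, and the sketch assumes the GNU coreutils md5sum command is installed. The commented rsync line at the end shows a directory-level report using the options mentioned above.

```shell
# Scratch files for the demonstration (hypothetical names).
workdir=$(mktemp -d)
printf 'hello\n'  > "$workdir/a.txt"
printf 'hello\n'  > "$workdir/b.txt"
printf 'hello!\n' > "$workdir/c.txt"

# Print only the checksum column of md5sum's output.
fingerprint() {
    md5sum < "$1" | awk '{print $1}'
}

# Report whether two files have the same fingerprint.
compare_fingerprints() {
    if [ "$(fingerprint "$1")" = "$(fingerprint "$2")" ]; then
        echo same
    else
        echo different
    fi
}

ab=$(compare_fingerprints "$workdir/a.txt" "$workdir/b.txt")
ac=$(compare_fingerprints "$workdir/a.txt" "$workdir/c.txt")
echo "a vs b: $ab"
echo "a vs c: $ac"

# Directory-level equivalent with rsync (report only, nothing is copied):
#   rsync --recursive --checksum --dry-run --itemize-changes dirA/ dirB/
```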

Diff a directory tree, ignoring .svn control files
diff -rq --exclude .svn path/to/directoryA path/to/directoryB

E.g.

diff -rq -x .svn ./work/myproject/ ./work/myproject-copy2/
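To see what kind of report this produces, the following sketch builds two small throwaway trees and compares them. All directory and file names are made up for the demonstration, and the exact wording of the report lines depends on your diff implementation.

```shell
# Build two small trees that differ in one file, plus .svn control files.
tmp=$(mktemp -d)
mkdir -p "$tmp/A/.svn" "$tmp/B/.svn"
echo one > "$tmp/A/same.txt";    echo one > "$tmp/B/same.txt"
echo two > "$tmp/A/changed.txt"; echo 2   > "$tmp/B/changed.txt"
echo x   > "$tmp/A/only-in-a.txt"
echo a   > "$tmp/A/.svn/entries"; echo b  > "$tmp/B/.svn/entries"

# -r recurses, -q reports only whether files differ, -x skips matching names.
# diff exits with status 1 when it finds differences, hence the "|| true".
report=$(diff -rq -x .svn "$tmp/A" "$tmp/B" || true)
echo "$report"
```

The report should mention changed.txt and only-in-a.txt, and nothing under .svn even though the two entries files differ.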

Piping

The diff command is fast, but its output often makes the exact difference hard to spot. You might try piping the output of the diff command to a graphical tool:

diff fileA fileB | kdiff3 -

The dash argument tells kdiff3 to read from STDIN, so it receives the output of the preceding command in the pipeline.


Or pipe the output to awk to print a list of just what's in the 'left side':

diff --suppress-common-lines --side-by-side modules.list.a.txt modules.list.b.txt | awk '{print $1}'
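One detail to watch for: in side-by-side output, a line that exists only on the right starts with a ">" marker, so awk reports ">" as the first field for it. The sketch below, which assumes GNU diff and uses made-up list contents, filters those lines out so that only left-side entries remain.

```shell
# Two small sample lists (contents are made up for the demonstration).
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n'  > "$tmp/modules.list.a.txt"
printf 'alpha\ngamma\ndelta\n' > "$tmp/modules.list.b.txt"

# Right-only lines begin with ">", so skip any line whose first field is ">".
# diff exits with status 1 when the files differ, hence the "|| true".
left_only=$(diff --suppress-common-lines --side-by-side \
    "$tmp/modules.list.a.txt" "$tmp/modules.list.b.txt" \
    | awk '$1 != ">" {print $1}' || true)
echo "$left_only"
```

Lines that differ on both sides (marked with "|") still pass the filter, and only their first word is printed, so this works best for one-word-per-line lists such as module names.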

Process Substitution

By using process substitution, we can operate on the output of commands without first saving it to a file.

drush pml --status=enabled --pipe > modules.list.txt
# make a bunch of changes to what modules are enabled, perhaps restoring from a backup, and compare to our original list

diff --suppress-common-lines --side-by-side <(drush pml --status=enabled --pipe) modules.list.txt

This technique comes in handy for comparing two directories at their top level, for example the MediaWiki extensions directory of two different installations:

diff --suppress-common-lines --side-by-side <(ls -1 /var/www/wiki.example.net/www/w/extensions/) <(ls -1 /var/www/wiki.example.org/www/w/extensions/)
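A self-contained version of the same idea can be tried without a MediaWiki install. The sketch below creates two throwaway directory trees (siteA, siteB and the subdirectory names are placeholders) and diffs their listings. Process substitution is a feature of bash, zsh and ksh rather than plain POSIX sh, so this assumes the script runs under bash.

```shell
#!/usr/bin/env bash
# Two throwaway "installations" with slightly different extension sets.
tmp=$(mktemp -d)
mkdir -p "$tmp/siteA/extensions/Cite" "$tmp/siteA/extensions/ParserFunctions"
mkdir -p "$tmp/siteB/extensions/Cite" "$tmp/siteB/extensions/Gadgets"

# Each <( ... ) is presented to diff as a readable file name,
# so no temporary listing files are needed.
listing_diff=$(diff <(ls -1 "$tmp/siteA/extensions") \
                    <(ls -1 "$tmp/siteB/extensions") || true)
echo "$listing_diff"
```

Lines prefixed with "<" come from the first listing and lines prefixed with ">" from the second.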


Compressed files

zdiff runs diff directly on compressed files, in the same way that zcat and zgrep work on compressed input.
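A minimal sketch, assuming the zdiff script that ships with the gzip package is installed; the log file names and contents are invented for the demonstration:

```shell
# Create two gzip-compressed files that differ in their second line.
tmp=$(mktemp -d)
printf 'line one\nline two\n' | gzip > "$tmp/old.log.gz"
printf 'line one\nline 2\n'   | gzip > "$tmp/new.log.gz"

# zdiff decompresses both inputs on the fly and hands them to diff.
# Like diff, it exits with status 1 when the contents differ.
zdiff_out=$(zdiff "$tmp/old.log.gz" "$tmp/new.log.gz" || true)
echo "$zdiff_out"
```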


KDiff3

Tool for Comparison and Merge of Files and Directories

Configuring

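You can configure KDiff3 to ignore version control keywords, or do other fancy things to assist the matching step of a KDiff3 run, by giving it a preprocessor command. This is one of the most powerful features of KDiff3. For example, this command reduces expanded keywords to their bare form: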

sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'

To read up on preprocessing commands, enter the following in Konqueror:

help:/kdiff3/preprocessors.html
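To check what the sed preprocessor command shown above does before putting it into the KDiff3 settings, you can run it on sample lines in a shell. This is only a sketch: the strip_keywords wrapper and the sample lines are hypothetical, and the \| alternation assumes GNU sed.

```shell
# Wrap the preprocessor command so it can be applied to test input.
strip_keywords() {
    sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'
}

# Two versions of the same line that differ only in the expanded keyword.
line_a='// $Revision: 1.5 $'
line_b='// $Revision: 1.9 $'

normalized_a=$(printf '%s\n' "$line_a" | strip_keywords)
normalized_b=$(printf '%s\n' "$line_b" | strip_keywords)
echo "$normalized_a"
echo "$normalized_b"
```

Both lines come out identical after preprocessing, which is why KDiff3 no longer reports them as a difference.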

Manual Alignment of Files

A very useful feature of KDiff3 is the ability to set manual diff markers, which tell the tool where to align the files that you are comparing.

  1. Highlight text in file A
  2. Press Ctrl + Y
  3. Highlight the same text in file B
  4. Press Ctrl + Y
  5. Highlight the same text in file C
  6. Press Ctrl + Y

You will notice the orange margin that denotes a manual mark. The alignment produced by the differencing algorithm should now improve. You can erase all manual marks with Ctrl+Shift+Y.

Surrounding a problematic area with one manual mark above it and another below it can help the differencing algorithm pick up the exact differences.

Edit in the Merge Window

Another nice thing about KDiff3 is the ability to edit in the merge window. You can select content from file C (press Ctrl + 3), then place the cursor in the merge/edit window and manually adjust the text as needed.

3-way Merge

KDiff3 can also do a 3-way merge between files with different names. Click on entries in the directory listing window of a directory compare: the first click creates a small 'A' marker, the next click creates a 'B' marker, and so on, so that you can compare 'new.txt' to 'new.bak' and '/old/new.txt'.

Command Line

KDiff3 can also be run from the command line; see the help output for details:

kdiff3 --help
Usage: kdiff3 [Qt-options] [KDE-options] [options] [File1] [File2] [File3]

Subversion Diff

Figure (Screenshot-kdesvn-settings.png): How to configure an external diff in kdesvn

See the Merging topic, which gives an example. Your Subversion client includes a diff command, and it can also use your favorite external diff tool to perform the diff.

The following example diffs two repository URLs across the network, using the command-line Subversion client to invoke an external visual diff program instead of the built-in diff tool:

svn diff --diff-cmd kdiff3 \
http://svn.example.com/svn/myrepo/path/to/file.txt@7560 \
http://svn.example.com/svn/myrepo/path/to/file.txt@HEAD

GUI Subversion clients such as kdesvn or TortoiseSVN have the same capability: built-in differencing, or an optional setting to invoke an external differencing tool. For kdesvn it's 'Settings'->'Configure kdesvn'->'Diff & Merge' (screenshot provided).

svn help diff
diff (di): Display the differences between two revisions or paths.
usage: 1. diff [-c M | -r N[:M]] [TARGET[@REV]...]
       2. diff [-r N[:M]] --old=OLD-TGT[@OLDREV] [--new=NEW-TGT[@NEWREV]] \
               [PATH...]
       3. diff OLD-URL[@OLDREV] NEW-URL[@NEWREV]

  1. Display the changes made to TARGETs as they are seen in REV between
     two revisions.  TARGETs may be all working copy paths or all URLs.
     If TARGETs are working copy paths, N defaults to BASE and M to the
     working copy; if URLs, N must be specified and M defaults to HEAD.
     The '-c M' option is equivalent to '-r N:M' where N = M-1.
     Using '-c -M' does the reverse: '-r M:N' where N = M-1.

  2. Display the differences between OLD-TGT as it was seen in OLDREV and
     NEW-TGT as it was seen in NEWREV.  PATHs, if given, are relative to
     OLD-TGT and NEW-TGT and restrict the output to differences for those
     paths.  OLD-TGT and NEW-TGT may be working copy paths or URL[@REV].
     NEW-TGT defaults to OLD-TGT if not specified.  -r N makes OLDREV default
     to N, -r N:M makes OLDREV default to N and NEWREV default to M.

  3. Shorthand for 'svn diff --old=OLD-URL[@OLDREV] --new=NEW-URL[@NEWREV]'

  Use just 'svn diff' to display local modifications in a working copy.

Valid options:
  -r [--revision] arg      : ARG (some commands also take ARG1:ARG2 range)
                             A revision argument can be one of:
                                NUMBER       revision number
                                '{' DATE '}' revision at start of the date
                                'HEAD'       latest in repository
                                'BASE'       base rev of item's working copy
                                'COMMITTED'  last commit at or before BASE
                                'PREV'       revision just before COMMITTED
  -c [--change] arg        : the change made by revision ARG (like -r ARG-1:ARG)
                             If ARG is negative this is like -r ARG:ARG-1
  --old arg                : use ARG as the older target
  --new arg                : use ARG as the newer target
  -N [--non-recursive]     : operate on single directory only
  --diff-cmd arg           : use ARG as diff command
  -x [--extensions] arg    : Default: '-u'. When Subversion is invoking an
                             external diff program, ARG is simply passed along
                             to the program. But when Subversion is using its
                             default internal diff implementation, or when
                             Subversion is displaying blame annotations, ARG
                             could be any of the following:
                                -u (--unified):
                                   Output 3 lines of unified context.
                                -b (--ignore-space-change):
                                   Ignore changes in the amount of white space.
                                -w (--ignore-all-space):
                                   Ignore all white space.
                                --ignore-eol-style:
                                   Ignore changes in EOL style
  --no-diff-deleted        : do not print differences for deleted files
  --notice-ancestry        : notice ancestry when calculating differences
  --summarize              : show a summary of the results
  --force                  : force operation to run
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG
  --no-auth-cache          : do not cache authentication tokens
  --non-interactive        : do no interactive prompting
  --config-dir arg         : read user configuration files from directory ARG

svn diff --diff-cmd kdiff3 fileA.txt fileB.txt
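Subversion is documented as calling the --diff-cmd program with GNU diff style arguments (an options flag, two -L label pairs, then the two file paths), so a graphical tool that does not understand those options may need a small wrapper that picks out just the two paths. The sketch below assumes that seven-argument layout; svn_diff_wrapper and the DIFF_TOOL variable are hypothetical names, not part of Subversion or KDiff3.

```shell
# Sketch of a wrapper for svn's --diff-cmd option.
# Assumed invocation by svn:
#   <cmd> -u -L <left label> -L <right label> <left file> <right file>
svn_diff_wrapper() {
    left=$6    # sixth argument: path of the older (left) file
    right=$7   # seventh argument: path of the newer (right) file

    # DIFF_TOOL is a hypothetical override; kdiff3 is used when it is unset.
    "${DIFF_TOOL:-kdiff3}" "$left" "$right"
}

# Dry run with echo standing in for the graphical tool.
DIFF_TOOL=echo
wrapper_out=$(svn_diff_wrapper -u -L "file.txt (revision 7560)" \
    -L "file.txt (working copy)" /tmp/left.txt /tmp/right.txt)
echo "$wrapper_out"
```

To use the idea for real, the function body would go into an executable script that calls it with "$@", and the path of that script would be passed to svn diff --diff-cmd.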

Sharing Diffs

Part of "code review" is sharing your program changes. The most common way to share a diff is through a repository browsing tool, because it has built-in diff capability and is URI addressable, so you can just share the link. Git has gitweb.

An equally simple method of sharing among developers is to share the svn command line you ran, which lets any other developer see what you are seeing locally.