There are many times when you need to compare versions of files or directory trees. Diff tools are the answer, and they are a fundamental component of any software developer's toolkit. Diffing is also important to content authors of all kinds. For example, OpenOffice includes a version comparison tool, and it is commonplace to highlight and track changes in word processing tools.
Unix Diff
The Unix diff command is quite helpful in examining differences between (text) files. It is also useful for comparing entire directory trees very quickly. When a visual tool is desired, KDiff3 is a very powerful choice.
Sometimes you want to compare files at the level of their 'digital fingerprint' (e.g. md5 checksum). See File Integrity. You can use the --checksum option in rsync, together with the --dry-run option, to get a report of differences without actually changing anything.
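As a rough sketch of fingerprint-level comparison, the snippet below builds two small scratch trees, records an md5 checksum for every file, and diffs the two checksum lists. The directory and file names are invented for illustration, and md5sum is assumed to be the GNU coreutils tool. The rsync call at the end uses the options mentioned above and only runs if rsync happens to be installed.

```shell
# Hypothetical scratch trees; all names here are made up for this sketch.
work=$(mktemp -d)
mkdir -p "$work/a" "$work/b"
printf 'hello\n' > "$work/a/same.txt"
printf 'hello\n' > "$work/b/same.txt"
printf 'one\n'   > "$work/a/changed.txt"
printf 'two\n'   > "$work/b/changed.txt"

# Fingerprint each tree from inside its own root so the recorded paths line up.
(cd "$work/a" && find . -type f -exec md5sum {} + | sort -k 2) > "$work/a.md5"
(cd "$work/b" && find . -type f -exec md5sum {} + | sort -k 2) > "$work/b.md5"

# Every line that differs between the lists is a file whose content differs.
checksum_report=$(diff "$work/a.md5" "$work/b.md5")
printf '%s\n' "$checksum_report"

# The rsync route: compare by checksum, change nothing, list what would be copied.
if command -v rsync >/dev/null 2>&1; then
    rsync --recursive --checksum --dry-run --itemize-changes "$work/a/" "$work/b/"
fi
```

Only changed.txt should show up in the checksum report, since same.txt has identical content in both trees.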
Diff a directory tree, ignoring .svn control files:
diff -rq --exclude .svn path/to/directoryA path/to/directoryB
For example, using the short form -x:
diff -rq -x .svn ./work/myproject/ ./work/myproject-copy2/
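To see what such a report looks like, here is a minimal sketch that runs the same command against two throwaway trees. The projA and projB directories and their contents are hypothetical, created only for the demonstration.

```shell
# Two hypothetical project copies, each with its own .svn control directory.
work=$(mktemp -d)
mkdir -p "$work/projA/.svn" "$work/projB/.svn"
printf 'v1\n' > "$work/projA/README.txt"
printf 'v2\n' > "$work/projB/README.txt"
printf 'x\n'  > "$work/projA/only-in-a.txt"
printf 'a\n'  > "$work/projA/.svn/entries"
printf 'b\n'  > "$work/projB/.svn/entries"

# -r recurses, -q reports only whether files differ, --exclude skips .svn.
tree_report=$(diff -rq --exclude .svn "$work/projA" "$work/projB")
printf '%s\n' "$tree_report"
```

The report should mention README.txt as differing and only-in-a.txt as present in one tree only, while the differing .svn/entries files are left out.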
Piping
The diff command is fast, but the output is often hard to read when you are looking for the exact difference. You might try piping the output of the diff command to a graphical tool:
diff fileA fileB | kdiff3 -
By passing a dash (-) to kdiff3 in place of a file name, you're telling it to read from STDIN, so it uses the output of the command being piped to it.
Or pipe to awk to print a list of just what's in the 'left side':
diff --suppress-common-lines --side-by-side modules.list.a.txt modules.list.b.txt |awk '{print $1}'
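One thing to watch for with this pipeline: for lines that exist only in the right-hand file, GNU diff's side-by-side output leaves the left column empty, so the first field awk sees is the ">" marker. The sketch below filters those rows out. The list contents are invented sample data, and the filter assumes no real entry is literally ">".

```shell
# Hypothetical module lists standing in for the two files being compared.
work=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n'  > "$work/modules.list.a.txt"
printf 'alpha\ngamma\ndelta\n' > "$work/modules.list.b.txt"

# Keep the first column, skipping rows where it is just the ">" marker
# (those rows describe lines found only in the right-hand file).
left_only=$(diff --suppress-common-lines --side-by-side \
    "$work/modules.list.a.txt" "$work/modules.list.b.txt" \
    | awk '$1 != ">" {print $1}')
printf '%s\n' "$left_only"
```

With this sample data the only line printed is beta, the one entry present on the left but not on the right.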
Process Substitution
By using process substitution, we can operate on the output of commands without the need for saving to a file.
drush pml --status=enabled --pipe > modules.list.txt
# make a bunch of changes to what modules are enabled, perhaps restoring from a backup, and compare to our original list
diff --suppress-common-lines --side-by-side <(drush pml --status=enabled --pipe) modules.list.txt
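The same idea works with a process substitution on both sides, which is handy when neither input is in a comparable form yet. The following sketch assumes bash (process substitution is not available in plain POSIX sh) and uses made-up module names in place of real drush output.

```shell
#!/usr/bin/env bash
# Hypothetical unsorted module lists; the names are placeholders only.
work=$(mktemp -d)
printf 'views\nctools\ntoken\n'   > "$work/before.txt"
printf 'token\nviews\npathauto\n' > "$work/after.txt"

# Each <(...) runs a command and hands diff its output as if it were a file,
# so both lists are sorted on the fly without writing temporary files.
module_changes=$(diff <(sort "$work/before.txt") <(sort "$work/after.txt"))
printf '%s\n' "$module_changes"
```

Here diff should report ctools as removed and pathauto as added, regardless of the original ordering.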
Compressed files
zdiff is a tool that can be used directly on compressed files (like zcat, zgrep, etc.).
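A small sketch of zdiff in use, assuming the gzip package (which normally ships zdiff) is installed. The log file names and contents are hypothetical.

```shell
# Two hypothetical log files that differ on their second line.
work=$(mktemp -d)
printf 'line one\nline two\n' > "$work/old.log"
printf 'line one\nline 2\n'   > "$work/new.log"

# Compress both; gzip replaces each file with a .gz version.
gzip "$work/old.log" "$work/new.log"

# zdiff decompresses on the fly and passes the contents to diff.
zdiff_report=$(zdiff "$work/old.log.gz" "$work/new.log.gz")
printf '%s\n' "$zdiff_report"
```

The output is an ordinary diff of the uncompressed contents, so the changed second line appears just as it would for plain text files.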
KDiff3
Tool for Comparison and Merge of Files and Directories
Configuring
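You can configure KDiff3 to ignore version control keywords, or do other fancy things to assist the matching portion of the KDiff3 execution. This is one of the most powerful features of KDiff3. For example, a preprocessor command like this one collapses expanded keywords back to their bare form before the files are compared: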
sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'
Enter help:/kdiff3/preprocessors.html in Konqueror to read up on preprocessing commands.
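To check what the sed preprocessor command above does before wiring it into KDiff3, you can run it on sample lines in a shell. This sketch assumes GNU sed (the \| alternation is a GNU extension in basic regular expressions); the revision numbers and the normalize_keywords helper name are made up.

```shell
# Wrap the preprocessor command in a helper so it can be applied to any input.
normalize_keywords() {
    sed 's/\$\(Revision\|Author\|Log\|Header\|Date\).*\$/\$\1\$/'
}

# Two hypothetical lines that differ only in the expanded keyword value.
old_line=$(printf '%s\n' '# $Revision: 1041 $' | normalize_keywords)
new_line=$(printf '%s\n' '# $Revision: 1187 $' | normalize_keywords)
printf '%s\n%s\n' "$old_line" "$new_line"
```

Both lines come out as '# $Revision$', so a comparison run after this preprocessing no longer flags the keyword line as a difference.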
Manual Alignment of Files
One of the very useful features of KDiff3 is to be able to set manual diff markers -- effectively telling the tool where to align the files that you are comparing.
- Highlight text in file A
- Press Ctrl + Y
- Highlight the same text in file B
- Press Ctrl + Y
- Highlight the same text in file C
- Press Ctrl + Y
You will notice the orange margin that denotes a manual mark. The differencing algorithm should now align the files better. You can erase all manual marks with Ctrl+Shift+Y.
Surrounding a problematic area with one manual mark above it and another below it can help the differencing algorithm pick up the exact differences.
Edit in the Merge Window
Another nice thing about KDiff3 is the ability to edit in the merge window. This means that you can select content from file C (Press Ctrl + 3), and then place the cursor in the merge/edit window and manually adjust the text as needed.
3-way Merge
Another nice thing about KDiff3 is the ability to do a 3-way merge by nominating different file names. Just click in the directory listing window of a directory compare, and you will notice that the first click creates a small 'A' marker, the next click creates a 'B' marker, and so on, so that you can compare 'new.txt' to 'new.bak' and '/old/new.txt'.
Command Line
KDiff3 can be run from the command line; see the help for details:
kdiff3 --help
Usage: kdiff3 [Qt-options] [KDE-options] [options] [File1] [File2] [File3]
Subversion Diff
See the Merging topic, which gives an example. Your Subversion client includes a diff command, and it can also use your favorite external diff tool to perform the diff.
Here is an example of a diff across the network between two repository URLs, using the command-line Subversion client to invoke an external visual diff program instead of the built-in diff tool:
svn diff --diff-cmd kdiff3 \
http://svn.example.com/svn/myrepo/path/to/file.txt@7560 \
http://svn.example.com/svn/myrepo/path/to/file.txt@HEAD
GUI Subversion clients such as kdesvn or TortoiseSVN have the same capability: built-in differencing, or an optional setting to invoke an external differencing tool. For kdesvn, it's 'Settings'->'Configure kdesvn'->'Diff & Merge'.
svn help diff
diff (di): Display the differences between two revisions or paths.
usage: 1. diff [-c M | -r N[:M]] [TARGET[@REV]...]
       2. diff [-r N[:M]] --old=OLD-TGT[@OLDREV] [--new=NEW-TGT[@NEWREV]] \
               [PATH...]
       3. diff OLD-URL[@OLDREV] NEW-URL[@NEWREV]
  1. Display the changes made to TARGETs as they are seen in REV between
     two revisions. TARGETs may be all working copy paths or all URLs.
     If TARGETs are working copy paths, N defaults to BASE and M to the
     working copy; if URLs, N must be specified and M defaults to HEAD.
     The '-c M' option is equivalent to '-r N:M' where N = M-1.
     Using '-c -M' does the reverse: '-r M:N' where N = M-1.
  2. Display the differences between OLD-TGT as it was seen in OLDREV and
     NEW-TGT as it was seen in NEWREV. PATHs, if given, are relative to
     OLD-TGT and NEW-TGT and restrict the output to differences for those
     paths. OLD-TGT and NEW-TGT may be working copy paths or URL[@REV].
     NEW-TGT defaults to OLD-TGT if not specified. -r N makes OLDREV default
     to N, -r N:M makes OLDREV default to N and NEWREV default to M.
  3. Shorthand for 'svn diff --old=OLD-URL[@OLDREV] --new=NEW-URL[@NEWREV]'
  Use just 'svn diff' to display local modifications in a working copy.
Valid options:
  -r [--revision] arg      : ARG (some commands also take ARG1:ARG2 range)
                             A revision argument can be one of:
                                NUMBER       revision number
                                '{' DATE '}' revision at start of the date
                                'HEAD'       latest in repository
                                'BASE'       base rev of item's working copy
                                'COMMITTED'  last commit at or before BASE
                                'PREV'       revision just before COMMITTED
  -c [--change] arg        : the change made by revision ARG (like -r ARG-1:ARG)
                             If ARG is negative this is like -r ARG:ARG-1
  --old arg                : use ARG as the older target
  --new arg                : use ARG as the newer target
  -N [--non-recursive]     : operate on single directory only
  --diff-cmd arg           : use ARG as diff command
  -x [--extensions] arg    : Default: '-u'. When Subversion is invoking an
                             external diff program, ARG is simply passed along
                             to the program. But when Subversion is using its
                             default internal diff implementation, or when
                             Subversion is displaying blame annotations, ARG
                             could be any of the following:
                                -u (--unified):
                                   Output 3 lines of unified context.
                                -b (--ignore-space-change):
                                   Ignore changes in the amount of white space.
                                -w (--ignore-all-space):
                                   Ignore all white space.
                                --ignore-eol-style:
                                   Ignore changes in EOL style
  --no-diff-deleted        : do not print differences for deleted files
  --notice-ancestry        : notice ancestry when calculating differences
  --summarize              : show a summary of the results
  --force                  : force operation to run
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG
  --no-auth-cache          : do not cache authentication tokens
  --non-interactive        : do no interactive prompting
  --config-dir arg         : read user configuration files from directory ARG
svn diff --diff-cmd kdiff3 fileA.txt fileB.txt
Sharing Diffs
Part of "code review" is to share your program changes. The most common way to share a diff is to use a repository browsing tool, because such tools have built-in diff capability and are URI addressable, so you can just share the link. Git has gitweb.
An equally simple method of sharing among developers is to share the command-line Subversion client command, which allows any other developer to see what you are seeing locally.