
A side-by-side HTML diff program for Windows


shtmldiff.py is a Python implementation of a side-by-side html diff program designed to run in Windows. It is based on the excellent IETF rfcdiff tool.

New 2021-01-20: Updated to Python 3. See Changes in v1.3.

An Example | Default behaviour | Syntax | A Larger Example | Rfcdiff Tool | Downloads | Contact us

An Example

shtmldiff.py lao.txt tzu.txt
        lao.txt                                              |  tzu.txt
     1  The Way that can be told of is not the eternal Way;  |
     2  The name that can be named is not the eternal name.  |
     3  The Nameless is the origin of Heaven and Earth;      |  The Nameless is the origin of Heaven and Earth;   1
     4  The Named is the mother of all things.               |  The named is the mother of all things.            2
                                                             |                                                    3
     5  Therefore let there always be non-being,             |  Therefore let there always be non-being,          4
     6   so we may see their subtlety,                       |   so we may see their subtlety,                    5
     7  And let there always be being,                       |  And let there always be being,                    6
     8   so we may see their outcome.                        |   so we may see their outcome.                     7
     9  The two are the same,                                |  The two are the same,                             8
    10  But after they are produced,                         |  But after they are produced,                      9
    11   they have different names.                          |   they have different names.                      10
                                                             |  They both may be called deep and profound.       11
                                                             |  Deeper and more profound,                        12
                                                             |  The door of all subtleties!                      13

        End of changes. 1 change block.
        3 lines changed or deleted                           |  5 lines changed or added

shtmldiff.py is written in Python 3. We assume here that you have Python 3 installed on your system. If not, you can download it from Python Releases for Windows.

Default behaviour

The default behaviour of shtmldiff.py is to create an .html file in the current working directory with a name like file2-from-1.diff.html and to open it automatically in the default browser. Line numbering is turned on by default (because that's the way we like it) and tabs are expanded to 8 characters. All these behaviours can be changed using command-line options. For more information use the help option:

    shtmldiff.py --help
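As an aside on the tab setting mentioned above, the following is a minimal sketch of what expanding tabs to a tabstop means. It uses Python's built-in str.expandtabs and is only an illustration; the helper name expand_line is hypothetical and this is not necessarily how shtmldiff.py does it internally.

```python
def expand_line(line: str, tabstop: int = 8) -> str:
    """Replace each tab with spaces up to the next multiple of tabstop.

    Hypothetical helper for illustration; not taken from shtmldiff.py.
    """
    return line.expandtabs(tabstop)


# "ab" occupies columns 0-1, so the tab pads out to column 8 (or column 4).
print(repr(expand_line("ab\tcd")))     # 'ab      cd'
print(repr(expand_line("ab\tcd", 4)))  # 'ab  cd'
```

A smaller tabstop (the -t option) gives narrower columns, which can help when the two halves of the display are wide.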

Syntax

C:\Python38\python.exe C:/!Data/Python/shtmldiff/shtmldiff.py --help
usage: shtmldiff.py [-h] [-v] [-L] [-w WIDTH] [+l] [-l] [+b] [-b] [-t TABSTOP]
                    [-i] [-s] [-a] [-f FILENAME] [--stdout] [-d]
                    file1 file2

A side-by-side html diff program

positional arguments:
  file1
  file2

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -L, --license         show licence and exit
  -w WIDTH, --width WIDTH
                        set maximum width in characters for each half of
                        display (default=no max)
  +l, --linenum         show linenumbers for each line (default)
  -l, --no-linenum      do not show linenumbers for each line
  +b, --browse          show html output in browser (default)
  -b, --no-browse       do not show html output in browser
  -t TABSTOP, --tabstop TABSTOP
                        set tabstop for expanding tabs to spaces (default=8)
  -i, --ignore-case     consider upper- and lower-case to be the same
  -s, --ignore-space-change
                        ignore changes in the amount of white space
  -a, --ignore-all-space
                        ignore all white space
  -f FILENAME, --filename FILENAME
                        specify filename for html output
                        (default=`file2-from-1.diff.html`)
  --stdout              send output to stdout instead of to a file
  -d                    print debugging output to stderr, `-dd` show more

If file1 or file2 is `-`, read standard input.

Process finished with exit code 0
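
The options listed above can also be passed when calling the script from another Python program. The following is a rough sketch only: the helper names build_command and run_diff, the output name lao-tzu.html and the assumption that shtmldiff.py is in the current directory are ours for illustration. The flags themselves are taken from the help text above.

```python
import subprocess
import sys


def build_command(file1, file2, script="shtmldiff.py",
                  output=None, width=None, browse=True, linenum=True):
    """Build an argument list for shtmldiff.py (hypothetical helper)."""
    cmd = [sys.executable, script]
    if output is not None:
        cmd += ["-f", output]        # -f FILENAME: name of the html output
    if width is not None:
        cmd += ["-w", str(width)]    # -w WIDTH: max width of each half
    if not browse:
        cmd.append("--no-browse")    # do not open the browser
    if not linenum:
        cmd.append("--no-linenum")   # do not show line numbers
    cmd += [file1, file2]
    return cmd


def run_diff(*args, **kwargs):
    """Run the script; assumes diff.exe and wdiff.exe can be found."""
    return subprocess.run(build_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_diff(...) to execute it.
    print(build_command("lao.txt", "tzu.txt",
                        output="lao-tzu.html", width=60, browse=False))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.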

A Larger Example

Here's an example of running shtmldiff to show changes in a larger file. This is a comparison of the original wdiff.c source file with our new one adapted for Windows.

Rfcdiff Tool

We really, really like the rfcdiff tool used at the IETF Rfcdiff Web Service to compare documents in a side-by-side html diff format.

Unfortunately for Windows users, the original rfcdiff tool (available here) only works on Linux platforms because it is written in bash/awk and relies on the GNU diffutils 'diff' and 'wdiff'. You could upload your files to the IETF website and rather messily copy the resulting html. We wanted a version to use as a stand-alone program on Windows.

Yes, there are other similar programs that do this but we prefer the rfcdiff style and the way it shows differences by word rather than individual characters.

So we wrote our own version for Windows in Python, using the style and logic from Henrik Levkowetz's original code, and released it under the same GNU licence. It's a simplified version that merely compares two text files without any RFC-specific preprocessing.

Diffutils dependencies

The rfcdiff program also requires the Linux-specific GNU utilities wdiff and diff which are not standard on Windows. There is a Windows version of diff (a copy is provided below) but no version of wdiff that we could find. So we created a wdiff.exe program that works on Windows, also available below. For more information, see our wdiff for Windows page.
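
To illustrate what comparing by word rather than by character means, here is a minimal sketch using Python's standard difflib module. It is not the method used by wdiff or shtmldiff.py; the function name word_diff is hypothetical, and the [-...-] and {+...+} markers are modelled on wdiff's default output.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Mark deleted words as [-...-] and inserted words as {+...+}."""
    a, b = old.split(), new.split()
    out = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


print(word_diff("The Named is the mother of all things.",
                "The named is the mother of all things."))
# The [-Named-] {+named+} is the mother of all things.
```

The whole word is flagged even though only one letter differs, which is the behaviour we prefer when reading changes in prose.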

Downloads

  1. shtmldiff complete setup script including diffutils executables: shtmldiff-1.3.zip (113 kB).
  2. shtmldiff.py source only: shtmldiff-1.3.src.zip (7.4 kB).
  3. wdiff.exe and diff.exe executables only: wdiff-0.5.1W.bin.zip (106 kB).
  4. test files: shtmldiff-tests.zip (0.9 kB): lao.txt and tzu.txt.

Old Python 2.7 version

Installation

The complete setup script (Number 1 above) will install everything including the two diffutils executables. To install:

  1. Download the shtmldiff-X.X.zip file (where "X.X" represents the version, e.g. "1.3")
  2. Open a command-line window in the same directory
  3. Type pip install shtmldiff-X.X.zip

This copies the Python script and the required diff executables into \PythonXX\Scripts. If you don't want this, or the setup program doesn't work for you, download the executables separately (Number 3 above) and copy wdiff.exe and diff.exe to a directory on your Windows PATH where shtmldiff.py can find them. If the executables cannot be found, you should get an error like

    OSError: 'wdiff' failed: [Error 2] The system cannot find the file specified
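
If you see this error, a quick way to check whether the two executables are visible is the sketch below, which uses shutil.which from the Python standard library. The helper name find_missing is hypothetical and is not part of shtmldiff.py.

```python
import shutil


def find_missing(tools=("diff", "wdiff")):
    """Return the names of tools that cannot be found on the PATH."""
    # On Windows, shutil.which also tries the extensions in PATHEXT,
    # so "wdiff" will match wdiff.exe.
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("diff and wdiff were both found on PATH.")
```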

To uninstall, type pip uninstall shtmldiff

Changes in v1.3

Can I use the Python code on a Linux system?

You may need to make a couple of changes. Some Windows-specific techniques are used.

Of course, you could just download and use rfcdiff directly.

Contact us

To comment on this page or to contact us, please send us a message.

This page first published 12 June 2016. Last updated 21 January 2021.