Maintainer: Daniel S. Wilkerson
Developer: Scott McPeak
Documenter: Simon Goldsmith

Summary

Delta assists you in minimizing "interesting" files subject to a test of their interestingness. A common such situation is when attempting to isolate a small failure-inducing substring of a large input that causes your program to exhibit a bug.

Our implementation is based on the Delta Debugging algorithm. Andreas wrote a book "Why Programs Fail" about debugging programs.

This work was supported by professors Alex Aiken and George Necula and was done at UC Berkeley.

I presented Delta at CodeCon 2006. The slides are here.

Releases

Feel free to just get the current Subversion repository version as a guest user.

Introduction

The best way to understand how to use delta is with an example of its usage. Below is one example helpfully written up for me by Simon Goldsmith; read it first. For those wanting more, I also wrote a more detailed and harder to read document describing each tool: Using Delta.

Note that what follows is an example of using delta to minimize an input file to a program that reads programs, much as a compiler does. Note two features of file minimization that are present in the example.

Do a controlled experiment.

Below we don't just minimize a file that causes Oink to produce an error message, we minimize a file that causes gcc to accept AND oink to reject in a specific way. That is, the test delta does is a controlled experiment, where gcc is the control. Ignoring this aspect of the problem seems to be a frequent mistake of first time users.

Exploit nested structure.

One may minimize files of simpler syntax than C++ but really all files are interesting in the first place because they are in some language or another. Some simple configuration files are literally just a list of lines but most languages have some nested structure. Multidelta filters the input through the topformflat utility (included) to suppress any newlines past a particular nesting depth; this "explains" the nesting structure to the otherwise line-oriented delta utility (a brilliantly simple idea of Scott McPeak's). If your input file language has no nesting structure, you can hack on multidelta to remove the filtration through topformflat or just use the raw delta program. If your language has a different nesting structure than C/C++, you can write your own multidelta and substitute it. A simple flex program should suffice; it need not be terribly accurate for delta to do well.

Example use of delta

Simon Goldsmith 8 April / 12 Sept, 2005.

Note that this example is edited for simplicity from the raw output; we sincerely hope we did not introduce any bugs.

Setup

(1) Make a new directory and copy the file to be minimized there.

% mkdir deltaexample
% cd deltaexample/
% cp ../nsCSSDataBlock-23801-1112390043.cpp.g.ii ./foo.ii
% chmod +w foo.ii

(2) (optional) Put a read-only backup copy of the file in, say, orig/ .

% mkdir orig
% cp foo.ii orig/
% chmod -R a-w orig

Define interestingness

(3) Write a script (do not call it 'test' as that is a system utility program) to test the interestingness of the file, as we do below.

Note that for this example, "interesting" means the file passes gcc but fails oink with a particular error message. That is, if 1) gcc accepts, and 2) oink rejects with the desired error message, then we return zero (meaning "interesting"). If anything else happens then we return a nonzero exit code (meaning "not interesting")

Some reminders about shell: a zero exit code means "true"; so for the purposes of &&, a zero exit code means "keep going" and grep returns 0 if it matches, nonzero if not. We redirect output to /dev/null because the output of delta is noisy. Be careful of quoting hell: notice that we've used '.' to match characters like single quote.

% cat > test1.sh
#!/bin/bash

FILE=foo.ii
OINK=/home/simon/oink_all/oink/oink
GCC=/usr/bin/gcc

$GCC -c $FILE -o /dev/null &> /dev/null && $OINK $FILE | \
  grep 'error: cannot convert argument type .class .* const &. ' \
  'to receiver parameter type' \
  &> /dev/null
^D

(4) Make the script executable and run it on the file -- make sure it returns 0. Optionally turn off the redirection to /dev/null temporarily to check the error message that is being found by the grep.

% chmod +x test1.sh
% ./test1.sh foo.ii ; echo $?
0

Minimize automatically

(5) Run multidelta with the script on the file several times at, say, levels 0 0 1 1 2 2 10.

% multidelta -level=0 ./test1.sh foo.ii
(check email)
% multidelta -level=0 ./test1.sh foo.ii
(read slashdot)
% multidelta -level=1 ./test1.sh foo.ii
% multidelta -level=1 ./test1.sh foo.ii
% multidelta -level=2 ./test1.sh foo.ii
% multidelta -level=2 ./test1.sh foo.ii
% multidelta -level=10 ./test1.sh foo.ii
% multidelta -level=10 ./test1.sh foo.ii

(6) The input file will be modified in place and you should be left with something smaller.

[simon@otter][deltaexample]$ ls -l
total 116
-rw-r--r--  1 simon simon  8451 Sep 12 17:10 foo.ii
-rw-r--r--  1 simon simon  8948 Sep 12 17:10 foo.ii.bak
-rw-r--r--  1 simon simon  8451 Sep 12 17:10 foo.ii.ok
-rw-r--r--  1 simon simon 57739 Sep 12 17:10 log
-rw-r--r--  1 simon simon  2752 Sep 12 17:10 multidelta.log
dr-xr-xr-x  2 simon simon  4096 Sep 12 17:16 orig/
-rwxr-xr-x  1 simon simon   385 Sep 12 16:36 test1.sh*
-rw-r--r--  1 simon simon    11 Sep 12 16:02 test1.sh~

[simon@otter][deltaexample]$ ls -l orig/
total 552
-r--r--r--  1 simon simon 558970 Sep 12 16:00 foo.ii

Minimize further by hand

(7) Hack on foo.ii by hand, re-running test1.sh each time to check it is still "interesting". Sometimes it helps to hack on foo.ii a little to get delta unstuck and then rerun delta again. You might want to run indent as well whenever you stop to look at foo.ii as topformflat makes a mess.

Final file:

class A {};
int main() {
  const A *val;
  val->~A ();
}

Note that the original file was about 560 KB!

Endorsements

This section is just for fun because I've never had a tool so widely used before.

Subject: Thanks for Delta
   Date: Thu, 21 Jul 2005 21:13:20 -0700
   From: Flash Sheridan
     To: Daniel S. Wilkerson

This is just a quick thank-you note for Delta.  Andrew Pinski pointed
me towards it after filing a GCC bug with a very long source file; it
immediately reduced a different bug file from 16K lines to ten (GCC
bug 22604).  Oddly enough, it initially found a different bug (22603),
since I'd only specified "internal compiler error", not "segmentation
fault".  I might not have been able to file either of these bugs
without Delta, since the code was proprietary, but the two
Delta-reduced files were small enough to make public.


Subject: Re: Thanks for Delta Date: Tue, 13 Sep 2005 10:56:10 -0700 From: Flash Sheridan To: Daniel S. Wilkerson Delta has become even more valuable since my initial thank-you note. I'm not sure it's helped with all of the GCC bugs I've been filing (I've been tracking them at http://pobox.com/~flash/FlashsOpenSourceBugReports.html), but I couldn't have filed most of them without Delta. Typically I find a bug when GCC is compiling a large, confidential file, which I couldn't post to Bugzilla. Delta has always been able to find a radically smaller file, which I have been able to attach to my bug report.
Date: Sun, 23 Oct 2005 22:01:42 +0200 (CEST) From: Richard Guenther To: Daniel S. Wilkerson > > > Thanks for your interest in Delta. I would be interested to > > > hear more about what you are doing with it. If it is something > > > I can put in the endoresements section ("Delta saved my > > > daugher's life!") that would be great. > > > > Well, delta is saving a lot of gcc developers life ;) I would > > guess 1 of 3 bugs sumitted to the gcc bugzilla get their testcase > > reduced using delta. > > Holy moly! Can I quote that publicly such as on my web page? Yes - a little bit more accurate would be to say we're using delta to reduce all testcases from the gcc bugzilla in case they get entered unreduced.

Delta (both the algorithm and this tool) has been used in the Cal Berkeley (CS169) and Stanford (CS295) in software engineering classes.

Subject: Re: delta debugging on instructional machines?
   Date: Mon, 24 Oct 2005 22:55:08 -0700
   From: Gilad Arnold 
     To: Daniel S. Wilkerson

We've just assigned a delta-related homework to the students today,
. . .  We do hope that we can actually convince the students to use
delta in the course of their project development, but time will
tell. And thanks again!


Subject: Re: use of delta in your class Date: Mon, 24 Oct 2005 13:57:22 -0700 From: Alex Aiken To: Daniel S. Wilkerson Yes, I gave them a homework assignment for CS295 using delta. Feedback was positive but unquantified.