Hypertree Decomposition Research

Starting in late 2002, I made various attempts to tackle the problem of constructing hypertree decompositions for constraint satisfaction problems. This work has been interleaved with investigations in linear programming methods which explains my slow progress to date. Despite the interruptions, some achievements have been made. I have identified redundant data structures in the original algorithm opt-k-decomp. From these redundancies I have developed an algorithm red-k-decomp which provides significant improvement.

The following papers describe how redundant data structures (redundant k-vertices) are identified, and the steps to remove them:

DetailsFiles
Reducing Redundancy in the Hypertree Decomposition Scheme (Peter Harvey, Aditya Ghose)
paper from ICTAI03 (8 pages) .ps  .pdf
 
Fast Hypertree Decompositions (Peter Harvey, Aditya Ghose)
work in progress, first available August 2003 (18+ pages) .ps  .pdf

I have also made available a presentation prepared for ICTAI03. This includes a 0.7MB PDF file for presentation, a 0.21MB gzipped PostScript file with slides and presentation notes. I have taken the slightly unconventional step of creating both 7MB, MP3 file and a 10MB AVI file which includes both audio and slides. Each file runs for approximately 23 minutes. This presentation should hopefully provide a good introduction to hypertree decompositions and our optimisations.

Implementation

I have conducted numerous experiments to compare opt-k-decomp and red-k-decomp. These experiments have been conducted using my implementation of opt-k-decomp and red-k-decomp. To allow others to verify the experiments presented in the above papers, I have made available here executables for Linux and Windows. Both are command line utilities which can be run in interactive or non-interactive mode.

DetailsDownloadSize
Implementations of opt-k-decomp and red-k-decomp
for Microsoft Windows (compiled with MinGW and g++ 2.95) .zip 184KB
for Linux (compiled with g++ 2.95) .tar 29KB

Of course, research has progressed further since that time. I have continued to identify more classes of redundant vertices, providing some impressive results. For example, on a random problem with 25 variables, 35 binary constraints and 10 ternary constraints the original red-decomp (above) removed a total of 32949 vertices, leaving 131271 vertices and 164533 problem nodes from which to construct the decomposition graph. The latest red-decomp (below) removed 148539 vertices, leaving just 15681 vertices and 18923 problem nodes. The resulting speed increase is between 50 and 100.

DetailsDownloadSize
Implementations of opt-k-decomp and latest red-k-decomp
for Microsoft Windows (compiled with MinGW and g++ 2.95) .zip 192KB
for Linux (compiled with g++ 2.95) .tar 31KB

The above archives each contain two files: red-decomp and opt-decomp. Each have the same usage, with the only difference being that the former is an implementation of red-k-decomp (with varying levels of sophistication in their techniques), and the latter an implementation of opt-k-decomp. As they use identical (and, in my opinion, very efficient) data structures they should provide a good pair of implementations for any experimental comparisons of the two algorithms.

Input and Output

Each program uses an input file of the form defined below:

A name consists of any string of characters excluding the delimiters '(),.' and whitespace characters (tab, space, newline). Otherwise, whitespace is ignored.

We can construct a sample input file from the primal graph of a CSP (shown). A correctly formatted input file for this CSP would be: AB(a,b), BD(b,d), DE(e,d), CD(c,d), BC(b,c), CF(c,f).

Note that there needn't be any relationship between constraint names and their list of variables. Also, a constraint name needn't be unique (ie. two constraints named 'Frank' with different variables can be placed in the same input file). This is useful if you wish to use the constraint names to declare the constraint type, for example 'linear_eqn' or 'all_diff'.

Output is in either plain text (when printing statistics) or in dot format (when printing trees or graphs). Examples of the statistics can be found in Usage below. An example hypertree decomposition can be constructed by using the following line in Linux, and feeding the output to dot:

$ echo "AB(a,b),BD(b,d),DE(e,d),CD(c,d),BC(b,c),CF(c,f)." | ./red-decomp -r -d 2 -e
digraph decomp { nslimit=100.0; mclimit=100.0;
n0x8051648 [shape=box,label="{a,b,d,c} {AB,CD}"]
n0x8051bc8 [shape=box,label="{c,f} {CF}"]
n0x8051648 -> n0x8051bc8;
n0x8051b80 [shape=box,label="{d,e} {DE}"]
n0x8051648 -> n0x8051b80;
}

Usage

Each program has an interactive mode in which commands are executed. Alternatively, all commands can be provided as arguments, allowing for easy use when running experiments. The commands are:

build s v c2 c3 ... cn
Generates a problem with random seed s, v variables, c2 extra binary constraints, c3 extra ternary, constraints, etc. If no number is supplied for a constraint type, it is assumed to be zero. This command supersedes generate.
generate v c s
Generates a problem with v variables (and enough constraints to ensure the problem is connected), c extra constraints, and a random seed s. Default values for v and c are 15 and 10 respectively. If s is supplied it sets the random seed value. This command is superseded by build. Any command line "generate v c s" can be rewritten "build s v c".
write filename
Writes a problem description to a file. If no filename is given then stdout is used.
read filename
Reads a problem description from a file. If no filename is given then stdin is used.
decompose w v
Constructs a hypertree decomposition graph for the current problem, with maximum width w. If w is not supplied it defaults to 3. If v is supplied it sets the verbosity level.
extract filename filename
Extracts a hypertree decomposition from a graph (generated by using the decompose command above). If the first filename is supplied, one of the optimum hypertree decomposition is written to that file. If the second filename is supplied, all the optimum hypertree decompositions are written to the second file (as a subgraph of the entire decomposition graph).
statistics filename
Writes statistics about the call to decomposite, like problem sizes, CPU times, etc. If a filename is given then the information is written to the file.
test filename
Writes the statistics in a really brief form useful for when conducintg experiment.

Each command can be used just by supplying the first character of the command (for short). The following two examples use red-decomp to decompose a randomly generated problem instance and print statistics.

Non-interactive mode

$ ./red-decomp -g 20 15 5 -d 3 -s
PROBLEM
         variables 20
       constraints 34
 
COUNTS
redundant-vertices 537
  useless-vertices 841
   branch-vertices 3971
     leaf-vertices 1230
     problem-nodes 9902
      decomp-nodes 172857
 
TIMES
  init-connections 0
     init-vertices 20
        init-graph 20
       build-graph 23430
        calc-width 110
 
DECOMPOSITION
   hypertree-width 3

Interactive mode

$ ./red-decomp

Interactive mode. Type 'help' for a list of available commands
--------------------------------------------------------------
 
decomp> generate 20 15 5
 
decomp> decompose 3
 
decomp> statistics
 
PROBLEM
         variables 20
       constraints 34
 
COUNTS
redundant-vertices 537
  useless-vertices 841
   branch-vertices 3971
     leaf-vertices 1230
     problem-nodes 9902
      decomp-nodes 172857
 
TIMES
  init-connections 0
     init-vertices 20
        init-graph 20
       build-graph 23300
        calc-width 100
 
DECOMPOSITION
   hypertree-width 3

decomp> ^D

Experiments

As a simple comparison of opt-k-decomp and red-k-decomp, consider the following results for random binary problems with 20 variables and increasing numbers of constraints. Note that the PDF versions of the diagrams are all 20KB or less, and the EPS versions are all 44KB or less.

Results for opt-decomp

The first image shows the CPU times taken by opt-decomp.

This image is also available in EPS and PDF.

The second image shows those times divided by the worst-case compexity function for opt-k-decomp. As you can see opt-decomp does in fact take this amount of time to execute, indicated by the convergence of this plot to a fixed, finite number.

This image is also available in EPS and PDF.

The third image shows those times divided by the best-case complexity function for red-k-decomp. Obviously, opt-k-decomp does not have an actual or average complexity function near the best-case complexity of red-k-decomp.

This image is also available in EPS and PDF.

Results for red-decomp

The first image shows the CPU times taken by red-decomp.

This image is also available in EPS and PDF.

The second image shows those times divided by the worst-case compexity function for red-k-decomp (same as that for opt-k-decomp). As you can see red-decomp seems to take on average less than this amount of time to execute, indicated by the convergence of this plot to zero.

This image is also available in EPS and PDF.

The third image shows those times divided by the best-case complexity function for red-k-decomp. It appears that red-k-decomp has an average complexity function near it's best-case complexity function, as the plot does not diverge.

This image is also available in EPS and PDF.

Overall comparison

This final image compares the CPU times of opt-decomp to those of red-decomp by simply dividing the former by the latter. Notice that the scale on the Y axis is logarithmic. :)

This image is also available in EPS and PDF.