Generic workflow

This section describes only the most basic and common use of the benchmark-tool. For more information on each subcommand and its options, see the corresponding section.

Generating and running benchmarks

After installation you can create the required directory structure for the benchmark tool using the init subcommand:

btool init

Afterwards you should see the following folders:

programs/: Place solver/tool executables here
resultparsers/: Place custom resultparsers here
runscripts/: Contains example runscripts
templates/: Contains example script templates

If you want to create a new resultparser or modify the provided one, use btool init --resultparser-template, which also creates a copy of the clasp resultparser as rp_tmp.py.

To start using the benchmark-tool, create a new [runscript] or modify an existing one and set up your system-under-test (SUT) inside the programs/ folder. If your job uses the template option --single and your SUT is a shell script, make sure it launches the actual program with exec. The name of your SUT should match the <system>-<version> given in the runscript. An example script for basic clingo can be found inside the programs/ folder as clingo-latest. Make sure your SUT is executable, i.e., has the correct file permissions.
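As an illustration, the following sketch creates such a wrapper script. The SUT name mysolver-1.0 and the variables SOLVER_BIN and PROGRAMS_DIR are hypothetical and not part of the benchmark-tool; /bin/echo is only a stand-in so that the sketch runs without a solver installed. It is not the shipped clingo-latest script.

```shell
#!/bin/sh
# Sketch: create a wrapper for a hypothetical SUT "mysolver", version "1.0".
# SOLVER_BIN is an assumption; point it at your real solver binary.
SOLVER_BIN="${SOLVER_BIN:-/bin/echo}"
PROGRAMS_DIR="${PROGRAMS_DIR:-programs}"
mkdir -p "$PROGRAMS_DIR"

# Unquoted heredoc: $SOLVER_BIN is expanded now, \$@ stays literal in the file.
cat > "$PROGRAMS_DIR/mysolver-1.0" <<EOF
#!/bin/sh
# exec replaces this shell with the solver process, so signals sent to
# the script are delivered to the solver itself.
exec "$SOLVER_BIN" "\$@"
EOF

# The SUT must be executable.
chmod +x "$PROGRAMS_DIR/mysolver-1.0"
```

The file name has to match the <system>-<version> pair used in your runscript, so adjust mysolver-1.0 accordingly.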

Info

When using --single, avoid pipes (|) in your scripts, or make sure signals are properly propagated between processes. In general, the use of pipes is strongly discouraged.

Check that all files/folders referenced in the runscript exist. These are most likely your benchmark instances/encodings, templates and SUT. Also make sure that the runlim executable is inside the programs/ folder and that any custom resultparser is placed inside the resultparsers/ folder. Once everything is set up, you can generate your benchmarks using the gen subcommand:
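Part of this checklist can be automated. The helper below is a minimal sketch, not part of the benchmark-tool: check_layout is a hypothetical function that only verifies that runlim and the given SUTs exist as executable files inside programs/. It does not inspect the runscript itself.

```shell
#!/bin/sh
# Sketch of a pre-flight check (hypothetical helper, not provided by btool).
# Usage: check_layout DIR [SUT_NAME...]
# Returns non-zero if runlim or any listed SUT is missing or not executable.
check_layout() {
    dir=$1
    shift
    status=0
    # Check runlim first, then every SUT name passed as an argument.
    for name in runlim "$@"; do
        path="$dir/programs/$name"
        if [ ! -f "$path" ] || [ ! -x "$path" ]; then
            echo "missing or not executable: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, check_layout . clingo-latest reports problems for the working directory before you call the gen subcommand.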

btool gen <runscript.xml>

Afterwards, you should see an output folder with your specified name, which contains all your benchmarks. Check the [runscript] section to see how the benchmarks are structured.

You can start your benchmarks by running the start.py script for sequential jobs or the start.sh script for distributed jobs. Both types of scripts can be found inside the <output>/<project>/<machine>/ folder. Alternatively you can use the dispatcher btool run-dist to schedule your distributed jobs.
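To see which start scripts were generated, you can search the output folder. The function below is a small sketch; list_start_scripts is a hypothetical helper, and the only assumption it makes is the start.py and start.sh file names mentioned above.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print every generated start script
# below the given output folder, one path per line.
list_start_scripts() {
    find "$1" -type f \( -name start.py -o -name start.sh \) | sort
}
```

Call it with the output folder you specified in the runscript, e.g. list_start_scripts <output>.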

After running all your benchmarks you can continue to the evaluation step. Optionally you can use the verify subcommand to check for runlim errors inside your results.

Info

At the moment all projects defined inside a runscript have to be run before an evaluation is possible.

Evaluating the results

To evaluate your benchmarks and collect the results use the eval subcommand:

btool eval <runscript.xml> > <results.xml>

This newly created .xml file can then be used as input for the conv subcommand to generate an .xlsx file and, optionally, an .ipynb Jupyter notebook. By default, only the time and timeout measures are displayed. Further measures can be selected using the -m option.

btool conv -o <out.xlsx> <results.xml>
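The generation and evaluation steps can be collected in a small script. This is a sketch under stated assumptions: the file names in RUNSCRIPT, RESULTS and XLSX are hypothetical defaults, and the BTOOL variable is an illustrative convenience that lets you print the commands instead of running them. Only the gen, eval and conv invocations shown in this section are used.

```shell
#!/bin/sh
# Sketch of the workflow; all file names below are hypothetical defaults.
RUNSCRIPT="${RUNSCRIPT:-runscripts/my-runscript.xml}"
RESULTS="${RESULTS:-results.xml}"
XLSX="${XLSX:-results.xlsx}"
# Override to preview the commands, e.g. BTOOL="echo btool".
BTOOL="${BTOOL:-btool}"

# Step 1: generate the benchmarks (run "btool init" once beforehand).
workflow_generate() {
    $BTOOL gen "$RUNSCRIPT"
}

# Step 2 is manual: run the generated start.py or start.sh scripts.

# Step 3: collect the results and convert them to a spreadsheet.
workflow_evaluate() {
    $BTOOL eval "$RUNSCRIPT" > "$RESULTS" &&
    $BTOOL conv -o "$XLSX" "$RESULTS"
}
```

Call workflow_generate, run the benchmarks, and then call workflow_evaluate. Because conv only runs if eval succeeds, a failed evaluation does not produce a spreadsheet from incomplete results.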