Generic workflow
This section only describes the most basic and common use of the benchmark-tool. For more information regarding each subcommand and their options check the corresponding section.
Generating and running benchmarks¶
After installation you can create the required directory structure for the benchmark tool using the init subcommand:
btool init
Afterwards you should see the following folders:
programs/: Place solver/tool executables hereresultparsers/: Place custom resultparsers hererunscripts/: Contains example runscriptstemplates/: Contains example script templates
If you want to create a new resultparser or want to modify the provided one, you can use
btool init --resultparser-template to also create a copy of the clasp resultparser as rp_tmp.py.
To start using the benchmark-tool create a new [runscript] or modify an existing one and
setup you system-under-test (SUT) inside the programs/ folder. If your job uses the template
option --single and your SUT is a shell script make sure to use exec. The name of your SUT
should match the <system>-<version> given in the runscript. An example script for basic clingo
can be found inside the programs/ folder as clingo-latest.
Make sure your SUT is executable i.e., has the correct file permission.
Info
When using --single avoid using pipes | in your scripts or make sure signals are properly
propagated between processes. In general the use of pipes is strongly discouraged.
Check that all files/folders referenced in the runscript exist. These are most likely your benchmark
instances/encodings, templates and SUT. Also make sure, that the the runlim executable is inside the
programs/ folder and any custom resultparser is placed inside the resultparsers/ folder.
If everything is setup you can generate you benchmarks using the gen subcommand:
btool gen <runscript.xml>
Afterwards, you should see an output folder with your specified name, which contains all your benchmarks. Check the [runscript] section to see how the benchmarks are structured.
You can start your benchmarks by running the start.py script for sequential jobs or the start.sh script
for distributed jobs. Both types of scripts can be found inside the <output>/<project>/<machine>/ folder.
Alternatively you can use the dispatcher btool run-dist to schedule your distributed jobs.
After running all your benchmarks you can continue to the evaluation step. Optionally you can use the verify subcommand to check for runlim errors inside your results.
Info
At the moment all projects defined inside a runscript have to be run before an evaluation is possible.
Evaluating the results¶
To evaluate your benchmarks and collect the results use the eval subcommand:
btool eval <runscript.xml> > <results.xml>
This newly created .xml file can then be used as input for the conv subcommand to generate an .xlsx file and optionally an .ipynb jupyter notebook. By default only the time and timeout measures are displayed. Further measures can be selected using the -m option.
btool conv -o <out.xlsx> <result.xml>