benchadapt API documentation

class benchadapt.BenchmarkResult(run_name=None, run_id=None, run_tags=<factory>, batch_id=<factory>, run_reason=None, timestamp=<factory>, stats=None, error=None, validation=None, tags=<factory>, info=<factory>, optional_benchmark_info=None, machine_info=<factory>, cluster_info=None, context=<factory>, github=<factory>)

Bases: object

A dataclass for containing results from running a benchmark.

run_name

Name for the run. Current convention is f"{run_reason}: {github['commit']}". If missing and github["commit"] exists, run_name will be populated according to that pattern (even if run_reason is None); otherwise it will remain None. Users should not set this manually unless they want to identify runs in some other fashion. Benchmark name should be specified in tags["name"].

This argument is deprecated and will be removed in the future. Any given name here will be added to run_tags under the “name” key on the server side.

Type:

str

run_id

ID for the run; should be consistent for all results of the run. Should not normally be set manually; adapters will handle this for you.

Type:

str

run_tags

An optional mapping of arbitrary keys and values that describe the CI run. These are used to group and filter runs in the UI and API. Do not include run_reason here; provide it via the run_reason field instead.

The Conbench UI and API assume that all benchmark results with the same run_id share the same run_tags. There is no technical enforcement of this on the server side, so some behavior may not work as intended if this assumption is broken by the client.

Type:

Dict[str, str]
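
For illustration only, a hypothetical run_tags mapping (every key and value here is an arbitrary choice, not part of the schema):

    run_tags = {
        "branch": "feature/faster-joins",  # hypothetical key and value
        "pipeline": "nightly-arm64",       # hypothetical key and value
    }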

batch_id

ID string for the batch

Type:

str

run_reason

Reason for run (e.g. commit, PR, merge, nightly). In many cases will be set at runtime via an adapter’s result_fields_override init parameter; should not usually be set in _transform_results().

Type:

str

timestamp

Timestamp of call, in ISO format

Type:

str

stats

Measurement data and summary statistics. If data (a list of metric values), unit (for that metric, e.g. "s"), and iterations (replications for microbenchmarks) are specified, summary statistics will be filled in server-side.

Type:

Dict[str, Any]
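
A minimal sketch of a stats payload using the fields described above (values are illustrative; the full schema is defined by the Conbench API):

    # Raw measurements plus their unit and iteration count; summary
    # statistics are computed server-side from these.
    stats = {
        "data": [0.91, 0.92, 0.90],  # one value per iteration
        "unit": "s",
        "iterations": 3,
    }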

error

A dict containing information about errors raised when running the benchmark. Any schema is acceptable; it might contain stderr output, a traceback, etc.

Type:

Dict[str, Any]

validation

Benchmark results validation metadata (e.g., errors, validation types).

Type:

Dict[str, Any]

tags

Many things. Must include a name element (i.e. the name corresponding to the benchmark code); often includes parameters either as separate keys or as a string in a params key. If suite subdivisions exist, use a suite tag. Determines how results are grouped into a history for comparisons over time.

Type:

Dict[str, Any]
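
A hypothetical tags dict for a parameterized benchmark in a suite (only the name key is required; the others are illustrative):

    tags = {
        "name": "file-read",          # name corresponding to the benchmark code
        "suite": "io",                # optional suite subdivision
        "params": "compression=lz4",  # parameters as a string, or as separate keys
    }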

info

Things like arrow_version, arrow_compiler_id, arrow_compiler_version, benchmark_language_version, and arrow_version_r

Type:

Dict[str, Any]

optional_benchmark_info

Optional information about benchmark results (e.g. telemetry links, log links) that is unique to each benchmark run but is not reasonably expected to impact benchmark performance. Helpful for attaching debugging information, links, and other context to a benchmark (free-form JSON)

Type:

Dict[str, Any]

machine_info

For benchmarks run on a single node, information about the machine, e.g. OS, architecture, etc. Auto-populated if cluster_info not set. If host name should not be detected with platform.node() (e.g. because a consistent name is needed for CI or cloud runners), it can be overridden with the CONBENCH_MACHINE_INFO_NAME environment variable.

Type:

Dict[str, Any]
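
For example, a CI job could pin the reported host name before results are created; the value here is purely illustrative:

    import os

    # Give ephemeral CI runners a consistent machine identity instead of
    # whatever platform.node() would detect.
    os.environ["CONBENCH_MACHINE_INFO_NAME"] = "ci-runner-ubuntu-22.04"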

cluster_info

For benchmarks run on a cluster, information about the cluster

Type:

Dict[str, Any]

context

Should include benchmark_language and other relevant metadata like compiler flags

Type:

Dict[str, Any]

github

A dictionary containing GitHub-flavored commit information.

Allowed values: no value, a special dictionary.

Not passing an argument upon dataclass construction results in inspection of the environment variables CONBENCH_PROJECT_REPOSITORY, CONBENCH_PROJECT_COMMIT, and CONBENCH_PROJECT_PR_NUMBER, which are used as the special dictionary’s repository, commit, and pr_number keys respectively, if they are set. These are defined below.

If passed a dictionary, it must have at least the repository key, which must be a string, in the format https://github.com/<org>/<repo>.

If the benchmark was run on a reproducible commit (from the default branch or a pull request commit), it must also have the commit key, which must be a string of the full commit hash. Not associating a benchmark result with a commit hash has special, limited purpose (pre-merge benchmarks, testing). It generally means that this benchmark result will not be considered for time series analysis along a commit tree.

If the benchmark was run against the default branch, do not specify additional keys.

If it was run on a GitHub pull request branch, you should provide pr_number.

If it was run on a non-default branch and a non-PR commit, you may supply the branch name via the branch key, set to a value of the format org:branch.

For more details, consult the Conbench HTTP API specification.

Type:

Dict[str, Any]
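
Putting the rules above together, two hypothetical github dictionaries (repository URL, commit hash, and PR number are placeholders):

    # Result from a commit on the default branch: repository and commit only.
    github_default_branch = {
        "repository": "https://github.com/<org>/<repo>",
        "commit": "0123456789abcdef0123456789abcdef01234567",  # placeholder hash
    }

    # Result from a pull request commit: additionally pass pr_number.
    github_pr = {
        "repository": "https://github.com/<org>/<repo>",
        "commit": "0123456789abcdef0123456789abcdef01234567",  # placeholder hash
        "pr_number": 123,                                       # placeholder PR number
    }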

Notes

Fields one of which must be supplied:

  • machine_info (generated by default) xor cluster_info

  • stats or error

Fields which should generally not be specified directly on instantiation that will be set later for the run:

  • run_name

  • run_id

  • run_reason

Fields without comprehensive defaults which should be specified directly:

  • stats (and/or error)

  • validation

  • tags

  • info

  • optional_benchmark_info

  • context

  • run_tags

Fields with defaults you may want to override on instantiation:

  • batch_id if multiple benchmarks should be grouped, e.g. for a suite

  • timestamp if run time is inaccurate

  • machine_info if not run on the current machine

  • cluster_info if run on a cluster

  • github

to_publishable_dict()

Return a dictionary representing the benchmark result.

After JSON-serialization, that dictionary is expected to validate against the JSON schema that the Conbench API expects on the endpoint for benchmark result submission.

Return type:

Dict
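
A rough, hand-rolled sketch tying the notes above together; in normal usage an adapter constructs results for you, and every value here is illustrative:

    import json

    from benchadapt import BenchmarkResult

    result = BenchmarkResult(
        run_reason="commit",  # usually injected at runtime via an adapter
        stats={"data": [0.91, 0.92, 0.90], "unit": "s", "iterations": 3},
        tags={"name": "file-read", "params": "compression=lz4"},
        info={"benchmark_language_version": "Python 3.11"},
        context={"benchmark_language": "Python"},
        github={
            "repository": "https://github.com/<org>/<repo>",
            "commit": "0123456789abcdef0123456789abcdef01234567",  # placeholder hash
        },
    )

    # After JSON-serialization, this is expected to validate against the
    # Conbench benchmark result submission schema.
    payload = json.dumps(result.to_publishable_dict())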

benchadapt.adapters subpackage

class benchadapt.adapters.ArcheryAdapter(result_fields_override=None, result_fields_append=None)

Bases: GoogleBenchmarkAdapter

A class for running Apache Arrow’s archery benchmarks and sending the results to conbench

class benchadapt.adapters.AsvBenchmarkAdapter(command, result_file, benchmarks_file_path, result_fields_override=None, result_fields_append=None)

Bases: BenchmarkAdapter

A class for adapting Asv Benchmarks and sending the results to conbench
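
A hypothetical instantiation (the command, paths, and override values are placeholders, not defaults):

    from benchadapt.adapters import AsvBenchmarkAdapter

    adapter = AsvBenchmarkAdapter(
        command=["asv", "run", "--quick"],      # illustrative asv invocation
        result_file="results/asv-result.json",  # placeholder path to asv's output
        benchmarks_file_path="benchmarks/",     # placeholder path
        result_fields_override={"run_reason": "nightly"},
    )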

class benchadapt.adapters.BenchmarkAdapter(command, result_fields_override=None, result_fields_append=None)

Bases: ABC

An abstract class to run benchmarks, transform results into conbench form, and send them to a conbench server.

In general, one instance should correspond to one run (likely of many benchmarks).
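
A minimal sketch of a concrete subclass, assuming (as described for transform_results() and update_benchmark_result() below) that subclasses implement a _transform_results() method parsing the command's output into BenchmarkResult instances; the tool, file name, and parsing logic are purely illustrative:

    import json
    from typing import List

    from benchadapt import BenchmarkResult
    from benchadapt.adapters import BenchmarkAdapter

    class MyToolAdapter(BenchmarkAdapter):
        """Illustrative adapter for a hypothetical benchmark tool."""

        def __init__(self, result_fields_override=None, result_fields_append=None):
            super().__init__(
                command=["mytool", "--json-out", "results.json"],  # hypothetical command
                result_fields_override=result_fields_override,
                result_fields_append=result_fields_append,
            )

        def _transform_results(self) -> List[BenchmarkResult]:
            # Parse the hypothetical tool's JSON output into BenchmarkResult instances.
            with open("results.json") as f:
                raw = json.load(f)
            return [
                BenchmarkResult(
                    stats={"data": r["times"], "unit": "s", "iterations": len(r["times"])},
                    tags={"name": r["name"]},
                    context={"benchmark_language": "C++"},
                )
                for r in raw
            ]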

command

A list of args to be run on the command line, as would be passed to subprocess.run().

Type:

List[str]

result_fields_override

A dict of values to override on each instance of BenchmarkResult. Useful for specifying metadata only available at runtime, e.g. run_reason and build info. Applied before result_fields_append. Works for both dict fields (the full dict is replaced) and other types.

Type:

Dict[str, Any]

result_fields_append

A dict of values to be appended to BenchmarkResult values after instantiation. Useful for appending extra tags or other metadata to a dict field in addition to values gathered elsewhere. Only applicable for dict attributes. For each element, it will override any keys that already exist, i.e. it does not append recursively.

Type:

Dict[str, Any]
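
To illustrate the difference between the two (values are hypothetical, using the MyToolAdapter sketched above): result_fields_override replaces a whole field on every result, while result_fields_append merges keys into an existing dict field.

    adapter = MyToolAdapter(
        # Replaces the run_reason field on every result.
        result_fields_override={"run_reason": "merge"},
        # Merged key-by-key (non-recursively) into each result's tags dict.
        result_fields_append={"tags": {"ci_runner": "arm64"}},  # hypothetical extra tag
    )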

results

Once run() has been called, results from that run

Type:

List[BenchmarkResult]

post_results()

Post results of run to conbench

Return type:

list

run(params=None)

Run benchmarks

Parameters:

params (List[str]) – Additional parameters to be appended to the command before running

Return type:

List[BenchmarkResult]
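
Typical usage, again with the illustrative MyToolAdapter; the extra parameter is hypothetical:

    adapter = MyToolAdapter(result_fields_override={"run_reason": "nightly"})
    results = adapter.run(params=["--filter", "io"])  # appended to the command before running
    adapter.post_results()                            # send each result to the Conbench server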

transform_results()

Method to transform results from the command line call into a list of instances of BenchmarkResult. This method returns results updated with runtime metadata values specified on init.

Return type:

List[BenchmarkResult]

update_benchmark_result(result, run_id)

A method to update instances of BenchmarkResult with values specified on init in result_fields_override and/or result_fields_append.

Parameters:
  • result (BenchmarkResult) – An instance of BenchmarkResult to update

  • run_id (str) – Value to use for run_id if it is not already set directly in a _transform_results() implementation or via result_fields_override. Should match for all results of a run, so in normal usage a hex UUID is generated in transform_results().

Return type:

BenchmarkResult

class benchadapt.adapters.CallableAdapter(callable, result_fields_override=None, result_fields_append=None)

Bases: BenchmarkAdapter

A generic adapter for benchmarks defined in a Callable, i.e. a function or class with a __call__() method that directly returns a list of BenchmarkResult instances. Does not shell out.

callable

A Callable (a function or a class with a __call__() method) that returns a list of BenchmarkResult instances

Type:

Callable

run(params=None)

Run benchmarks

Parameters:

params (List[str]) – Additional kwargs to be passed through to the callable

Return type:

List[BenchmarkResult]
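
A minimal sketch; the benchmark logic and all names are illustrative:

    import time

    from benchadapt import BenchmarkResult
    from benchadapt.adapters import CallableAdapter

    def run_my_benchmark():
        # Time a trivial workload and return results directly, without shelling out.
        start = time.perf_counter()
        sum(range(1_000_000))
        elapsed = time.perf_counter() - start
        return [
            BenchmarkResult(
                stats={"data": [elapsed], "unit": "s", "iterations": 1},
                tags={"name": "sum-range"},
                context={"benchmark_language": "Python"},
            )
        ]

    adapter = CallableAdapter(callable=run_my_benchmark)
    results = adapter.run()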

class benchadapt.adapters.FollyAdapter(command, result_dir, result_fields_override=None, result_fields_append=None)

Bases: BenchmarkAdapter

Run folly benchmarks and send the results to conbench

class benchadapt.adapters.GoogleBenchmarkAdapter(command, result_file, result_fields_override=None, result_fields_append=None)

Bases: BenchmarkAdapter

A class for running Google Benchmarks and sending the results to conbench
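
A hypothetical instantiation (the binary name and output path are placeholders):

    from benchadapt.adapters import GoogleBenchmarkAdapter

    adapter = GoogleBenchmarkAdapter(
        command=["./my_gbench_binary", "--benchmark_out=gbench.json"],  # placeholder binary
        result_file="gbench.json",                                      # placeholder path
        result_fields_override={"run_reason": "commit"},
    )
    results = adapter.run()
    adapter.post_results()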