ux: Design interface(s) for reading, writing, and reporting benchmark records #209

Open · 4 tasks
nicholasjng opened this issue Feb 7, 2025 · 0 comments

Supersedes / extends #48.

Currently, we have a ConsoleReporter and a FileReporter in place, which, as their names suggest, report the contents of benchmark records, i.e., the results of and context around a benchmark run for a set of parameters. "Reporting" here means presenting the data contained in the benchmark records in a compelling way, for example as a table, as seen in the README:

import nnbench


@nnbench.benchmark
def product(a: int, b: int) -> int:
    return a * b


@nnbench.benchmark
def power(a: int, b: int) -> int:
    return a ** b


reporter = nnbench.ConsoleReporter()
# first, collect the above benchmarks directly from the current module...
benchmarks = nnbench.collect("__main__")
# ... then run the benchmarks with the parameters `a=2, b=10`...
record = nnbench.run(benchmarks, params={"a": 2, "b": 10})
reporter.display(record)  # ...and print the results to the terminal.

# The resulting table looks like the following:
# ┏━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
# ┃ Benchmark ┃ Value ┃ Wall time (ns) ┃ Parameters        ┃
# ┡━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
# │ product   │ 20    │ 1917           │ {'a': 2, 'b': 10} │
# │ power     │ 1024  │ 583            │ {'a': 2, 'b': 10} │
# └───────────┴───────┴────────────────┴───────────────────┘

The console reporter renders the data as a rich table on stdout, while the file reporter writes it to a local file (currently the JSON, YAML, CSV, Parquet, and ndjson formats are supported) and afterwards optionally copies it to a remote location using fsspec.
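
For illustration, the file path end to end might look like this (the FileReporter class name comes from this issue, but its exact import path and write() signature, as well as the S3 destination, are assumptions for the sketch):

import fsspec
import nnbench

reporter = nnbench.FileReporter()  # assumed top-level export
benchmarks = nnbench.collect("__main__")
record = nnbench.run(benchmarks, params={"a": 2, "b": 10})

# write the record to a local JSON file (assumed write(record, path) signature)...
reporter.write(record, "record.json")

# ...then optionally copy it to a remote location via fsspec (needs s3fs for S3).
fs = fsspec.filesystem("s3")
fs.put("record.json", "s3://my-bucket/benchmarks/record.json")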

I'm currently working to extend the reporting capabilities to databases (streaming records to/from DBs such as Postgres or SQLite) and web services (GET/POSTing records in JSON format, e.g. to mlflow), but a coherent design has so far eluded me.
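
For the web service case, the write direction might be as simple as POSTing the serialized record; a sketch, where the endpoint URL and the payload shape are hypothetical:

import requests


def post_record(record: dict, url: str) -> None:
    # POST the record as JSON; a matching GET would deserialize it back.
    resp = requests.post(url, json=record, timeout=10)
    resp.raise_for_status()


post_record(
    {"benchmarks": [{"name": "product", "value": 20}], "context": {}},
    "https://tracking.example.com/api/records",  # hypothetical endpoint
)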

While the obvious choice would be a general read/write interface like this:

# Example implementation, taken from issue #48.
import json


class FileIOReporter:
    def write_record(self, r):
        ...

    def write_record_batched(self, rb):
        ...

    def read_record(self):
        ...

    def open(self, fp):
        ...

    def close(self, fp):
        ...


class JSONFileReporter(FileIOReporter):
    def open(self, fp):
        # Loads the whole record eagerly; the file is closed again on exit
        # from the `with` block, so close() is effectively a no-op here.
        with open(fp, "r") as f:
            return json.load(f)

    def close(self, fp):
        fp.close()


class YAMLFileReporter(FileIOReporter):
    ...  # same as JSON, but using the YAML read/write APIs (yaml.safe_load etc.)

One problem is that database reads generally require a query to be useful, while file reads don't, which breaks at least the read() interface:

class FileReporter:
    def read(self, file: str | os.PathLike[str]) -> BenchmarkRecord:
        ...

class DatabaseReporter:
    # incompatible with the above.
    def read(self, db: DatabaseInstance, query: str) -> BenchmarkRecord:
        ...
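
One way to reconcile the two would be to fold the source-specific arguments into keyword options, so that all reporters share a single nominal read() signature. A sketch, with all names illustrative rather than a settled design:

class BenchmarkReporter:
    def read(self, source: str, **options) -> "BenchmarkRecord":
        # `source` is a file path, database URI, or service URL;
        # `options` carries source-specific arguments, e.g. a SQL query.
        raise NotImplementedError


class FileReporter(BenchmarkReporter):
    def read(self, source, **options):
        ...  # open `source`, deserialize a single record


class DatabaseReporter(BenchmarkReporter):
    def read(self, source, **options):
        query = options["query"]  # databases require a query to be useful
        ...  # connect to `source`, execute `query`, assemble a record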

Another way (which the previous issue already alluded to in its title) would be to create an N-way taxonomy of IO, say console | files | databases | web, and implement one interface for each.
A potential drawback is that some tools fit more than one of these categories: DuckDB, for example, supports SQL query-based analysis of local or remote files such as Parquet/ndjson, so a DuckDB reporter would need access to both file and database methods to be useful.
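
If we do go down the taxonomy route, composition could still cover the hybrid cases. A sketch of what that might look like for DuckDB (class and method names are hypothetical):

class FileIOMixin:
    def read_file(self, path):
        ...  # deserialize a record from a local or fsspec-addressable file


class DatabaseIOMixin:
    def run_query(self, query):
        ...  # execute a SQL query and assemble a record from the result


class DuckDBReporter(FileIOMixin, DatabaseIOMixin):
    # DuckDB can query Parquet/ndjson files directly, so this reporter
    # needs both capabilities: plain file reads and SQL-based reads.
    def read(self, source, query=None, **options):
        if query is not None:
            return self.run_query(query)
        return self.read_file(source)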

Suggested action:

The following is to be understood in the context of our current project.

  • Decide on a way to handle different kinds of sinks (write) and sources (read) for benchmark data.
  • Decide on the split between IO and reporting responsibilities.
  • Implement benchmark record IO for a single database and/or web service (top priority: mlflow); a rough sketch follows below this list.
  • Explore composability of multiple IOs in a single reporter (like the DuckDB example above).
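
To make the mlflow item concrete, here is a rough sketch of what the write direction against an mlflow tracking server could look like (the record layout and the metric/param mapping are assumptions, not a settled design):

import mlflow


def report_to_mlflow(record: dict, tracking_uri: str) -> None:
    # Assumption: one benchmark record maps onto one mlflow run, with the
    # run parameters logged as mlflow params and each benchmark result
    # logged as an mlflow metric.
    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run():
        mlflow.log_params(record["params"])
        for bm in record["benchmarks"]:
            mlflow.log_metric(bm["name"], bm["value"])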

cc @AdrianoKF @schroedk @janwillemkl - feel free to comment, add your own action items, or reach out if anything is unclear.
