ux: Design interface(s) for reading, writing, and reporting benchmark records #209

Open · 4 tasks
nicholasjng opened this issue Feb 7, 2025 · 0 comments

Supersedes / extends #48.

Currently, we have a ConsoleReporter and a FileReporter in place, which, as their names suggest, report the contents of benchmark records, i.e., the results of and context around a benchmark run for a set of parameters. "Reporting" here means presenting the data contained in the benchmark records in a compelling way, for example as a table, as seen in the README:

import nnbench


@nnbench.benchmark
def product(a: int, b: int) -> int:
    return a * b


@nnbench.benchmark
def power(a: int, b: int) -> int:
    return a ** b


reporter = nnbench.ConsoleReporter()
# first, collect the above benchmarks directly from the current module...
benchmarks = nnbench.collect("__main__")
# ... then run the benchmarks with the parameters `a=2, b=10`...
record = nnbench.run(benchmarks, params={"a": 2, "b": 10})
reporter.display(record)  # ...and print the results to the terminal.

# The resulting table looks like the following:
# ┏━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
# ┃ Benchmark ┃ Value ┃ Wall time (ns) ┃ Parameters        ┃
# ┡━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
# │ product   │ 20    │ 1917           │ {'a': 2, 'b': 10} │
# │ power     │ 1024  │ 583            │ {'a': 2, 'b': 10} │
# └───────────┴───────┴────────────────┴───────────────────┘

The console reporter renders the data as a rich table on stdout, while the file reporter writes it to a local file (currently the JSON, YAML, CSV, Parquet, and ndjson formats are supported) and afterwards optionally copies it to a remote location using fsspec.
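
For illustration, the file path end to end might look like this (the FileReporter class name comes from this issue, but its exact import path and write() signature, as well as the S3 destination, are assumptions for the sketch):

import fsspec
import nnbench

reporter = nnbench.FileReporter()  # assumed top-level export
benchmarks = nnbench.collect("__main__")
record = nnbench.run(benchmarks, params={"a": 2, "b": 10})

# write the record to a local JSON file (assumed write(record, path) signature)...
reporter.write(record, "record.json")

# ...then optionally copy it to a remote location via fsspec (needs s3fs for S3).
fs = fsspec.filesystem("s3")
fs.put("record.json", "s3://my-bucket/benchmarks/record.json")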

I'm currently working to extend the reporting capabilities to databases (streaming records to/from DBs such as Postgres or SQLite) and web services (GET/POSTing records in JSON format, e.g. to mlflow), but a coherent design has so far eluded me.
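
For the web service case, the write direction might be as simple as POSTing the serialized record; a sketch, where the endpoint URL and the payload shape are hypothetical:

import requests


def post_record(record: dict, url: str) -> None:
    # POST the record as JSON; a matching GET would deserialize it back.
    resp = requests.post(url, json=record, timeout=10)
    resp.raise_for_status()


post_record(
    {"benchmarks": [{"name": "product", "value": 20}], "context": {}},
    "https://tracking.example.com/api/records",  # hypothetical endpoint
)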

While the obvious choice would be a general read/write interface like this:

# Example implementation, taken from issue #48.
import json


class FileIOReporter:
    def write_record(self, r):
        ...

    def write_record_batched(self, rb):
        ...

    def read_record(self):
        ...

    def open(self, fp):
        ...

    def close(self, fp):
        ...


class JSONFileReporter(FileIOReporter):
    def open(self, fp):
        # Loads the whole record eagerly; the file is closed again on exit
        # from the `with` block, so close() is effectively a no-op here.
        with open(fp, "r") as f:
            return json.load(f)

    def close(self, fp):
        fp.close()


class YAMLFileReporter(FileIOReporter):
    ...  # same as JSON, but using the YAML read/write APIs (yaml.safe_load etc.)

One problem is that database reads generally require a query to be useful, while file reads don't, which breaks at least the read() interface:

class FileReporter:
    def read(self, file: str | os.PathLike[str]) -> BenchmarkRecord:
        ...

class DatabaseReporter:
    # incompatible with the above.
    def read(self, db: DatabaseInstance, query: str) -> BenchmarkRecord:
        ...
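
One way to reconcile the two would be to fold the source-specific arguments into keyword options, so that all reporters share a single nominal read() signature. A sketch, with all names illustrative rather than a settled design:

class BenchmarkReporter:
    def read(self, source: str, **options) -> "BenchmarkRecord":
        # `source` is a file path, database URI, or service URL;
        # `options` carries source-specific arguments, e.g. a SQL query.
        raise NotImplementedError


class FileReporter(BenchmarkReporter):
    def read(self, source, **options):
        ...  # open `source`, deserialize a single record


class DatabaseReporter(BenchmarkReporter):
    def read(self, source, **options):
        query = options["query"]  # databases require a query to be useful
        ...  # connect to `source`, execute `query`, assemble a record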

Another way (which the previous issue already alluded to in its title) would be to create an N-way taxonomy of IO, say console | files | databases | web, and implement one interface for each.
A potential drawback is that some tools fit more than one of these categories: DuckDB, for example, supports SQL query-based analysis of local or remote files such as Parquet/ndjson, so a DuckDB reporter would need access to both file and database methods to be useful.
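
If we do go down the taxonomy route, composition could still cover the hybrid cases. A sketch of what that might look like for DuckDB (class and method names are hypothetical):

class FileIOMixin:
    def read_file(self, path):
        ...  # deserialize a record from a local or fsspec-addressable file


class DatabaseIOMixin:
    def run_query(self, query):
        ...  # execute a SQL query and assemble a record from the result


class DuckDBReporter(FileIOMixin, DatabaseIOMixin):
    # DuckDB can query Parquet/ndjson files directly, so this reporter
    # needs both capabilities: plain file reads and SQL-based reads.
    def read(self, source, query=None, **options):
        if query is not None:
            return self.run_query(query)
        return self.read_file(source)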

Suggested action:

The following is to be understood in the context of our current project.

  • Decide on a way to handle different kinds of sinks (write) and sources (read) for benchmark data.
  • Decide on the split between IO and reporting responsibilities.
  • Implement benchmark record IO for a single database and/or web service (top priority: mlflow); a rough sketch follows below this list.
  • Explore composability of multiple IOs in a single reporter (like the DuckDB example above).
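
To make the mlflow item concrete, here is a rough sketch of what the write direction against an mlflow tracking server could look like (the record layout and the metric/param mapping are assumptions, not a settled design):

import mlflow


def report_to_mlflow(record: dict, tracking_uri: str) -> None:
    # Assumption: one benchmark record maps onto one mlflow run, with the
    # run parameters logged as mlflow params and each benchmark result
    # logged as an mlflow metric.
    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run():
        mlflow.log_params(record["params"])
        for bm in record["benchmarks"]:
            mlflow.log_metric(bm["name"], bm["value"])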

cc @AdrianoKF @schroedk @janwillemkl - feel free to comment, add your own action items, or reach out if anything is unclear.
