Dataset metrics #514

KeenanRileyFaulkner · 2024-10-31T23:45:47Z

Adds two scripts to bfasst for efficiently computing metrics on randsoc-based graphs in GNN training datasets. The metrics can be computed ad hoc via command-line options. bfasst iterates through a dataset, runs process_graph.py on each graph in it, then runs accumulate_metrics.py at the end.

If a run takes too long and you don't want to wait for every instance of process_graph.py to finish, kill the run and invoke the accumulation script directly on the output directory.

The final output is written to summary_stats.log, and the individual metrics for each instance of each IP are written to master_metrics.log. accumulate_metrics.py overwrites these files on every run, but when the utility is invoked directly, options are available to specify a different output name.
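For illustration only, accumulating whatever per-graph results exist after an interrupted run could look roughly like the sketch below. It assumes each per-graph result is a JSON file mapping metric names to numbers. That file format, the function name `accumulate_partial`, and the choice of summary statistics are assumptions made for the example, not the actual behavior of accumulate_metrics.py.

```python
import json
import statistics
from pathlib import Path

# Output files named in the PR description; skipped when scanning for inputs.
SKIP = {"summary_stats.log", "master_metrics.log"}


def accumulate_partial(out_dir):
    """Summarize the per-graph metric files that exist in out_dir.

    Hypothetical input format: each file holds JSON such as {"diameter": 7}.
    """
    values = {}
    for path in sorted(Path(out_dir).iterdir()):
        if path.name in SKIP or not path.is_file():
            continue
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            # A killed run may leave a half-written file behind; ignore it.
            continue
        for metric, value in data.items():
            values.setdefault(metric, []).append(value)
    return {
        metric: {"count": len(v), "mean": statistics.mean(v), "max": max(v)}
        for metric, v in values.items()
    }
```

Skipping unreadable files is what lets the accumulation step run on a partially populated output directory.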

KeenanRileyFaulkner · 2024-11-01T05:54:53Z

Still needs dependencies on the scripts in the ninja build statements so that a rebuild is triggered when the utils are updated.

jgoeders · 2024-11-29T16:21:43Z

bfasst/flows/analyze_dataset.py

+        """Get the name of the build directory for this flow"""
+        return "dataset_metrics"
+
+    def add_ninja_deps(self, deps):


I don't think this is required if it's just calling super. The behavior will be the same if you remove it.
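A minimal sketch of why such an override is redundant, using hypothetical stand-in classes (`Flow`, `WithOverride`, `WithoutOverride`) rather than the actual bfasst classes:

```python
class Flow:
    """Hypothetical base class standing in for the real flow base."""

    def add_ninja_deps(self, deps):
        deps.append("flow.py")


class WithOverride(Flow):
    def add_ninja_deps(self, deps):
        # Only forwards to the parent, so it adds nothing.
        super().add_ninja_deps(deps)


class WithoutOverride(Flow):
    # Inherits add_ninja_deps unchanged from Flow.
    pass


def collect_deps(flow):
    """Run add_ninja_deps on a fresh list and return the result."""
    deps = []
    flow.add_ninja_deps(deps)
    return deps
```

Both subclasses produce the same dependency list, so the method can be dropped until it needs to add something of its own.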

jgoeders · 2024-11-29T16:24:03Z

bfasst/utils/accumulate_metrics.py

+
+def sort_metrics(metrics):
+    """Sort the values for each metric in the dictionary."""
+    for ip, _ in metrics.items():


If you want to loop through keys, just use:

for ip in metrics:
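Applied to the function in the diff, the suggestion might look like the sketch below. The function body is not shown in this thread, so the nested layout assumed here (IP name, then metric name, then a list of values) is a guess for illustration.

```python
def sort_metrics(metrics):
    """Sort the values for each metric in the dictionary, in place."""
    for ip in metrics:  # iterating a dict yields its keys
        for metric in metrics[ip]:
            # Assumed layout: metrics[ip][metric] is a list of values.
            metrics[ip][metric].sort()
    return metrics
```

Iterating the dictionary directly avoids unpacking a value that is never used.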

jgoeders · 2024-12-05T22:13:57Z

@KeenanRileyFaulkner It looks like this is failing CI. Anything I can help with?

KeenanRileyFaulkner · 2024-12-06T05:44:33Z

Sorry. I didn't realize there was a formatting issue. I'm out of town right now, but will try to fix it when I have a chance this weekend.

KeenanRileyFaulkner · 2024-12-09T17:52:16Z

@jgoeders All tests are passing now

KeenanRileyFaulkner added 12 commits October 30, 2024 16:40

added dataset processing on per-graph basis to bfasst

745b7cc

minor format fix

c7876bb

Added basics for accumulation of graph metrics

fc704b1

updated accumulation script to write to file

75883bd

Added accumulation of metrics

aff1897

refactored to use FlowNoDesign

8155cfc

added diameter

604ac18

added degree

1657e9c

added kcore and global/local clustering coefficients

f28c449

updated names for clustering coefficients

d01031f

added options on each metric so they can be turned off/on

0c4a665

do not iterate over the summary or master metrics logs

80e3eda

KeenanRileyFaulkner requested review from jgoeders and dallinjdahl October 31, 2024 23:45

KeenanRileyFaulkner added 2 commits October 31, 2024 17:59

pylint

705feda

pylint

2f77374

KeenanRileyFaulkner added 6 commits November 4, 2024 14:35

removed kcore and local clustering

76dbad6

added utility scripts as deps to dataset_metrics tools

5961870

updated scripts to work per-component and per-instance, updated summary stats file name

e9e09c6

pylint

a55111d

added k core

a1c3dd0

make sure k core increments correctly

36ebf50

jgoeders approved these changes Nov 29, 2024

KeenanRileyFaulkner added 2 commits December 9, 2024 10:11

format

1bccb27

revert changes

726b498

jgoeders merged commit 1461dd1 into main Dec 9, 2024
18 checks passed

jgoeders deleted the dataset_metrics branch December 9, 2024 17:55
