
Hardness benchmark #440

Open · wants to merge 13 commits into base: main
Conversation


@ritalyu17 ritalyu17 commented Dec 3, 2024

Work in progress: integrated Hardness benchmarking task.

To-do:

  • replace the dataset


CLAassistant commented Dec 3, 2024

CLA assistant check
All committers have signed the CLA.

@ritalyu17 ritalyu17 marked this pull request as ready for review December 16, 2024 08:11

ritalyu17 commented Dec 16, 2024

The hardness benchmark is ready for review and feedback.

Currently, the Bayesian optimization component and the multi-task component are set up as two separate `Benchmark` objects. The main reason for separating them is that the arguments to `simulate_scenarios` differ, specifically `initial_data`. Maybe there is a way to make the code look nicer?

Thank you!
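Not the PR's actual code — a minimal sketch of one way to merge the two benchmark definitions, assuming hypothetical names (`run_benchmark`, a dict-based `settings`, and a stand-in for `simulate_scenarios` that only mimics its call pattern). The idea is a shared driver that forwards only the differing keyword arguments, so `initial_data` is the sole per-benchmark difference:

```python
from functools import partial

# Hypothetical stand-in for baybe's simulate_scenarios, used here
# only to illustrate the call pattern, not its real behavior.
def simulate_scenarios(scenarios, lookup, *, batch_size, n_doe_iterations,
                       n_mc_iterations, initial_data=None):
    return {
        "n_scenarios": len(scenarios),
        "has_initial_data": initial_data is not None,
    }

def run_benchmark(scenarios, lookup, settings, **extra):
    """Shared driver; per-benchmark differences arrive via **extra."""
    return simulate_scenarios(
        scenarios,
        lookup,
        batch_size=settings["batch_size"],
        n_doe_iterations=settings["n_doe_iterations"],
        n_mc_iterations=settings["n_mc_iterations"],
        **extra,  # e.g. initial_data for the multi-task variant
    )

settings = {"batch_size": 1, "n_doe_iterations": 20, "n_mc_iterations": 5}

# Plain BO benchmark: no initial data.
bo = partial(run_benchmark, {"campaign": None}, None, settings)

# Multi-task benchmark: the only difference is initial_data.
tl = partial(run_benchmark, {"campaign": None}, None, settings,
             initial_data=[{"load": 1.0}])
```

This keeps both benchmarks as one code path while still allowing `simulate_scenarios` to receive different arguments.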

dfComposition_temp = dfComposition_temp.sort_values(by="load")
# if there are any duplicate values for load, drop them
dfComposition_temp = dfComposition_temp.drop_duplicates(subset="load")
# if there are less than 5 values, continue to the next composition
Contributor:

Too verbose, I think. Comments like this, which are very self-explanatory, can be removed; overall, there are just too many of them.

Author:

Fixed

Collaborator:

Quick comment from my side, as I also have some remarks regarding comments in my review: I agree with @sgbaird that such individual line comments are not necessary. However, I would appreciate a bit more "high-level" commentary like "Filtering compositions for which fewer than 5 hardness values are available", describing what a full block of code is doing.

Note that I only unresolved this comment to make it easier for you to spot it here; feel free to resolve it again immediately :)
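To illustrate the commenting style suggested above — not code from the PR, and the toy data frame is invented — the sort/dedup/filter steps quoted earlier can carry a single block-level comment instead of one comment per line:

```python
import pandas as pd

# Invented example data: hardness measurements at various loads.
df = pd.DataFrame({
    "composition": ["A"] * 6 + ["B"] * 3,
    "load": [1, 2, 2, 3, 4, 5, 1, 1, 2],
    "hardness": [9.1, 8.7, 8.7, 8.2, 7.9, 7.5, 6.0, 6.0, 5.5],
})

# Keep only compositions with at least 5 distinct load values,
# deduplicated and sorted by load -- one comment for the whole
# block instead of one per line.
cleaned = (
    df.sort_values("load")
      .drop_duplicates(subset=["composition", "load"])
      .groupby("composition")
      .filter(lambda g: len(g) >= 5)
)
```

Here composition "B" has only two distinct loads and is dropped, while "A" survives with its five deduplicated rows.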

[Resolved review thread on benchmarks/domains/Hardness.py]

AVHopp commented Dec 19, 2024

Just FYI: I will give my review here in mid-January :)

@AVHopp left a comment (Collaborator):

First of all, thanks for the benchmark :) This is a very first and quick review, since I think that minor changes from your end will simplify the review process for me quite significantly. Also, note that there was a PR involving the lookup mechanism (#441); this might (or might not) have an influence on your benchmark here.

Hence, I would appreciate it if you could rebase your branch onto main, verify that this benchmark is compatible with the new lookup, and address the first batch of comments. Then I'll be more than happy to give it a full and proper review :)

[Resolved review threads on benchmarks/domains/Hardness.py]
)


# IMPORT AND PREPROCESS DATA------------------------------------------------------------------------------
Collaborator:

There is no need for this kind of header; ideally remove them or replace them with more descriptive comments.

Collaborator:

Ideally, you could briefly describe what happens here in the pre-processing: that is, what does this benchmark describe, what is the pre-processing doing, and why is it necessary?

Also, a general question (also to @AdrianSosic and @Scienfitz): wouldn't it be sufficient to just have the pre-processed data as a .csv file here?

Author:

The processing steps clarify how the data is derived. The data come from different sources: `dfMP` from the Materials Project and `dfExp` from experiments.
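A minimal sketch of the kind of two-source combination described here — the column names and values are invented, not taken from the PR: computed properties (`dfMP`) are joined to experimental hardness measurements (`dfExp`) on composition, which is why the processing steps (rather than a single flat .csv) make the provenance explicit:

```python
import pandas as pd

# Hypothetical stand-ins for the two sources: dfMP holds computed
# properties from the Materials Project, dfExp holds experimental
# hardness measurements at different indentation loads.
dfMP = pd.DataFrame({
    "composition": ["TiN", "WC"],
    "bulk_modulus": [290.0, 439.0],
})
dfExp = pd.DataFrame({
    "composition": ["TiN", "WC", "TiN"],
    "load": [1.0, 1.0, 5.0],
    "hardness": [21.0, 26.0, 18.5],
})

# Inner-join on composition, so only materials present in both
# sources survive into the benchmark lookup.
merged = dfExp.merge(dfMP, on="composition", how="inner")
```

Shipping only the merged table would hide which rows came from which source; keeping the merge in code documents it.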

[Resolved review thread on benchmarks/domains/Hardness.py]
# sort the data by load
dfComposition_temp = dfComposition_temp.sort_values(by="load")
dfComposition_temp = dfComposition_temp.drop_duplicates(subset="load")
if len(dfComposition_temp) < 5: # continue to the next composition
Collaborator:

Why do you continue in this case?
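Not the PR's code — a sketch of how the `continue` could be made self-documenting, with invented data and an assumed rationale (too few distinct loads to characterize the hardness-vs-load trend) stated as a comment where the skip happens:

```python
import pandas as pd

# Invented example data.
df = pd.DataFrame({
    "composition": ["A"] * 5 + ["B"] * 2,
    "load": [1, 2, 3, 4, 5, 1, 2],
    "hardness": [9.0, 8.5, 8.0, 7.6, 7.3, 6.0, 5.5],
})

kept = []
for composition, group in df.groupby("composition"):
    group = group.sort_values("load").drop_duplicates(subset="load")
    # Skip compositions with too few distinct loads: with fewer than
    # five points, the hardness-vs-load trend cannot be characterized
    # reliably (assumed rationale -- worth stating in the code itself).
    if len(group) < 5:
        continue
    kept.append(group)

result = pd.concat(kept)
```

Whatever the actual reason for the threshold of 5 is, putting it next to the `continue` answers the reviewer's question for future readers.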

[Resolved review threads on benchmarks/domains/__init__.py and benchmarks/domains/Hardness.py]

benchmark_config = ConvergenceExperimentSettings(
batch_size=1,
n_doe_iterations=20,
Collaborator:

Can you elaborate on why you chose these values?

# create a list of dataframes with n samples from dfLookupTable_source to use as initial data
lstInitialData_temp = [dfLookupTable_source.sample(n) for _ in range(settings.n_mc_iterations)]

return simulate_scenarios(
Collaborator:

Something is weird here: you only ever call this with the latest value of n, which is 30. Why do you then create several different campaigns and lists?
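Not the PR's actual code — a minimal sketch of the pitfall the comment describes, with stand-in names (`random.sample` over a plain list instead of `dfLookupTable_source.sample(n)`): if the list comprehension runs after a loop over `n`, it only ever sees the loop's final value, so every scenario is sampled at n = 30. Building the initial data inside the loop, keyed by `n`, gives each scenario the intended size:

```python
import random

lookup = list(range(100))  # stand-in for dfLookupTable_source
n_mc_iterations = 3

# Buggy pattern: after the loop, `n` is whatever value it left
# behind, so every sampled list has the last size (30).
for n in (5, 15, 30):
    pass
stale = [random.sample(lookup, n) for _ in range(n_mc_iterations)]

# Fixed pattern: sample inside the loop, keyed by n, so each
# scenario gets initial data of its own intended size.
initial_data = {
    n: [random.sample(lookup, n) for _ in range(n_mc_iterations)]
    for n in (5, 15, 30)
}
```

With the keyed dict, each scenario's `simulate_scenarios` call can then receive `initial_data[n]` for its own `n`.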
