
Some confusions about nasbenchmark_201. #178

Open
fze0012 opened this issue Apr 18, 2023 · 5 comments
fze0012 commented Apr 18, 2023

For different task IDs, how can I find the best known results for calculating the simple or inference regret?

@Neeratyoy
Collaborator

The best score is known only for the tabular benchmarks. For the nn benchmarks the following should work:

from hpobench.benchmarks.ml import TabularBenchmark

b = TabularBenchmark(model="nn", task_id=31)

b.global_minimums
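Given a known optimum, simple regret at step t is the gap between the best value observed so far and that optimum. A minimal sketch, assuming the optimum has already been read off `b.global_minimums` (the names `global_min` and `observed_losses` below are illustrative, not HPOBench API):

```python
# Simple-regret sketch: regret_t = (best observed loss up to step t) - optimum.
# `global_min` stands in for a value taken from b.global_minimums (illustrative).
global_min = 0.031

# Hypothetical sequence of losses returned by successive evaluations.
observed_losses = [0.120, 0.095, 0.095, 0.044, 0.051, 0.038]

def simple_regret(losses, optimum):
    """Running best-so-far loss minus the known optimum, one value per step."""
    regrets = []
    best = float("inf")
    for loss in losses:
        best = min(best, loss)
        regrets.append(best - optimum)
    return regrets

print([round(r, 3) for r in simple_regret(observed_losses, global_min)])
```

Inference regret is computed the same way, except the incumbent is chosen on the validation metric and the regret is measured on the test metric.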

@fze0012 fze0012 closed this as completed Apr 18, 2023
@fze0012 fze0012 reopened this May 17, 2023
@fze0012 fze0012 closed this as completed May 17, 2023
@fze0012 fze0012 changed the title Where to find the best-known values of nn-benchmarks? Some confusions about nasbenchmark_201. May 18, 2023
@fze0012 fze0012 reopened this May 18, 2023

fze0012 commented May 18, 2023

                          i) Best possible incumbents (no averaging)                ii) "Average" incumbent
Dataset         Metric    (Index of Arch, Accuracy)   (Index, Loss)                 (Index of Arch, Accuracy)   (Index, Loss)
--------------------------------------------------------------------------------------------------------------------------------------------
cifar10-valid   train     (258, 100.0)                (2778, 0.001179278278425336)  (10154, 100)                (2778, 0.0013082386429297428)
cifar10-valid   x-valid   (6111, 91.71999999023437)   (14443, 0.3837750501537323)   (6111, 91.60666665039064)   (3888, 0.3894046771335602)
cifar10-valid   x-test
cifar10-valid   ori-test  (14174, 91.65)              (3385, 0.3850496160507202)    (1459, 91.52333333333333)   (3385, 0.3995230517864227)
cifar100        train     (9930, 99.948)              (9930, 0.012630240231156348)  (9930, 99.93733333333334)   (9930, 0.012843489621082942)
cifar100        x-valid   (13714, 73.71999998779297)  (13934, 1.1490126512527465)   (9930, 73.4933333577474)    (7361, 1.1600867895126343)
cifar100        x-test    (1459, 74.28000004882813)   (15383, 1.1427113876342774)   (9930, 73.51333332112631)   (7337, 1.1747569534301758)
cifar100        ori-test  (9930, 73.88)               (13706, 1.1610547459602356)   (9930, 73.50333333333333)   (7361, 1.1696554500579834)
ImageNet16-120  train     (9930, 73.2524719841793)    (9930, 0.9490517352046979)    (9930, 73.22918040138735)   (9930, 0.9524298415108582)
ImageNet16-120  x-valid   (13778, 47.39999985758463)  (10721, 2.0826991437276203)   (10676, 46.73333327229818)  (10721, 2.0915397168795264)
ImageNet16-120  x-test    (857, 48.03333317057292)    (12887, 2.0940088628133138)   (857, 47.31111100599501)    (11882, 2.106453532218933)
ImageNet16-120  ori-test  (857, 47.083333353678384)   (11882, 2.0950548852284747)   (857, 46.8444444647895)     (11882, 2.1028235816955565)

In this file, what does the prefix ori mean, e.g. in ori-test?

@Neeratyoy
Collaborator

Hi,

This docstring is borrowed from the NAS-Bench-201 paper release, so the actual details can be found here.
The prefix most likely indicates the numbers on the original test set.

I shall close this issue for now, as this is not about HPOBench itself. Please feel free to reopen or ask any further questions.


fze0012 commented Jun 12, 2023

test_accuracies = [self.data[seed][structure_str]['eval_acc1es'][f'{valid_key}@{199}'] for seed in data_seed]
test_losses = [self.data[seed][structure_str]['eval_losses'][f'{valid_key}@{199}'] for seed in data_seed]
test_times = [np.sum((self.data[seed][structure_str]['eval_times'][f'{test_key}@{199}'])

For test_accuracies and test_losses, why is valid_key used rather than test_key?
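A tiny mock of the lookup makes the mismatch concrete (the nested-dict shape mirrors the snippet above; the seed, architecture key, and values are all made up for illustration):

```python
# Mock of self.data with the nested layout used in the snippet above;
# the seed (777), arch key, and metric values are illustrative only.
data = {
    777: {
        "arch_0": {
            "eval_acc1es": {"x-valid@199": 91.6, "ori-test@199": 91.5},
            "eval_losses": {"x-valid@199": 0.39, "ori-test@199": 0.40},
        }
    }
}

valid_key, test_key = "x-valid", "ori-test"
structure_str, data_seed = "arch_0", [777]

# As written, the snippet indexes with valid_key, so it returns validation
# metrics even though the variables are named test_*:
test_accuracies = [data[seed][structure_str]["eval_acc1es"][f"{valid_key}@{199}"]
                   for seed in data_seed]

# The apparent fix is to index with test_key instead:
fixed_accuracies = [data[seed][structure_str]["eval_acc1es"][f"{test_key}@{199}"]
                    for seed in data_seed]

print(test_accuracies)   # validation accuracy, despite the variable name
print(fixed_accuracies)  # test accuracy
```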

@Neeratyoy Neeratyoy reopened this Jun 13, 2023
@Neeratyoy
Collaborator

Thanks for raising this.
Would you like to do a PR with this fix?

You can refer to this and this and use the local version for testing.
Once merged, we can upload a new container with this fix.
