Auto-build Sphinx documentation
actions-user committed Oct 14, 2024
1 parent c3b54b4 commit 40f06af
Showing 11 changed files with 47 additions and 26 deletions.
8 changes: 4 additions & 4 deletions docs/_sources/pages/Front_page.md.txt
@@ -1,6 +1,6 @@
# FAIR Universe - HiggsML Uncertainty Challenge

## Introduction

This NeurIPS 2024 Machine Learning competition is one of the first to strongly emphasise mastering uncertainties in the input training dataset and outputting credible confidence intervals. This challenge explores uncertainty-aware AI techniques for High Energy Physics (HEP).

@@ -13,9 +13,9 @@ There are several information sources regarding the FAIR Universe - HiggsML Unce

* [Codabench](https://www.codabench.org/competitions/2977/): This serves as the platform to submit entries to the competition.

* [Tutorial Slides](https://fair-universe.lbl.gov/tutorials/Higgs_Uncertainty_Challenge-Codabench_Tutorial.pdf): These slides will help you register and submit a sample dummy submission.

* Download the [training data](https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/) if you want to experiment with the data on your local machines.
* Download the [Training Data (6.5 GB)](https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/) if you want to experiment with the data on your local machines.

* [Documentation](https://fair-universe.lbl.gov/docs/): This contains detailed information about the [science behind the challenge](https://fair-universe.lbl.gov/docs/pages/overview.html#problem-setting), the [specifics of the data](https://fair-universe.lbl.gov/docs/pages/data.html), and the [code documentation](https://fair-universe.lbl.gov/docs/rst_source/modules.html) used to facilitate the evaluation of the competition. It also describes the [evaluation metric](https://fair-universe.lbl.gov/docs/pages/evaluation.html).

@@ -42,7 +42,7 @@ This competition allows code submissions. Participants are strongly encouraged t
- Sascha Diefenbacher
- Steven Farrell
- Wahid Bhimji
- Jordan Dudley

### University of Washington
- Elham E Khoda
2 changes: 1 addition & 1 deletion docs/_sources/pages/data.md.txt
@@ -103,7 +103,7 @@ These variables are derived from the primary variables with the help of `derived

## How to get Public Data?

Download the [Neurips_Public_data_26_08_2024](https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/)
Download the [Neurips_Public_data_26_08_2024 (6.5 GB)](https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/)

or use the following command to download it from the terminal:
```
wget -O public_data.zip https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/
```
10 changes: 10 additions & 0 deletions docs/_sources/pages/evaluation.md.txt
@@ -16,6 +16,16 @@ Not every uncertainty quantification method is able to return a full likelihood

**Methods that return a central value and a Gaussian uncertainty**: Define $\hat \mu_{16} = \hat \mu - \Delta \hat \mu$ and $\hat \mu_{84} = \hat \mu + \Delta \hat \mu$. If the Gaussian assumption of a symmetric uncertainty holds, this interval should contain 68.27% of the probability, since the 1 standard deviation region of a Gaussian distribution contains 68.27% of the probability mass.
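
For example, a method that (hypothetically) reports $\hat \mu = 1.02$ with $\Delta \hat \mu = 0.15$ would quote $\hat \mu_{16} = 0.87$ and $\hat \mu_{84} = 1.17$.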

## Submission requirements

Participants’ submissions must consist of a zip file containing a `model.py` file (which must not be inside a directory); the zip may also contain any other necessary files (e.g., a pre-trained model). The `model.py` file must define a `Model` class which must satisfy the following criteria (a minimal sketch is given after this list).
1. The `Model` class must accept two arguments, `get_train_set` and `systematics`, when it is initialized. The `get_train_set` argument will receive a callable which, when called, will return the public dataset. The `systematics` argument will receive a callable which can be used to apply the systematic effects (adjusting weights and primary features, computing derived features, and applying post-selection cuts) to a dataset.
1. The `Model` class must have a `fit` method, which will be called once when the submission is being evaluated. This method can be used to prepare the model for inference. We encourage participants to submit models which have already been trained, as there is limited compute time for each submission to be evaluated.
1. The `Model` class must have a `predict` method which must accept a test dataset and return the results as a dictionary containing four items: `"mu_hat"`: the predicted value of mu, `"delta_mu_hat"`: the uncertainty in the predicted value of mu, `"p16"`: the 16th percentile of mu (the lower bound of the quoted interval), and `"p84"`: the 84th percentile of mu (the upper bound of the quoted interval).
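
To make the interface concrete, below is a minimal, hypothetical sketch of a `model.py` that satisfies the criteria above. The class name, constructor arguments, method names, and returned dictionary keys follow the requirements; the internals (a fixed estimate with a symmetric uncertainty) are placeholder logic only, not a suggested analysis strategy.

```
# model.py -- illustrative sketch only, not a competitive submission.
class Model:
    def __init__(self, get_train_set=None, systematics=None):
        # get_train_set: callable returning the public training dataset
        # systematics: callable applying systematic effects to a dataset
        self.get_train_set = get_train_set
        self.systematics = systematics

    def fit(self):
        # Called once per submission; kept lightweight because heavy training
        # should already be done before submitting (compute time is limited).
        pass

    def predict(self, test_set):
        # Placeholder: a fixed mu estimate with a symmetric uncertainty.
        mu_hat = 1.0
        delta_mu_hat = 0.1
        return {
            "mu_hat": mu_hat,
            "delta_mu_hat": delta_mu_hat,
            "p16": mu_hat - delta_mu_hat,  # lower edge of the quoted interval
            "p84": mu_hat + delta_mu_hat,  # upper edge of the quoted interval
        }
```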

## Hardware description

Throughout the competition, participants’ submissions will be run on either the Perlmutter supercomputer at NERSC or an alternative workstation at LBNL, but they will only be run on Perlmutter for the final evaluation. When running on Perlmutter, submissions will be assigned one node, which consists of 1 AMD EPYC 7763 CPU, 256GB of RAM, and 4 NVIDIA A100 GPUs with 40GB of memory each (https://docs.nersc.gov/systems/perlmutter/architecture/#gpu-nodes). The alternative workstation consists of 1 Intel(R) Xeon(R) Gold 6148 CPU, 376GB of RAM, and 3 Tesla V100-SXM2-16GB GPUs. On either system, participants’ submissions will be allotted 2 hours to complete evaluation on all of the pseudoexperiments (10 sets of 100 pseudoexperiments each for the initial phase, and 10 sets of 1000 pseudoexperiments for the final phase). These pseudoexperiments will be parallelized in the following way: each participant’s `Model.fit()` will be run once, and then each pseudoexperiment will be run by one of many parallel workers, with each worker calling `Model.predict()` once. There will be 30 parallel workers when running on Perlmutter, reduced to 10 parallel workers on the alternative workstation.

## Scoring

11 changes: 4 additions & 7 deletions docs/_sources/pages/terms.md.txt
@@ -14,10 +14,7 @@ This challenge is for educational purposes only and no prizes are granted. It is

To ensure proper representation and verification of your affiliation with an organization in our competition, please adhere to the following guidelines:

#### Profile Affiliation

Ensure that your Codabench profile includes your current organizational affiliation. This helps in verifying your credentials and associating your contributions with the correct institution.

#### Registration Requirements

Use your organization-issued email address to register for the competition. This will help us confirm that you are officially associated with the organization and prevent unauthorized entries.
- Ensure that your Codabench profile includes your current organizational affiliation. This helps in verifying your credentials and associating your contributions with the correct institution.
- If participants are freelancers, they should provide brief details (via email to [email protected]) of their freelance operation, any supporting URL, and their country of residence.
- Use your organization-issued email address to register for the competition. This will help us confirm that you are officially associated with the organization and prevent unauthorized entries.
- Registration is not required to access the dataset that is provided on the “Data” tab. We encourage the use of the dataset for research. If a participant who is not eligible to submit via Codabench under the conditions above can demonstrate a successful method on the challenge task, we encourage them to contact the organizers for attribution and, if possible, potential inclusion in the final competition stage.
2 changes: 1 addition & 1 deletion docs/pages/Front_page.html
@@ -93,7 +93,7 @@ <h2>Introduction<a class="headerlink" href="#introduction" title="Link to this h
<ul class="simple">
<li><p><a class="reference external" href="https://www.codabench.org/competitions/2977/">Codabench</a>: This serves as the platform to submit entries to the competition.</p></li>
<li><p><a class="reference external" href="https://fair-universe.lbl.gov/tutorials/Higgs_Uncertainty_Challenge-Codabench_Tutorial.pdf">Tutorial Slides</a>: These slides will help you register and submit a sample dummy submission.</p></li>
<li><p>Download the <a class="reference external" href="https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/">training data</a> if you want to experiment with the data on your local machines.</p></li>
<li><p>Download the <a class="reference external" href="https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/">Training Data (6.5 GB)</a> if you want to experiment with the data on your local machines.</p></li>
<li><p><a class="reference external" href="https://fair-universe.lbl.gov/docs/">Documentation</a>: This contains detailed information about the <a class="reference external" href="https://fair-universe.lbl.gov/docs/pages/overview.html#problem-setting">science behind the challenge</a>, the <a class="reference external" href="https://fair-universe.lbl.gov/docs/pages/data.html">specifics of the data</a>, and the <a class="reference external" href="https://fair-universe.lbl.gov/docs/rst_source/modules.html">code documentation</a> used to facilitate the evaluation of the competition. It also describes the <a class="reference external" href="https://fair-universe.lbl.gov/docs/pages/evaluation.html">evaluation metric</a>.</p></li>
<li><p><a class="reference external" href="https://github.com/FAIR-Universe/HEP-Challenge/tree/master/">Github Repo</a>: This hosts the code for testing submissions, as well as the <a class="reference external" href="https://github.com/FAIR-Universe/HEP-Challenge/blob/master/StartingKit_HiggsML_Uncertainty_Challenge.ipynb">starting kit notebook</a>. The starting kit is also available on <a class="reference external" href="https://colab.research.google.com/github/FAIR-Universe/HEP-Challenge/blob/master/StartingKit_HiggsML_Uncertainty_Challenge.ipynb">Google Colab</a></p></li>
<li><p><a class="reference external" href="https://arxiv.org/abs/2410.02867">White Paper</a>: This serves as a full breakdown of the competition in detail</p></li>
2 changes: 1 addition & 1 deletion docs/pages/data.html
@@ -305,7 +305,7 @@ <h3>Preselection Cuts<a class="headerlink" href="#preselection-cuts" title="Link
<hr class="docutils" />
<section id="how-to-get-public-data">
<h2>How to get Public Data?<a class="headerlink" href="#how-to-get-public-data" title="Link to this heading"></a></h2>
<p>Download the <a class="reference external" href="https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/">Neurips_Public_data_26_08_2024</a></p>
<p>Download the <a class="reference external" href="https://www.codabench.org/datasets/download/b9e59d0a-4db3-4da4-b1f8-3f609d1835b2/">Neurips_Public_data_26_08_2024 (6.5 GB)</a></p>
<p>or use the following command to download it from the terminal:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">wget</span> <span class="o">-</span><span class="n">O</span> <span class="n">public_data</span><span class="o">.</span><span class="n">zip</span> <span class="n">https</span><span class="p">:</span><span class="o">//</span><span class="n">www</span><span class="o">.</span><span class="n">codabench</span><span class="o">.</span><span class="n">org</span><span class="o">/</span><span class="n">datasets</span><span class="o">/</span><span class="n">download</span><span class="o">/</span><span class="n">b9e59d0a</span><span class="o">-</span><span class="mi">4</span><span class="n">db3</span><span class="o">-</span><span class="mi">4</span><span class="n">da4</span><span class="o">-</span><span class="n">b1f8</span><span class="o">-</span><span class="mi">3</span><span class="n">f609d1835b2</span><span class="o">/</span>
</pre></div>
18 changes: 16 additions & 2 deletions docs/pages/evaluation.html
@@ -10,7 +10,6 @@
<link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=80d5e7a1" />
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css?v=19f00094" />



<link rel="shortcut icon" href="../_static/logo.png"/>
<script src="../_static/jquery.js?v=5d32c60e"></script>
@@ -57,6 +56,8 @@
<li class="toctree-l3"><a class="reference internal" href="#constructing-the-interval">Constructing the Interval</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#submission-requirements">Submission requirements</a></li>
<li class="toctree-l2"><a class="reference internal" href="#hardware-description">Hardware description</a></li>
<li class="toctree-l2"><a class="reference internal" href="#scoring">Scoring</a></li>
</ul>
</li>
@@ -107,6 +108,19 @@ <h3>Constructing the Interval<a class="headerlink" href="#constructing-the-inter
<p><strong>Methods that return a central value and a Gaussian uncertainty</strong>: Define <span class="math notranslate nohighlight">\(\hat \mu_{16} = \hat \mu - \Delta \hat \mu\)</span> and <span class="math notranslate nohighlight">\(\hat \mu_{84} = \hat \mu + \Delta \hat \mu\)</span>. If the Gaussian assumption of a symmetric uncertainty holds, this interval should contain 68.27% of the probability, since the 1 standard deviation region of a Gaussian distribution contains 68.27% of the probability mass.</p>
</section>
</section>
<section id="submission-requirements">
<h2>Submission requirements<a class="headerlink" href="#submission-requirements" title="Link to this heading"></a></h2>
<p>Participants’ submissions must consist of a zip file containing a <code class="docutils literal notranslate"><span class="pre">model.py</span></code> file (which must not be inside a directory); the zip may also contain any other necessary files (e.g., a pre-trained model). The <code class="docutils literal notranslate"><span class="pre">model.py</span></code> file must define a <code class="docutils literal notranslate"><span class="pre">Model</span></code> class which must satisfy the following criteria.</p>
<ol class="arabic simple">
<li><p>The <code class="docutils literal notranslate"><span class="pre">Model</span></code> class must accept two arguments, <code class="docutils literal notranslate"><span class="pre">get_train_set</span></code> and <code class="docutils literal notranslate"><span class="pre">systematics</span></code>, when it is initialized. The <code class="docutils literal notranslate"><span class="pre">get_train_set</span></code> argument will receive a callable which, when called, will return the public dataset. The <code class="docutils literal notranslate"><span class="pre">systematics</span></code> argument will receive a callable which can be used to apply the systematic effects (adjusting weights and primary features, computing derived features, and applying post-selection cuts) to a dataset.</p></li>
<li><p>The <code class="docutils literal notranslate"><span class="pre">Model</span></code> class must have a <code class="docutils literal notranslate"><span class="pre">fit</span></code> method, which will be called once when the submission is being evaluated. This method can be used to prepare the model for inference. We encourage participants to submit models which have already been trained, as there is limited compute time for each submission to be evaluated.</p></li>
<li><p>The <code class="docutils literal notranslate"><span class="pre">Model</span></code> class must have a <code class="docutils literal notranslate"><span class="pre">predict</span></code> method which must accept a test dataset and return the results as a dictionary containing four items: <code class="docutils literal notranslate"><span class="pre">"mu_hat"</span></code>: the predicted value of mu, <code class="docutils literal notranslate"><span class="pre">"delta_mu_hat"</span></code>: the uncertainty in the predicted value of mu, <code class="docutils literal notranslate"><span class="pre">"p16"</span></code>: the 16th percentile of mu (the lower bound of the quoted interval), and <code class="docutils literal notranslate"><span class="pre">"p84"</span></code>: the 84th percentile of mu (the upper bound of the quoted interval).</p></li>
</ol>
</section>
<section id="hardware-description">
<h2>Hardware description<a class="headerlink" href="#hardware-description" title="Link to this heading"></a></h2>
<p>Throughout the competition, participants’ submissions will be run on either the Perlmutter supercomputer at NERSC or an alternative workstation at LBNL, but they will only be run on Perlmutter for the final evaluation. When running on Perlmutter, submissions will be assigned one node, which consists of 1 AMD EPYC 7763 CPU, 256GB of RAM, and 4 NVIDIA A100 GPUs with 40GB of memory each (https://docs.nersc.gov/systems/perlmutter/architecture/#gpu-nodes). The alternative workstation consists of 1 Intel(R) Xeon(R) Gold 6148 CPU, 376GB of RAM, and 3 Tesla V100-SXM2-16GB GPUs. On either system, participants’ submissions will be allotted 2 hours to complete evaluation on all of the pseudoexperiments (10 sets of 100 pseudoexperiments each for the initial phase, and 10 sets of 1000 pseudoexperiments for the final phase). These pseudoexperiments will be parallelized in the following way: each participant’s <code class="docutils literal notranslate"><span class="pre">Model.fit()</span></code> will be run once, and then each pseudoexperiment will be run by one of many parallel workers, with each worker calling <code class="docutils literal notranslate"><span class="pre">Model.predict()</span></code> once. There will be 30 parallel workers when running on Perlmutter, reduced to 10 parallel workers on the alternative workstation.</p>
</section>
<section id="scoring">
<h2>Scoring<a class="headerlink" href="#scoring" title="Link to this heading"></a></h2>
<p>The score consists of two parts, the interval width and the coverage:</p>
@@ -156,4 +170,4 @@ <h2>Scoring<a class="headerlink" href="#scoring" title="Link to this heading">
</script>

</body>
</html>
14 changes: 6 additions & 8 deletions docs/pages/terms.html
@@ -96,14 +96,12 @@ <h2>Disqualification Terms<a class="headerlink" href="#disqualification-terms" t
<section id="participant-affiliation-guidelines">
<h2>Participant Affiliation Guidelines<a class="headerlink" href="#participant-affiliation-guidelines" title="Link to this heading"></a></h2>
<p>To ensure proper representation and verification of your affiliation with an organization in our competition, please adhere to the following guidelines:</p>
<section id="profile-affiliation">
<h3>Profile Affiliation<a class="headerlink" href="#profile-affiliation" title="Link to this heading"></a></h3>
<p>Ensure that your Codabench profile includes your current organizational affiliation. This helps in verifying your credentials and associating your contributions with the correct institution.</p>
</section>
<section id="registration-requirements">
<h3>Registration Requirements<a class="headerlink" href="#registration-requirements" title="Link to this heading"></a></h3>
<p>Use your organization-issued email address to register for the competition. This will help us confirm that you are officially associated with the organization and prevent unauthorized entries.</p>
</section>
<ul class="simple">
<li><p>Ensure that your Codabench profile includes your current organizational affiliation. This helps in verifying your credentials and associating your contributions with the correct institution.</p></li>
<li><p>If participants are freelancers, they should provide brief details (via email to fair-universe&#64;lbl.gov) of their freelance operation, any supporting URL, and their country of residence.</p></li>
<li><p>Use your organization-issued email address to register for the competition. This will help us confirm that you are officially associated with the organization and prevent unauthorized entries.</p></li>
<li><p>Registration is not required to access the dataset that is provided on the “Data” tab. We encourage the use of the dataset for research. If a participant who is not eligible to submit via Codabench under the conditions above can demonstrate a successful method on the challenge task, we encourage them to contact the organizers for attribution and, if possible, potential inclusion in the final competition stage.</p></li>
</ul>
</section>
</section>

