You will be redirected to the main page within 3 seconds. If not redirected, please click here.
diff --git a/.all-contributorsrc b/.all-contributorsrc
deleted file mode 100644
index bc6a9103..00000000
--- a/.all-contributorsrc
+++ /dev/null
@@ -1,45 +0,0 @@
-{
- "files": [
- "README.md"
- ],
- "imageSize": 100,
- "commit": false,
- "contributorsPerLine": 7,
- "projectName": "al-folio",
- "projectOwner": "alshedivat",
- "repoType": "github",
- "repoHost": "https://github.com",
- "badgeTemplate": "[core_contributors]: https://img.shields.io/badge/core_contributors-<%= contributors.length %>-orange.svg 'Number of core contributors'",
- "contributorTemplate": "\">\" width=\"<%= options.imageSize %>px;\" alt=\"\"/>
<%= contributor.name %>",
- "skipCi": true,
- "contributors": [
- {
- "login": "alshedivat",
- "name": "Maruan",
- "avatar_url": "https://avatars.githubusercontent.com/u/2126561?v=4",
- "profile": "http://maruan.alshedivat.com",
- "contributions": [
- "design",
- "code"
- ]
- },
- {
- "login": "rohandebsarkar",
- "name": "Rohan Deb Sarkar",
- "avatar_url": "https://avatars.githubusercontent.com/u/50144004?v=4",
- "profile": "http://rohandebsarkar.github.io",
- "contributions": [
- "code"
- ]
- },
- {
- "login": "pourmand1376",
- "name": "Amir Pourmand",
- "avatar_url": "https://avatars.githubusercontent.com/u/32064808?v=4",
- "profile": "https://amirpourmand.ir",
- "contributions": [
- "code"
- ]
- }
- ]
-}
diff --git a/.gitattributes b/.gitattributes
deleted file mode 100644
index 24244739..00000000
--- a/.gitattributes
+++ /dev/null
@@ -1 +0,0 @@
-_config.yml merge=ours
diff --git a/.github/FUNDING.yml b/.github/FUNDING.yml
deleted file mode 100644
index c78502f4..00000000
--- a/.github/FUNDING.yml
+++ /dev/null
@@ -1,12 +0,0 @@
-# These are supported funding model platforms
-
-github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
-patreon: # Replace with a single Patreon username
-open_collective: # Replace with a single Open Collective username
-ko_fi: alshedivat
-tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
-community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
-liberapay: # Replace with a single Liberapay username
-issuehunt: # Replace with a single IssueHunt username
-otechie: # Replace with a single Otechie username
-custom: # ['https://www.buymeacoffee.com/TkFxuKo']
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
deleted file mode 100644
index 511f5851..00000000
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ /dev/null
@@ -1,38 +0,0 @@
----
-name: Bug report
-about: Create a report to help us improve
-title: ''
-labels: bug
-assignees: ''
-
----
-
-**Acknowledge the following**
-- [ ] I carefully read and followed the [Getting Started](https://github.com/alshedivat/al-folio#getting-started) guide.
-- [ ] I read through [FAQ](https://github.com/alshedivat/al-folio#faq) and searched through the [past issues](https://github.com/alshedivat/al-folio/issues), none of which addressed my issue.
-- [ ] The issue I am raising is a potential bug in al-folio and not just a usage question.
[For usage questions, please post in the [Discussions](https://github.com/alshedivat/al-folio/discussions) instead of raising an issue.]
-
-**Describe the bug**
-A clear and concise description of what the bug is.
-
-**To Reproduce**
-Steps to reproduce the behavior:
-1. Go to '...'
-2. Click on '....'
-3. Scroll down to '....'
-4. See error
-
-**Expected behavior**
-A clear and concise description of what you expected to happen.
-
-**Screenshots**
-If applicable, add screenshots to help explain your problem.
-
-**System (please complete the following information):**
- - OS: [e.g. iOS]
- - Browser (and its version) [e.g. chrome, safari]
- - Jekyll version [e.g. 3.8.7]
-- Ruby version [e.g. 2.6.5]
-
-**Additional context**
-Add any other context about the problem here.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
deleted file mode 100644
index 11fc491e..00000000
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ /dev/null
@@ -1,20 +0,0 @@
----
-name: Feature request
-about: Suggest an idea for this project
-title: ''
-labels: enhancement
-assignees: ''
-
----
-
-**Is your feature request related to a problem? Please describe.**
-A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
-
-**Describe the solution you'd like**
-A clear and concise description of what you want to happen.
-
-**Describe alternatives you've considered**
-A clear and concise description of any alternative solutions or features you've considered.
-
-**Additional context**
-Add any other context or screenshots about the feature request here.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
deleted file mode 100644
index 9ae75b4c..00000000
--- a/.github/pull_request_template.md
+++ /dev/null
@@ -1,23 +0,0 @@
-
-
-## OpenReview Submission Thread
-
-
-
-## Checklist before requesting a review
-
-- [ ] I am opening a pull request against the `accepted` branch of the `staging` repo.
-- [ ] I have de-anonymized my post, added author lists, etc.
-- [ ] My post matches the formatting requirements
- - [ ] I have a short 2-3 sentence abstract in the `description` field of my front-matter ([example](https://github.com/iclr-blogposts/staging/blob/aa15aa3797b572e7b7bb7c8881fd350d5f76fcbd/_posts/2022-12-01-distill-example.md?plain=1#L4-L5))
- - [ ] I have a table of contents, formatted using the `toc` field of my front-matter ([example](https://github.com/iclr-blogposts/staging/blob/aa15aa3797b572e7b7bb7c8881fd350d5f76fcbd/_posts/2022-12-01-distill-example.md?plain=1#L33-L42))
- - [ ] My bibliography is correctly formatted, using a `.bibtex` file as per the sample post
-
-## Changes implemented in response to reviewer feedback
-
-- [ ] Tick this box if you received a conditional accept
-- [ ] I have implemented the necessary changes in response to reviewer feedback (if any)
-
-
-
-## Any other comments
diff --git a/.github/stale.yml b/.github/stale.yml
deleted file mode 100644
index 8ec2004d..00000000
--- a/.github/stale.yml
+++ /dev/null
@@ -1,18 +0,0 @@
-# Number of days of inactivity before an issue becomes stale
-daysUntilStale: 60
-# Number of days of inactivity before a stale issue is closed
-daysUntilClose: 7
-# Issues with these labels will never be considered stale
-exemptLabels:
- - pinned
- - security
- - enhancement
-# Label to use when marking an issue as stale
-staleLabel: wontfix
-# Comment to post when marking an issue as stale. Set to `false` to disable
-markComment: >
- This issue has been automatically marked as stale because it has not had
- recent activity. It will be closed if no further activity occurs. Thank you
- for your contributions.
-# Comment to post when closing a stale issue. Set to `false` to disable
-closeComment: false
diff --git a/.github/workflows/deploy-docker-tag.yml b/.github/workflows/deploy-docker-tag.yml
deleted file mode 100644
index 3e6b6a3a..00000000
--- a/.github/workflows/deploy-docker-tag.yml
+++ /dev/null
@@ -1,40 +0,0 @@
-name: Docker Image CI (Upload Tag)
-
-on:
- push:
- tags:
- - 'v*'
-
-jobs:
-
- build:
-
- runs-on: ubuntu-latest
-
- steps:
- - name: Checkout
- uses: actions/checkout@v2
- - name: Buildx
- uses: docker/setup-buildx-action@v1
-
- -
- name: Docker meta
- id: meta
- uses: docker/metadata-action@v4
- with:
- images: amirpourmand/al-folio
-
- - name: Login
- uses: docker/login-action@v1
- with:
- username: ${{ secrets.DOCKER_USERNAME }}
- password: ${{ secrets.DOCKER_PASSWORD }}
-
- - name: Build and push
- uses: docker/build-push-action@v3
- with:
- context: .
- push: ${{ github.event_name != 'pull_request' }}
- tags: ${{ steps.meta.outputs.tags }}
- labels: ${{ steps.meta.outputs.labels }}
-
diff --git a/.github/workflows/deploy-image.yml b/.github/workflows/deploy-image.yml
deleted file mode 100644
index b747dfc1..00000000
--- a/.github/workflows/deploy-image.yml
+++ /dev/null
@@ -1,31 +0,0 @@
-name: Docker Image CI
-
-on:
- push:
- branches: [ master ]
-
-jobs:
-
- build:
-
- runs-on: ubuntu-latest
- if: github.repository_owner == 'alshedivat'
-
- steps:
- - name: Checkout
- uses: actions/checkout@v2
- - name: Buildx
- uses: docker/setup-buildx-action@v1
-
- - name: Login
- uses: docker/login-action@v1
- with:
- username: ${{ secrets.DOCKER_USERNAME }}
- password: ${{ secrets.DOCKER_PASSWORD }}
-
- - name: Build and push
- uses: docker/build-push-action@v2
- with:
- context: .
- push: true
- tags: amirpourmand/al-folio
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
deleted file mode 100644
index dbe26a88..00000000
--- a/.github/workflows/deploy.yml
+++ /dev/null
@@ -1,43 +0,0 @@
-name: deploy
-
-on:
- push:
- branches:
- - master
- - main
- pull_request:
- branches:
- - master
- - main
- workflow_dispatch: {}
-
-jobs:
- deploy:
- runs-on: ubuntu-latest
- steps:
- - name: Checkout code
- uses: actions/checkout@v3
- - name: Setup Ruby
- uses: ruby/setup-ruby@v1
- with:
- ruby-version: '3.0.2'
- bundler-cache: true
- - name: Install deps
- run: |
- npm install -g mermaid.cli
- - name: Setup deploy options
- id: setup
- run: |
- git config --global user.name "GitHub Action"
- git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
- if [[ ${GITHUB_REF} = refs/pull/*/merge ]]; then # pull request
- echo "SRC_BRANCH=${GITHUB_HEAD_REF}" >> $GITHUB_OUTPUT
- echo "NO_PUSH=--no-push" >> $GITHUB_OUTPUT
- elif [[ ${GITHUB_REF} = refs/heads/* ]]; then # branch, e.g. master, source etc
- echo "SRC_BRANCH=${GITHUB_REF#refs/heads/}" >> $GITHUB_OUTPUT
- fi
- echo "DEPLOY_BRANCH=gh-pages" >> $GITHUB_OUTPUT
- - name: Deploy website
- run: yes | bash bin/deploy --verbose ${{ steps.setup.outputs.NO_PUSH }}
- --src ${{ steps.setup.outputs.SRC_BRANCH }}
- --deploy ${{ steps.setup.outputs.DEPLOY_BRANCH }}
diff --git a/.nojekyll b/.nojekyll
new file mode 100644
index 00000000..e69de29b
diff --git a/404.html b/404.html
index 0da4ee0b..f558c339 100644
--- a/404.html
+++ b/404.html
@@ -1,9 +1 @@
----
-layout: page
-permalink: /404.html
-title: "Page not found"
-description: "Looks like there has been a mistake. Nothing exists here."
-redirect: true
----
-
-
You will be redirected to the main page within 3 seconds. If not redirected, please click here.
+You will be redirected to the main page within 3 seconds. If not redirected, please click here.
{{ content.name }} | -{{ content.value }} | -
{{ item.date | date: "%b %-d, %Y" }} | -
- {% if item.inline -%}
- {{ item.content | remove: ' ' | remove: ' ' | emojify }} - {%- else -%} - {{ item.title }} - {%- endif %} - |
-
---|
No news so far...
- {%- endif %} -{{ project.description }}
-{{ project.description }}
-{{ page.subtitle }}
-an archive of posts in this category
-{{ post.date | date: "%b %-d, %Y" }} | -- {{ post.title }} - | -
---|
an archive of posts with this tag
-{{ post.date | date: "%b %-d, %Y" }} | -- {{ post.title }} - | -
---|
an archive of posts from this year
-{{ post.date | date: "%b %-d, %Y" }} | -- {{ post.title }} - | -
---|
{{ page.description }}
-{{ page.description }}
-
- PLACEHOLDER FOR ACADEMIC ATTRIBUTION
-
-
- BibTeX citation
-
- PLACEHOLDER FOR BIBTEX
-
- -Weight decay is widely used in networks with Batch Normalization (Ioffe & Szegedy, -2015). In principle, weight decay regularization should have no effect in this case, since one -can scale the weights by a small factor without changing the network’s predictions. Hence, it -does not meaningfully constrain the network’s capacity. - -—Zhang et al., 2019 -- -* However, the experiments of the paper show that weight decay on layers with `(BN)` can nevertheless improve accuracy. The authors argue that this is due to an effectively larger learning rate. - -This blog post will summarize the development of weight decay specifically for Adam. -We try to shed some light on the following questions: - -1. What is the difference between Adam and its weight decay version AdamW? Does the existing literature give a clear answer to the question when (and why) AdamW performs better? -2. Is the weight decay mechanism of AdamW just *one more trick* or can we actually motivate it from an optimization perspective? -3. The last section is somewhat explorational: could we come up with different formulas for a weight decay version of Adam? By doing so, we will see that AdamW already combines several advantages for practical use. - - -### Notation - -We denote by $$\alpha > 0$$ the initial learning rate. We use $$\eta_t > 0$$ for a learning rate schedule multiplier. By this, the effective learning rate in iteration $$t$$ is $$\alpha \eta_t$$. We use $$\lambda > 0$$ for the weight decay parameter. - -## Adam - -Adam uses an exponentially moving average (EMA) of stochastic gradients, typically denoted by $$m_t$$, and of the elementwise squared gradients, denoted by $$v_t$$. - -We denote with $$\hat m_t$$ and $$\hat v_t$$ the EMA estimates with bias correction (see
- We provide empirical evidence that our proposed modification decouples the optimal choice of weight decay factor from the setting of the learning rate for both standard SGD and Adam [...]. - —Loshchilov and Hutter, 2019 -- -What the authors mean by *decoupling* is that if we plot the test accuracy as a heatmap of learning rate and weight decay, the areas with high accuracy are more rectangular; the best learing rate is not too sensitive to the choice of weight decay. We illustrate this conceptually in the plot below which is inspired by Figure 2 in
-Improving PDE solvers has trickle down benefits to a vast range of other fields. -- -Partial differential equations (PDEs) play a crucial role in modeling complex systems and understanding how they change over time and in space. - -They are used across physics and engineering, modeling a wide range of physical phenomena like heat transfer, sound waves, electromagnetism, and fluid dynamics, but they can also be used in finance to model the behavior of financial markets, in biology to model the spread of diseases, and in computer vision to model the processing of images. - -They are particularly interesting in deep learning! - -
-Ordinary differential equations (ODEs) describe how a function changes with respect to a single independent variable and its derivatives. In contrast, PDEs are mathematical equations that describe the behavior of a dependent variable as it changes with respect to multiple independent variables and their derivatives. -
--Formally, for one time dimension and possibly multiple spatial dimensions denoted by \(\textbf{x}=[x_{1},x_{2},x_{3},\text{...}]^{\top} \in \mathbb{X}\), a general (temporal) PDE may be written as -
--$$\partial_{t}\textbf{u}= F\left(t, \textbf{x}, \textbf{u},\partial_{\textbf{x}}\textbf{u},\partial_{\textbf{xx}}\textbf{u},\text{...}\right) \qquad (t,\mathbf{x}) \in [0,T] \times \mathbb{X}$$ -
--The \(\partial\) is a partial derivative operator which can be understood as "a small change in". For example, the \(\partial_{t}\textbf{u}\) term refers to how much an infinitesmally small change in \(t\) changes \(\textbf{u}\). Below is an explicit definition for some arbitrary function \(f(x,y)\): - -$$\frac{\partial f(x,y)}{\partial x} = \lim_{h \to 0} \frac{f(x+h,y) - f(x,y)}{h}$$ - -
-- - Many equations are solutions to such PDEs alone. For example, the wave equation is given by \(\partial_{tt}u = \partial_{xx}u\). You will find that any function in the form \(u(x,t)=F(x-ct)+\) \(G(x+ct)\) is a potential solution. Initial conditions are used to specify how a PDE "starts" in time, and boundary conditions determine the value of the solution at the boundaries of the region where the PDE is defined. - -
--\(\partial_{t} \mathbf{u} + \nabla \cdot \mathbf{J}(\mathbf{u}) = 0\) -
- -\(J\) is the flux, or the amount of some quantity that is flowing through a region at a given time
-\(\nabla \cdot J\) is the divergence of the flux, or the amount of outflow of the flux at a given point
--Very few PDEs have analytical solutions, so numerical methods have been developed to approximate PDE solutions over a wider range of potential problems. -- -#### Numerical Methods - -Often, approaches for temporal PDEs follow the method of lines (MOL). - -Every point of the discretization is then thought of as a separate ODE evolving in time, enabling the use of ODE solvers such as Runge-Kutta methods. - -
-In the most basic case (a regular grid), arbitrary spatial and temporal resolutions \(\mathbf{n_{x}}\) and \(n_{t}\) can be chosen and thus used to create a grid where \(\mathbf{n_{x}}\) is a vector containing a resolution for each spatial dimension. -
--The domain may also be irregularly sampled, resulting in a grid-free discretization. This is often the case with real-world data that comes from scattered sensors, for example. -
-Finite difference methods (FDMs) or any other discretization technique can be used to discretize the time domain. -
--One direction of ongoing research seeks to determine discretization methods which can result in more efficient numerical solvers (for example, take larger steps in flatter regions and smaller steps in rapidly changing regions). -
- --A popular choice when using a gridded discretization is the finite difference method (FDM). Spatial derivative operators are replaced by a stencil which indicates how values at a finite set of neighboring grid points are combined to approximate the derivative at a given position. This stencil is based on the Taylor series expansion. -
- --{% include figure.html path="assets/img/2023-05-01-autoregressive-neural-pde-solver/fdm_animation.gif" style="max-width:690px;height:auto;" %} -
- - - -
-The finite volume method (FVM) is another approach which works for irregular geometries. Rather than requiring a grid, the computation domain can be divided into discrete, non-overlapping control volumes used to compute the solution for that portion
-For every control volume, a set of equations describing the balance of some physical quantities (in essence, estimating the flux at control volume boundaries) can be solved which results in the approximated spatial derivative. -
- --While this method only works for conservation form equations, it can handle complex problems with irregular geometries and fluxes that are difficult to handle with other numerical techniques such as the FDM. -
-
-In the pseudospectral method (PSM), PDEs are solved pointwise in physical space by using basis functions to approximate the spatial derivatives
-These methods are well-suited for solving problems with smooth solutions and periodic boundary conditions, but their performance drops for irregular or non-smooth solutions, as well as problems with more degrees of freedom where their global nature results in high dimensional dense matrix computations. -
-User | -Structural | -Implementational | -
---|---|---|
Computation efficiency, computational cost, accuracy, guarantees (or uncertainty estimates), generalization across PDEs | -Spatial and temporal resolution, boundary conditions, domain sampling regularity, dimensionality | -Stability over long rollouts, preservation of invariants | -
-The countless combinations of requirements resulted in what Bartels defines as a splitter field
-These methods, while effective and mathematically proven, often come at high computation costs. Taking into account that PDEs often exhibit chaotic behaviour and are sensitive to any changes in their parameters, re-running a solver every time a coefficient or boundary condition changes in a single PDE can be computationally expensive. -
--One key example which limits grid-based classical solvers is the Courant-Friedrichs-Lewy (CFL) condition, which states that the maximum time step size should be proportional to the minimum spatial grid size. According to this condition, as the number of dimensions increases, the size of the temporal step must decrease and therefore numerical solvers become very slow for complex PDEs. -
-Algorithm | -Equation | -Boundary conditions | -Complexity | -
---|---|---|---|
Classical FDM/FEM/FVM | -general | -general | -poly\(((\frac{1}{\varepsilon})^{d})\) | -
Adaptive FDM/FEM |
- general | -general | -poly\(((\log(\frac{1}{\varepsilon}))^{d})\) | -
Spectral method |
- general | -general | -poly\(((\log(\frac{1}{\varepsilon}))^{d})\) | -
Sparse grid FDM/FEM |
- general | -general | -poly\(((\frac{1}{\varepsilon})(\log(\frac{1}{\varepsilon}))^{d})\) | -
Sparse grid spectral method |
- elliptic | -general | -poly\((\log(\frac{1}{\varepsilon})(\log \log(\frac{1}{\varepsilon}))^{d})\) | -
-Neural solvers offer some very desirable properties that may serve to unify some of this splitter field. Neural networks can learn and generalize to new contexts such as different initial/boundary conditions, coefficients, or even different PDEs entirely
-Though most methods lie along a spectrum from classical leaning to end-to-end neural, a naive yet illustrative categorization into three groupings is shown below. -
--{% include figure.html path="assets/img/2023-05-01-autoregressive-neural-pde-solver/PDEchart.png" style="max-width:690px;height:auto;" %} -
- -#### Fully Neural/Universal Function Approximators - -The term fully neural here refers to methods which rely on the universal function approximation theory such that a sufficiently complex network can represent any arbitrary function. Many common fully neural methods are also known as neural operators which model the solution of a PDE as an operator that maps inputs to outputs. The problem is set such that a neural operator $$\mathcal{M}$$ satisfies $$\mathcal{M}(t,\mathbf{u}^{0}) = \mathbf{u}(t)$$ where $$\mathbf{u}^{0}$$ are the initial conditions- These global integral operators (implemented as Fourier space convolutional operators) are combined with local nonlinear activation functions, resulting in an architecture which is highly expressive yet computationally efficient, as well as being resolution-invariant. -
-
- While the vanilla FNO required the input function to be defined on a grid due to its reliance on the FFT, further work developed mesh-independent variations as well
- - Convolution Theorem - -
-- The Fourier transform of the convolution of two signals is equal to the pointwise product of their individual Fourier transforms -
--{% include figure.html path="assets/img/2023-05-01-autoregressive-neural-pde-solver/FNO.png" style="max-width:80%;height:auto;" %} -
- - --Neural operators are able to operate on multiple domains and can be completely data-driven. -
--However, these models do not tend to predict out-of-distribution \(t\) and are therefore limited when dealing with temporal PDEs. Another major barrier is their relative lack of interpretability and guarantees compared to classical solvers. -
- -#### Neural-Augmented Classical Methods - -A parallel line of research involves using deep learning as a tool to improve classical numerical methods for solving PDEs. One avenue involves modifying existing iterative methods: while neural operator methods directly mapped inputs to outputs, autoregressive methods take an iterative approach instead. For example, iterating over time results in a problem such as $$\mathbf{u}(t+\Delta t) = \mathcal{A}(\Delta t, \mathbf{u}(t))$$ where $$\mathcal{A}$$ is some temporal update- Other autoregressive models include PixelCNN for images, WaveNet for audio, and the Transformer for text. -
--The loss function is used to evaluate the difference between the temporal update and the expected next state, and the overall one-step loss is calculated as the expected value of this loss over all time-steps and all possible next states. -
--\(L_{\text{stability}} = \mathbb{E}_{k}\mathbb{E}_{\mathbf{u^{k+1}|\mathbf{u^{k},\mathbf{u^{k} \sim p_{k}}}}}[\mathbb{E}_{\epsilon | \mathbf{u}^{k}} [\mathcal{L}(\mathcal{A}(\mathbf{u}^{k}+\) \(\epsilon\) \()),\mathbf{u}^{k+1}]]\) -
- --\(L_{\text{total}} = L_{\text{one-step}} + L_{\text{stability}}\) -
- --The stability loss is largely based off the one-step loss, but now assumes that the temporal update uses noisy data. -
- --The pushforward trick lies in the choice of \(\epsilon\) such that \(\mathbf{u}^{k}+\epsilon = \mathcal{A}(\mathbf{u}^{k-1})\), similar to the test time distribution. Practically, it is implemented to be noise from the network itself so that as the network improves, the loss decreases. -
- -
-Necessarily, the noise of the network must be known or calculated to implement this loss term. So, the model is unrolled for 2 steps but only backpropagated over the most recent unroll step, which already has the neural network noise
-While the network could be unrolled during training, this not only slows the training down but also might result in the network learning shortcuts across unrolled steps. -
- -**Temporal bundling** - -Classical Numerical Method | -MP-PDE Network Component | -
---|---|
Partitioning the problem onto a grid | -Encoder Encodes a vector of solutions into node embeddings |
-
Estimating the spatial derivatives | -Processor Estimates spatial derivatives via message passing |
-
Time updates | -Decoder Combines some representation of spatial derivatives smoothed into a time update |
-
-The encoder is implemented as a two-layer MLP which computes an embedding for each node \(i\) to cast the data to a non-regular integration grid: -
--The node embeddings from the encoder are then used in a message passing GNN. The message passing algorithm, which approximates spatial derivatives, is run \(M\) steps using the following updates: -
- -
-Bar-Sinai et al. explores the relationship between FDM and FVM as used in the method of lines
-\(\partial^{(n)}_{x}u \approx \sum_{i} a^{(n)}_{i} u_{i}\) -
- --for some precomputed coefficients \(a^{(n)}_{i}\). The right hand side parallels the message passing scheme, which aggregates the local difference (\(\mathbf{u}_{i}^{k-K:k}-\mathbf{u}_{j}^{k-K:k}\) in the edge update) and other (learned) embeddings over neighborhoods of nodes. -
- -
-This relationship gives an intuitive understanding of the message passing GNN, which mimics FDM for a single layer, FVM for two layers, and WENO5 for three layers
-While the interpretation is desirable, how far this holds in the actual function of the MP-GNN is harder to address. The concepts of the nodes as integration points and messages as local differences break down as the nodes and edges update. In addition, the furthest node that contributes a message from for any point is at \(n\) edges away for the \(n^{th}\) layer (or a specified limit). This results in a very coarse and potentially underinformed approximation for the first layer which is then propagated to the next layers. However, both the updates use two layer MLPs which (although abstracting away from their respective interpretations) may in effect learn optimal weightings to counterbalance this. -
--The approximated spatial derivatives are then combined and smoothed using a 1D CNN which outputs a bundle of next time steps (recall temporal bundling) \(\mathbf{d}_{i}\). The solution is then updated: -
- --\(\mathbf{u}^{k+l}_{i} = u^{k}_{i} + (t_{k+l}-t_{k})\mathbf{d}^{l}_{i}\) -
- -
-Some precedence is seen, for example, in classical linear multistep methods which (though effective) face stability concerns. Since the CNN is adaptive, it appears that it avoids this issue
-Accumulated error: \(\frac{1}{n_{x}} \sum_{x,t} MSE\) -
--Runtime (s): Measured time taken to run for a given number of steps. -
- --As a general neural PDE solver, the MP-GNN surpasses even the current state-of-the-art FNO. -- -For example, after training a neural model and setting up an instance of MOL, this is a brief comparison of how they can generalize without re-training. - -
Generalization to... | -MP-GNN | -FNO | -Classical (MOL) | -
---|---|---|---|
New PDEs | -Yes | -No | -No | -
Different resolutions | -Yes | -Yes | -No (unless downsampling) | -
Changes in PDE parameters | -Yes | -Yes | -Sometimes | -
Non-regular grids | -Yes | -Some | -Yes (dependent on implementation) | -
Higher dimensions | -Yes | -No | -No | -
- | Accumulated Error | -Runtime [s] | -|||||
---|---|---|---|---|---|---|---|
- \(\quad (n_{t},n_{x})\) - | -WENO5 | -FNO-RNN | -FNO-PF | -MP-PDE | -WENO5 | -MP-PDE | -|
E1 | -(250,100) | -2.02 | -11.93 | -0.54 | -1.55 | -1.9 | -0.09 | -
E1 | -(250, 50) | -6.23 | -29.98 | -0.51 | -1.67 | -1.8 | -0.08 | -
E1 | -(250, 40) | -9.63 | -10.44 | -0.57 | -1.47 | -1.7 | -0.08 | -
E2 | -(250, 100) | -1.19 | -17.09 | -2.53 | -1.58 | -1.9 | -0.09 | -
E2 | -(250, 50) | -5.35 | -3.57 | -2.27 | -1.63 | -1.8 | -0.09 | -
E2 | -(250, 40) | -8.05 | -3.26 | -2.38 | -1.45 | -1.7 | -0.08 | -
E3 | -(250, 100) | -4.71 | -10.16 | -5.69 | -4.26 | -4.8 | -0.09 | -
E3 | -(250, 50) | -11.71 | -14.49 | -5.39 | -3.74 | -4.5 | -0.09 | -
E3 | -(250, 40) | -15.97 | -20.90 | -5.98 | -3.70 | -4.4 | -0.09 | -
Shorthand | -Meaning | -
---|---|
E1 | -Burgers' equation without diffusion | -
E2 | -Burgers' equation with variable diffusion | -
E3 | -Mixed equation, see below | -
\(n_{t}\) | -Temporal resolution | -
\(n_{x}\) | -Spatial resolution | -
WENO5 | -Weighted Essentially Non-Oscillatory (5th order) | -
FNO-RNN | -Recurrent variation of FNO from original paper | -
FNO-PF | -FNO with the pushforward trick added | -
MP-PDE | -Message passing neural PDE solver | -
-The authors form a general PDE in the form -
- --\([\partial_{t}u + \partial_{x}(\alpha u^{2} - \beta \partial_{x} u + \gamma \partial_{xx} u)](t,x) = \delta (t,x)\) -
- --\(u(0,x) = \delta(0,x)\) -
- --such that \(\theta_{PDE} = (\alpha, \beta, \gamma)\) and different combinations of these result in the heat equation, Burgers' equation, and the KdV equation. \(\delta\) is a forcing term, allowing for greater variation in the equations being tested. -
- --The pushforward trick is successful in mitigating error accumulation. --Comparing the accumulated errors of FNO-RNN and the FNO-PF across all experiments highlights the advantage of the pushforward trick. While the MP-PDE outperforms all other tested methods in the two generalization experiments **E2** and **E3**, the FNO-PF is most accurate for **E1**. - -When solving a single equation, the FNO likely performs better, though both FNO-PF and MP-PDE methods outperform WENO5. - -
-Neural solvers are resolution-invariant. --As $$n_{x}$$ is decreased, WENO5 performs increasingly worse whereas all the neural solvers remain relatively stable. -
-Neural solver runtimes are constant to resolution. --Additionally, the runtimes of WENO5 decrease (likely proportionally) since fewer steps require fewer calculations, but the MP-PDE runtimes again appear relatively stable. - -### Comparing Interpretations - -The way the MP-PDE is constructed parallels how both GRAND and the PDE-GCN are built. All three architectures follow a basic premise of mirroring the MOL and describe certain mechanisms in their respective systems which mimic spatial discretisations and temporal discretisations. - -The spatial derivative is discretized by a GNN in the MP-PDE and by the message passing algorithm (consisting of node and edge updates within one layer of a GNN) in the GRAND and PDE-GCN. In the MP-PDE, the spatial derivatives are in effect parameterized by the node and edge updates (the former which Brandstetter et al. highlight takes the difference in solutions $$u_{i}=u_{j}$$) detailed above, both of which are generic MLPs. In comparison, both GRAND and PDE-GCN (using the diffusion variant) come to comparable formulas when discretising using the forward Euler method. - -The GRAND paper derives the following, where $$\tau$$ is a temporal step, $$\mathbf{x}$$ is the diffusion equation, and $$\mathbf{A}$$ is the attention matrix
-Fig.1 Example of MAML and a class label permutation
-Fig.2 Overview of UnicornMAML
-Fig.3 MAML with the zeroing trick applied
- Gradient Descent with Momentum
- Input: starting guess \(\xx_0\), step-size \(\step > 0\) and momentum
- parameter \(\mom \in (0, 1)\).
- \(\xx_1 = \xx_0 - \dfrac{\step}{\mom+1} \nabla f(\xx_0)\)
- For \(t=1, 2, \ldots\) compute
- \begin{equation}\label{eq:momentum_update}
- \xx_{t+1} = \xx_t + \mom(\xx_{t} - \xx_{t-1}) - \step\nabla
- f(\xx_t)
- \end{equation}
-
-Consider the following polynomial \(P_t\) of degree \(t\), defined recursively as: -\begin{equation} -\begin{split} -&P_{t+1}(\lambda) = (1 + \mom - \step \lambda ) P_{t}(\lambda) - \mom P_{t-1}(\lambda)\\ -&P_1(\lambda) = 1 - \frac{\step}{1 + \mom} \lambda\,, ~ P_0(\lambda) = 1\,,~ -\end{split}\label{eq:def_residual_polynomial2} -\end{equation} -Then we can write the suboptimality at iteration \(t\) as -\begin{equation} -\xx_t - \xx^\star = P_t(\HH) \left( \xx_0 - \xx^\star \right) \,, -\end{equation} -where \(P_t(\HH)\) is the matrix obtained from evaluating the (originally real-valued) polynomial \(P_t\) at the matrix \(\HH\). -
- - -This last identity will allow us to easily compute convergence rates. In particular, plugging it into the definition of the convergence rate \eqref{eq:convergence_rate} we get that the rate is determined by the absolute value of the residual polynomial over the $$[\mu, L]$$ interval: -\begin{align} -r_t &= \sup_{\xx_0, \text{eigs}(\HH) \in [\mu, L]} \frac{\\|P_t(\HH) \left( \xx_0 - \xx^\star \right)\\|}{\\|\xx_{0} - \xx^\star\\|} \\\ -& = \sup_{\text{eigs}(\HH) \in [\mu, L]} \\|P_t(\HH)\\| \\\ -& = \sup_{\lambda \in [\mu, L]} \lvert P_t(\lambda) \rvert\,. -\end{align} -We've now reduced the problem of computing the convergence rate to the problem of computing the absolute value of a polynomial over a given interval. This is a problem that has been extensively studied in the theory of orthogonal polynomials. In particular, we'll use known bounds on Chebyshev polynomials of the first and second kind, as the residual polynomial of momentum can be written as a convex combination of these two polynomials. This fact is proven in the next result, which is a generalization of equation (II.29) in (Rutishauser 1959).-The residual polynomial of momentum can be written in terms of Chebyshev polynomials of the first and second kind as -\begin{align} -P_t(\lambda) = \mom^{t/2} \left( {\small\frac{2\mom}{1+\mom}}\, T_t(\sigma(\lambda)) + {\small\frac{1 - \mom}{1 + \mom}}\,U_t(\sigma(\lambda))\right)\,. -\end{align} -where \(\sigma(\lambda) = {\small\dfrac{1}{2\sqrt{\mom}}}(1 + \mom - \step\,\lambda)\,\) is a linear function that we'll refer to as the link function and \(T_t\) and \(U_t\) are the Chebyshev polynomials of the first and second kind respectively. -
- -- Let's denote by \(\widetilde{P}_t\) the right hand side of the above equation, that is, - \begin{equation} - \widetilde{P}_{t}(\lambda) \defas \mom^{t/2} \left( {\small\frac{2\mom}{1 + \mom}}\, T_t(\sigma(\lambda)) + {\small\frac{1 - \mom}{1 + \mom}}\, U_t(\sigma(\lambda))\right)\,. - \end{equation} - Our goal is to show that \(P_t = \widetilde{P}_t\) for all \(t\). -
-- For \(t=1\), \(T_1(\lambda) = \lambda\) and \(U_1(\lambda) = 2\lambda\), so we have - \begin{align} - \widetilde{P}_1(\lambda) &= \sqrt{\mom} \left(\tfrac{2\mom}{1 + \mom} \sigma(\lambda) + \tfrac{1 - \mom}{1 + \mom} 2\sigma(\lambda)\right)\\ - &= \frac{2 \sqrt{\mom}}{1 + \mom} \sigma(\lambda) = 1 - \frac{\step}{1 + \mom} \lambda\,, - \end{align} - which corresponds to the definition of \(P_1\) in \eqref{eq:def_residual_polynomial2}. -
-- Assume it's true for any iteration up to \(t\), we will show it's true for \(t+1\). Using the three-term recurrence of Chebyshev polynomials we have - \begin{align} - &\widetilde{P}_{t+1}(\lambda) = \mom^{(t+1)/2} \left( {\small\frac{2 \mom}{1 + \mom}}\, - T_{t+1}(\sigma(\lambda)) - + {\small\frac{1 - \mom}{1 + \mom}}\, U_{t+1}(\sigma(\lambda))\right) \\ - &= \mom^{(t+1)/2} \Big( {\small\frac{2 - \mom}{1 + \mom}}\, - (2 \sigma(\lambda) T_{t}(\sigma(\lambda)) - T_{t-1}(\sigma(\lambda))) \nonumber\\ - &\qquad\qquad - + {\small\frac{1 - \mom}{1 + \mom}}\, (2 \sigma(\lambda) - U_{t}(\sigma(\lambda)) - U_{t-1}(\sigma(\lambda)))\Big)\\ - &= 2 \sigma(\lambda) \sqrt{\mom} P_t(\lambda) - \mom P_{t-1}(\lambda)\\ - &= (1 + \mom - \step \lambda) P_t(\lambda) - - \mom P_{t-1}(\lambda) - \end{align} - where the third identity follows from grouping polynomials of same degree and the - induction hypothesis. The last expression is the recursive definition of \(P_{t+1}\) in - \eqref{eq:def_residual_polynomial2}, which proves the desired \(\widetilde{P}_{t+1} = - {P}_{t+1}\). -
- - -- The asymptotic rate in the robust region is \(r_{\infty} = \sqrt{\mom}\). -
- -This is nothing short of magical. It would seem natural –and this will be the case in other regions– that the speed of convergence should depend on both the step-size and the momentum parameter. Yet, this result implies that it's not the case in the robust region. In this region, the convergence only depends on the momentum parameter $\mom$. Amazing.- In the lazy region the asymptotic rate is \(r_{\infty} = \sqrt{\mom}\left(|\sigma(\lmin)| + \sqrt{\sigma(\lmin)^2 - 1} \right)\). -
- -Unlike in the robust region, this rate depends on both the step-size and the momentum parameter, which enters in the rate through the link function $$\sigma$$. This can be observed in the color plot of the asymptotic rate - - - -{% include figure.html path="assets/img/2023-05-01-hitchhikers-momentum/rate_lazy_region.png" class="img-fluid" %} - - -## Knife's Edge - - -The robust and lazy region occupy most (but not all!) of the region for which momentum converges. There's a small region that sits between the lazy and robust regions and the region where momentum diverges. We call this region the Knife's edge - -For parameters not in the robust or lazy region, we have that $$|\sigma(L)| > 1$$ and $$|\sigma(L)| > |\sigma(\lmin)|$$. Using the asymptotics of Chebyshev polynomials as we did in the previous section, we have that the asymptotic rate is $$\sqrt{\mom}\left(|\sigma(L)| + \sqrt{\sigma(L)^2 - 1} \right)$$. The method will only converge when this asymptotic rate is below 1. Enforcing this results in $$\step \lt 2 (1 + \mom) / L$$. Combining this condition with the one of not being in the robust or lazy region gives the characterization: -\begin{equation} -\step \lt \frac{2 (1 + \mom)}{L} \quad \text{ and } \quad \step \geq \max\Big\\{\tfrac{2(1 + \mom)}{L + \lmin}, \tfrac{(1 + \sqrt{\mom})^2}{L}\Big\\}\,. -\end{equation} - - -{% include figure.html path="assets/img/2023-05-01-hitchhikers-momentum/sketch_knife_edge.png" class="img-fluid" %} - - -### Asymptotic rate - -The asymptotic rate can be computed using the same technique as in the lazy region. The resulting rate is the same as in that region but with $$\sigma(L)$$ replacing $$\sigma(\lmin)$$: - - -- In the Knife's edge region the asymptotic rate is \(\sqrt{\mom}\left(|\sigma(L)| + \sqrt{\sigma(L)^2 - 1} \right)\). -
- -Pictorially, this corresponds to - -{% include figure.html path="assets/img/2023-05-01-hitchhikers-momentum/rate_knife_edge.png" class="img-fluid" %} - - -## Putting it All Together - -This is the end of our journey. We've visited all the regions on which momentum converges.The asymptotic rate \(\limsup_{t \to \infty} \sqrt[t]{r_t}\) of momentum is -\begin{alignat}{2} - &\sqrt{\mom} &&\text{ if }\step \in \big[\frac{(1 - \sqrt{\mom})^2}{\lmin}, \frac{(1+\sqrt{\mom})^2}{L}\big]\\ -&\sqrt{\mom}(|\sigma(\lmin)| + \sqrt{\sigma(\lmin)^2 - 1}) &&\text{ if } \step \in \big[0, \min\{\tfrac{2(1 + \mom)}{L + \lmin}, \tfrac{(1 - \sqrt{\mom})^2}{\lmin}\}\big]\\ -&\sqrt{\mom}(|\sigma(L)| + \sqrt{\sigma(L)^2 - 1})&&\text{ if } \step \in \big[\max\big\{\tfrac{2(1 + \mom)}{L + \lmin}, \tfrac{(1 + \sqrt{\mom})^2}{L}\big\}, \tfrac{2 (1 + \mom) }{L} \big)\\ -&\geq 1 \text{ (divergence)} && \text{ otherwise.} -\end{alignat} -
- -Plotting the asymptotic rates for all regions we can see that Polyak momentum (the method with momentum $\mom = \left(\frac{\sqrt{L} - \sqrt{\lmin}}{\sqrt{L} + \sqrt{\lmin}}\right)^2$ and step-size $\step = \left(\frac{2}{\sqrt{L} + \sqrt{\lmin}}\right)^2$ which is asymptotically optimal among the momentum methods with constant coefficients) is at the intersection of the three regions. - - - -{% include figure.html path="assets/img/2023-05-01-hitchhikers-momentum/rate_convergence_momentum.png" class="img-fluid" %} - - - -## Reproducibility - -All plots in this post were generated using the following Jupyer notebook: [[HTML]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.html' | relative_url}}) [[IPYNB]]({{'assets/html/2023-05-01-hitchhikers-momentum/hitchhikers-momentum.ipynb' | relative_url}}) diff --git a/_posts/2023-05-01-how-does-the-inductive-bias-influence-the-generalization-capability-of-neural-networks.md b/_posts/2023-05-01-how-does-the-inductive-bias-influence-the-generalization-capability-of-neural-networks.md deleted file mode 100644 index ec56e41e..00000000 --- a/_posts/2023-05-01-how-does-the-inductive-bias-influence-the-generalization-capability-of-neural-networks.md +++ /dev/null @@ -1,166 +0,0 @@ ---- -layout: distill -title: How does the inductive bias influence the generalization capability of neural networks? -description: [The blog post discusses how memorization and generalization are affected by extreme overparameterization. Therefore, it explains the overfitting puzzle in machine learning and how the inductive bias can help to understand the generalization capability of neural networks.] -date: 2023-05-01 -htmlwidgets: true - -# anonymize when submitting -# authors: -# - name: Anonymous - -# do not fill this in until your post is accepted and you're publishing your camera-ready post! 
-authors: - - name: Charlotte Barth - url: "https://www.linkedin.com/in/charlotte-barth-a58b0a152/?originalSubdomain=de" - affiliations: - name: TU Berlin - - name: Thomas Goerttler - url: "https://scholar.google.de/citations?user=ppQIwpIAAAAJ&hl=de" - affiliations: - name: TU Berlin - - name: Klaus Obermayer - url: "https://www.tu.berlin/ni/" - affiliations: - name: TU Berlin - -# must be the exact same name as your blogpost -bibliography: 2023-05-01-how-does-the-inductive-bias-influence-the-generalization-capability-of-neural-networks.bib - -# Add a table of contents to your post. -# - make sure that TOC names match the actual section names -# for hyperlinks within the post to work correctly. -toc: - - name: Overfitting Puzzle - - name: Experiments - subsections: - - name: Fully connected networks (FCN) - - name: Convolutional neural networks (CNN) - - name: General findings - - name: Conclusion ---- - -Deep neural networks are a commonly used machine learning technique that has proven to be effective for many different use cases. However, their ability to generalize from training data is not well understood. In this blog post, we will explore the paper "Identity Crisis: Memorization and Generalization under Extreme Overparameterization" by Zhang et al. [2020]- In many real-world classification datasets, the number of examples for each class varies. Class-imbalanced classification refers to classification on datasets where the frequencies of class labels vary significantly. -
-
- It is generally more difficult for a neural network to learn to classify classes with fewer examples.
- Transformations are alterations of data. In the context of image classification, nuisance transformations are alterations that do not affect the class labels of the data. A model is said to be invariant to a nuisance transformation if it can successfully ignore the transformation when predicting a class label. -
- We can formally define a nuisance transformation -- $$T(\cdot |x)$$ -
-- as a distribution over transformation functions. An example of a nuisance transformation might be a distribution over rotation matrices of different angles, or lighting transformations with different exposure values. By definition, nuisance transformations have no impact on class labels $y$, only on data $x$. A perfectly transformation-invariant classifier would thus completely ignore them, i.e., -
-- $$ - \hat{P}_w(y = j|x) = \hat{P}_w(y = j|x'), \; x' \sim T(\cdot |x). - $$ -
-
- (see Zhou et al.
-$$ -\underset{\omega}{\mathrm{min}} \; \mathbb{E}_{\mathcal{T} \sim p(\mathcal{T})} \; \mathcal{L}(\mathcal{D}, \omega), -$$ -
- -where $ \omega $ is parameters trained exclusively on the meta-level, i.e., the *meta-knowledge* learnable from the task distribution-$$ -\bbox[5pt, border: 2px solid blue]{ -\begin{align*} - \omega^{*} = \underset{\omega}{\mathrm{argmin}} \sum_{i=1}^{M} \mathcal{L}^{meta}(\theta^{* \; (i)}(\omega), D^{val}_i), -\end{align*} -} -$$ -
- - -where $M$ describes the number of tasks in a batch, $\mathcal{L}^{meta}$ is the meta-loss function, and $ D^{val}_i $ is the validation set of the task $ i $. $\omega$ represents the parameters exclusively updated in the outer loop. $ \theta^{* \; (i)} $ represents an inner loop learning a task that we can formally express as a sub-objective constraining the primary objective - --$$ -\bbox[5pt, border: 2px solid red]{ -\begin{align*} - s.t. \; \theta^{* \; (i)} = \underset{\theta}{\mathrm{argmin}} \; \mathcal{L^{task}}(\theta, \omega, D^{tr}_i), -\end{align*} -} -$$ -
- -where $ \theta $ are the model parameters updated in the inner loop, $ \mathcal{L}^{task} $ is the loss function by which they are updated and $ D^{tr}_i $ is the training set of the task $ i $-$$ -\begin{align*} - &\mathcal{L}^{x_{2}}_{GAN}(\theta_d, \theta_c, \theta_s) - \\\\ - =& \;\mathbb{E}_{c_{1} \sim p(c_{1}), s_{2} \sim p(s_{2})} \left[ \log (1 -D_ {2} (G_{2} (c_{1}, s_{2}, \theta_c, \theta_s), \theta_d)) \right] - \\ - +& \;\mathbb{E}_{x_{2} \sim p(x_{2})} \left[ \log(D_{2} (x_{2}, \theta_d)) \right], -\end{align*} -$$ -
- -where the $ \theta_d $ represents the parameters of the discriminator network, $p(x_2)$ is the data of the second domain, $ c_1 $ is the content embedding of an image from the first domain to be translated. $ s_2 $ is a random style code of the second domain. $ D_2 $ is the discriminator of the second domain, and $ G_2 $ is its generator. MUNIT's full objective function is: - --$$ -\begin{align*} - \underset{\theta_c, \theta_s}{\mathrm{argmin}} \; \underset{\theta_d}{\mathrm{argmax}}& \;\mathbb{E}_{c_{1} \sim p(c_{1}), s_{2} \sim p(s_{2})} \left[ \log (1 -D_ {2} (G_{2} (c_{1}, s_{2}, \theta_c, \theta_s), \theta_d)) \right] - \\ +& \; \mathbb{E}_{x_{2} \sim p(x_{2})} \left[ \log(D_{2} (x_{2}, \theta_d)) \right], + \; \mathcal{L}^{x_{1}}_{GAN}(\theta_d, \theta_c, \theta_s) - \\ +& \;\mathcal{L}_{recon}(\theta_c, \theta_s) -\end{align*} -$$ -
- -(compare-$$ -\bbox[5px, border: 2px solid blue]{ -\begin{align*} - \omega^{*} - & = \{ \theta_c^*, \theta_s^* \} - \\\\ - & = - \underset{\theta_c, \theta_s}{\mathrm{argmin}} \; \mathbb{E}_{c_{1} \sim p(c_{1}), s_{2} \sim p(s_{2})} \left[ \log (1 -D_ {2} (G_{2} (c_{1}, s_{2}, \theta_c, \theta_s), \theta_d^{*})) \right] - \\ - & + \mathcal{L}_{recon}(\theta_c, \theta_s), -\end{align*} -} -$$ -
- -We then add a single constraint, a subsidiary maximization problem for the discriminator function: - --$$ -\bbox[5px, border: 2px solid red]{ -\begin{align*} - &s.t. \;\theta_d^{*} - \\\\ - & = - \underset{\theta_d}{\mathrm{argmax}} \; \mathbb{E}_{c_{1} \sim p(c_{1}), s_{2} \sim p(s_{2})} \left[ \log (1 -D_ {2} (G_{2} (c_{1}, s_{2}, \theta_c, \theta_s), \theta_d)) \right] - \\ - & + \mathbb{E}_{x_{2} \sim p(x_{2})} \left[ \log(D_{2} (x_{2}, \theta_d)) \right] -\end{align*} -} -$$ -
- -Interestingly, this bi-level view does not only resemble a meta-learning procedure as expressed above, but the bi-level optimization also facilitates a similar effect. Maximizing the discriminator's performance in the constraint punishes style information encoded as content information. If style information is encoded as content information, the discriminator detects artifacts of the original domain in the translated image. Similarly, a meta-learner prevents *meta-overfitting* via an outer optimization loop. - -*However, MUNIT, while representable as a bi-level optimization problem does not "essentially boil down to nesting two search problems".atoi
and then adds 2.
-
-```python
-def atoi(seq=tokens):
- return seq.map(lambda x: ord(x) - ord('0'))
-
-op = (atoi(where(tokens == "-", "0", tokens)) + 2)
-op.input("02-13")
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_37_0.svg)
-
-
-
-
-From here on, unless we use a different input sequence, we will assume that the input is ‘hello’ and omit the input display in the illustrations.
-
-
-### Attention Selectors
-
-Things get more interesting when we start to apply attention. This allows routing of information between the different elements of the sequence.
-
-
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_39_0.svg)
-
-
-
-
-We begin by defining notation for the keys and queries of the model. Keys and queries are effectively transforms that we will broadcast and compare to each other to create *selectors*, our parallel to attention patterns. We create them directly from transforms. For example, if we want to define a key, we call `key` on a transform.
-
-```python
-key(tokens)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_41_0.svg)
-
-
-
-
-Similarly for `query`. (Queries are presented as columns to reflect their relation to the selectors we will create from them.)
-
-```python
-query(tokens)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_43_0.svg)
-
-
-
-
-Scalars can be used as keys or queries. They broadcast out to the length of the underlying sequence.
-
-```python
-query(1)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_45_0.svg)
-
-
-
-
-By applying a comparison operation between a key and a query we create a *selector*, our parallel to an attention matrix - though this one is unweighted.
-
-A selector is a binary matrix indicating which input position (column) each output position (row) will attend to in an eventual attention computation. In the comparison creating it, the key values describe the input (column) positions, and the query values describe the output (row) positions.
-
-```python
-eq = (key(tokens) == query(tokens))
-eq
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_47_0.svg)
-
-
-
-
-Some examples:
-
-* A selector that matches each output position to the previous input position.
-
-```python
-offset = (key(indices) == query(indices - 1))
-offset
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_49_0.svg)
-
-
-
-
-* A selector that matches each output position to all earlier input positions.
-
-```python
-before = key(indices) < query(indices)
-before
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_51_0.svg)
-
-
-
-
-* A selector that matches each output position to all later input positions.
-
-```python
-after = key(indices) > query(indices)
-after
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_53_0.svg)
-
-
-
-
-Selectors can be merged using boolean operations. For example, this selector focuses each output position on 1) earlier positions that 2) contain the same original input token as its own. We show this by including both pairs of keys and queries in the matrix.
-
-```python
-before & eq
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_55_0.svg)
-
-
-
-
-### Using Attention
-
-Given an attention selector we can provide a value sequence to aggregate. We represent aggregation by **summing** up over the values that have a true value for their selector.
-
-(Note: in the original paper, they use a **mean** aggregation and show a clever construction where mean aggregation is able to represent a sum calculation. RASPy uses sum by default for simplicity and to avoid fractions. In practice this means that RASPy may underestimate the number of layers needed to convert to a mean based model by a factor of 2.)
-
-Attention aggregation gives us the ability to compute functions like histograms.
-
-```python
-(key(tokens) == query(tokens)).value(1)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_59_0.svg)
-
-
-
-
-Visually we follow the architecture diagram. Queries are to the left, Keys at the top, Values at the bottom, and the Output is to the right.
-
-
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_61_0.svg)
-
-
-
-
-Some attention operations may not even use the input tokens. For instance to compute the `length` of a sequence, we create a “select all” attention selector and then sum a value of 1 from each position.
-
-```python
-length = (key(1) == query(1)).value(1)
-length = length.name("length")
-length
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_63_0.svg)
-
-
-
-
-Here's a more complex example, shown step-by-step. (This is the kind of thing they ask in interviews!)
-
-Say we want to compute the sum of neighboring values in a sequence, along a sliding window. First we apply the forward cutoff, attending only to positions that are not too far in the past.
-
-```python
-WINDOW=3
-s1 = (key(indices) >= query(indices - WINDOW + 1))
-s1
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_65_0.svg)
-
-
-
-
-Then the backward cutoff, attending only to positions up to and including our own.
-
-```python
-s2 = (key(indices) <= query(indices))
-s2
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_67_0.svg)
-
-
-
-
-Intersect.
-
-```python
-sel = s1 & s2
-sel
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_69_0.svg)
-
-
-
-
-And finally aggregate.
-
-```python
-sum2 = sel.value(tokens)
-sum2.input([1,3,2,2,2])
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_71_0.svg)
-
-
-
-
-Here is a simple example that produces a 2-layer transform. The first corresponds to computing length and the second the cumulative sum. The cumulative sum has to go into a second layer because it is applied to a transform which uses length, and so it can only be computed after the computation of length is complete.
-
-```python
-def cumsum(seq=tokens):
- x = (before | (key(indices) == query(indices))).value(seq)
- return x.name("cumsum")
-cumsum().input([3, 1, -2, 3, 1])
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_73_0.svg)
-
-
-
-
-### Layers
-
-The language supports building up more complex transforms. It keeps track of the *layers* by tracking the operations computed so far.
-
-
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_76_0.svg)
-
-
-
-
-Here is a simple example that produces a 2-layer transform. The first corresponds to computing length and the second the cumulative sum.
-
-```python
-x = cumsum(length - indices)
-x.input([3, 2, 3, 5])
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_78_0.svg)
-
-
-
-
-## Coding with Transformers
-
-Given this library of functions, we can write operations to accomplish surprisingly complex tasks.
-
-**Can we produce a Transformer that does basic addition of two arbitrary length numbers?**
-
-i.e. given a string "19492+23919" can we produce the correct output?
-
-We will go through these steps, and their solutions, here. If you would rather do them on your own, we provide a version where you can try them yourself!
-
-Before we dive in to the main task, we will do some challenges of increasing difficulty to help us build some intuitions.
-
-
-### Challenge 1: Select a given index
-
-Produce a sequence where all the elements have the value at index i.
-
-```python
-def index(i, seq=tokens):
- x = (key(indices) == query(i)).value(seq)
- return x.name("index")
-index(1)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_83_0.svg)
-
-
-
-
-### Challenge 2: Shift
-
-Shift all of the tokens in a sequence to the right by i positions. (Here we introduce an optional parameter in the aggregation: the default value to be used when no input positions are selected. If not defined, this value is 0.)
-
-```python
-def shift(i=1, default="_", seq=tokens):
- x = (key(indices) == query(indices-i)).value(seq, default)
- return x.name("shift")
-shift(2)
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_85_0.svg)
-
-
-
-
-### Challenge 3: Minimum
-
-Compute the minimum values of the sequence. (This one starts to get harder. Our version uses 2 layers of attention.)
-
-```python
-def minimum(seq=tokens):
- sel1 = before & (key(seq) == query(seq))
- sel2 = key(seq) < query(seq)
- less = (sel1 | sel2).value(1)
- x = (key(less) == query(0)).value(seq)
- return x.name("min")
-minimum()([5,3,2,5,2])
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_87_0.svg)
-
-
-
-
-The idea behind our solution is an implicit full ordering of the input positions: we (implicitly) order the positions according to input token value, with input position as tie breaker. Our first act is to have each position attend to all positions before it in the ordering: `sel1` focuses on earlier input positions with the same input token value, and `sel2` focuses on input positions with lower input token value. We then aggregate a 1 from all positions to get where each position is located in this ordering (i.e., how many other positions precede it). The minimum value is the input value at the first position according to this ordering (i.e., the one which had no other positions precede it).
-
-### Challenge 4: First Index
-
-Compute the first index that has token q, assuming the sequence always has length shorter than 100. (2 layers)
-
-```python
-def first(q, seq=tokens):
- return minimum(where(seq == q, indices, 99))
-first("l")
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_90_0.svg)
-
-
-
-
-### Challenge 5: Right Align
-
-Right align a padded sequence e.g. ralign().inputs('xyz___') = '---xyz'" (2 layers)
-
-```python
-def ralign(default="-", sop=tokens):
- c = (key(sop) == query("_")).value(1)
- x = (key(indices + c) == query(indices)).value(sop, default)
- return x.name("ralign")
-ralign()("xyz__")
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_92_0.svg)
-
-
-
-
-### Challenge 6: Split
-
-Split a sequence into two parts at value v and then right align. You can assume there is exactly one appearance of v in the sequence. (3 layers to get and align the first part of the sequence, but only 1 for the second.)
-
-```python
-def split(v, get_first_part, sop=tokens, default="0"):
- split_point = (key(sop) == query(v)).value(indices)
- if get_first_part:
- x = ralign(default,
- where(indices < split_point,
- sop, "_"))
- return x
- else:
- x = where(indices > split_point, sop, default)
- return x
-split("+", False)("xyz+zyr")
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_94_0.svg)
-
-
-
-
-```python
-split("+", 0)("xyz+zyr")
-```
-
-
-
-
-![svg]({{site.baseurl}}/assets/img/2023-05-01-raspy/Blog_95_0.svg)
-
-
-
-
-### Challenge 6: Slide
-
-Replace special tokens "<" with the closest non "<" value to their right. (2 layers)
-
-```python
-def slide(match, seq=tokens):
- x = cumsum(match)
- y = ((key(x) == query(x + 1)) & (key(match) == query(True))).value(seq)
- seq = where(match, seq, y)
- return seq.name("slide")
-slide(tokens != "<").input("xxxh<<Notation | -Description | -Notation | -Description | -
---|---|---|---|
$s$ | -the current state (at time $t$) | -$S$ | -the set of all states | -
$s^{\prime}$ | -the next state (at time $t+1$) | -$U$ | -the set of all actions | -
$u^{i}$ | -the action of agent $i$ | -$N$ | -the set of all agents | -
$\mathbf{u}$ | -the joint actions (at time $t$) | -$\tau^{i}$ | -the action-observation history of agent $i$ | -
$o^{i}$ | -the observation of agent $i$ | -$${\tau}$$ | -the joint action-observation histories | -
$$o$$ | -the joint observation | -$r(s, \mathbf{u})$ | -the joint reward supplied by environments | -
$Q_{i}(\tau^{i}, u^{i})$ | -the utility function of agent $i$ | -$\gamma$ | -the discount factor | -
$Q_{tot}({\tau}, \mathbf{u})$ | -the joint value function | -$P(s^{\prime} \mid s, \mathbf{u})$ | -the transition function | -
$Z(o^{i} \mid s, u^{i})$ | -the observation function | -$\epsilon$ | -action selection probability of $\epsilon$-greedy | -
$N$ | -the set of all agents with $n$ agents | -$$\theta$$ | -the set of parameters of agents network, with $[\theta^{i}]_{i=1}^{n}$ | -
$b$ | -sampled batch size for training | -$\phi$ | -the parameter of mixing network | -
$TS$ | -the $T$otal rollout $S$amples | -$PP$ | -the number of rollout $P$rocesses in $P$arallel | -
$SE$ | -the number of $S$amples in each $E$pisode |
- $PI$ | -the $P$olicy $I$teration number | -
Senarios | -Difficulty | -QMIX | -Finetuned-QMIX | -
10m_vs_11m | -Easy | -98% | -100% | -
8m_vs_9m | -Hard | -84% | -100% | -
5m_vs_6m | -Hard | -84% | -90% | -
3s_vs_5z | -Hard | -96% | -100% | -
bane_vs_bane | -Hard | -100% | -100% | -
2c_vs_64zg | -Hard | -100% | -100% | -
corridor | -Super hard | -0% | -100% | -
MMM2 | -Super hard | -98% | -100% | -
3s5z_vs_3s6z | -Super hard | -3% | -93% (Hidden Size = 256) | -
27m_vs_3s6z | -Super hard | -56% | -100% | -
6h_vs_8z | -Super hard | -0% | -93% (λ = 0.3) | -
Announcements:
The Machine Learning community is currently experiencing a reproducibility crisis and a reviewing crisis [Littman, 2021]. Because of the highly competitive and noisy reviewing process of ML conferences [Tran et al., 2020], researchers have an incentive to oversell their results, slowing down the progress and diminishing the integrity of the scientific community. Moreover with the growing number of papers published and submitted at the main ML conferences [Lin et al., 2020], it has become more challenging to keep track of the latest advances in the field.
Blog posts are becoming an increasingly popular and useful way to talk about science [Brown and Woolston, 2018]. They offer substantial value to the scientific community by providing a flexible platform to foster open, human, and transparent discussions about new insights or limitations of a scientific publication. However, because they are not as recognized as standard scientific publications, only a minority of researchers manage to maintain an active blog and get visibility for their efforts. Many are well-established researchers (Francis Bach, Ben Recht, Ferenc Huszár, Lilian Weng) or big corporations that leverage entire teams of graphic designers and writers to polish their blogs (Facebook AI, Google AI, DeepMind, OpenAI). As a result, the incentives for writing scientific blog posts are largely personal; it is unreasonable to expect a significant portion of the machine learning community to contribute to such an initiative when everyone is trying to establish themselves through publications.
Last year, we ran the first iteration of the Blogpost track at ICLR 2022! It was very successful, attracting over 60 submissions and 20 accepted posts.
Our goal is to create a formal call for blog posts at ICLR to incentivize and reward researchers to review past work and summarize the outcomes, develop new intuitions, or highlight some shortcomings. A very influential initiative of this kind happened after the second world war in France. Because of the lack of up-to-date textbooks, a collective of mathematicians under the pseudonym Nicolas Bourbaki [Halmos 1957], decided to start a series of textbooks about the foundations of mathematics [Bourbaki, 1939]. In the same vein, we aim at providing a new way to summarize scientific knowledge in the ML community.
Due to the large diversity of topics that can be discussed in a blog post, we decided to restrict the range of topics for this call for blog posts. We identified that the blog posts that would bring to most value to the community and the conference would be posts that distill and discuss previously published papers.
Abstract deadline: February 2nd AOE, 2023 (submit to OpenReview).
Submission deadline: February 10th AOE, 2023 (any modifications to your blog post, via a pull request on github).
Notification of acceptance: March 31st, 2023
Camera-ready merge: April 28th, 2023 (please follow the instructions here)
The format and process for this blog post track is as follows:
Blog Posts must not be used to highlight or advertise past publications of the authors or their lab. Previously, we did not accept submissions with a conflict of interest; however, this year we will only ask the authors to report if they have such a conflict. If so, reviewers will be asked to judge if the submission is sufficiently critical and objective of the papers addressed in the blog post.
The posts will be created and published under a unified template; see the submission instructions and the sample post hosted on the blog of this website.
Our goal is to avoid heavily engineered, professionally-made blog posts—such as the “100+ hours” mentioned as a standard by the Distill guidelines—to encourage ideas and clear writing rather than dynamic visualizations or embedded javascript engines.
As a result, we restrict submissions to the Markdown format. We believe this is a good trade-off between complexity and flexibility. Markdown enables users to easily embed media such as images, gifs, audio, and video as well as write mathematical equations using MathJax, without requiring users to know how to create HTML web pages. This (mostly) static format is also fairly portable; users can download the blog post without much effort for offline reading or archival purposes. More importantly, this format can be easily hosted and maintained through GitHub.
Eryn Brown and Chris Woolston. Why science blogging still matters. Nature, 2018.
Paul R Halmos. Nicolas Bourbaki. Scientific American, 1957.
Nicolas Bourbaki. Elements of mathematics. Éditions Hermann, 1939.
tag. We found the following text: ' + text); - const wrapper = document.createElement('span'); - wrapper.innerHTML = addedNode.nodeValue; - addedNode.parentNode.insertBefore(wrapper, addedNode); - addedNode.parentNode.removeChild(addedNode); - } - } break; - } - } - } - }).observe(this, {childList: true}); - } - - } - - var commonjsGlobal = typeof globalThis !== 'undefined' ? globalThis : typeof window !== 'undefined' ? window : typeof global !== 'undefined' ? global : typeof self !== 'undefined' ? self : {}; - - function createCommonjsModule(fn, module) { - return module = { exports: {} }, fn(module, module.exports), module.exports; - } - - var bibtexParse = createCommonjsModule(function (module, exports) { - /* start bibtexParse 0.0.22 */ - - //Original work by Henrik Muehe (c) 2010 - // - //CommonJS port by Mikola Lysenko 2013 - // - //Port to Browser lib by ORCID / RCPETERS - // - //Issues: - //no comment handling within strings - //no string concatenation - //no variable values yet - //Grammar implemented here: - //bibtex -> (string | preamble | comment | entry)*; - //string -> '@STRING' '{' key_equals_value '}'; - //preamble -> '@PREAMBLE' '{' value '}'; - //comment -> '@COMMENT' '{' value '}'; - //entry -> '@' key '{' key ',' key_value_list '}'; - //key_value_list -> key_equals_value (',' key_equals_value)*; - //key_equals_value -> key '=' value; - //value -> value_quotes | value_braces | key; - //value_quotes -> '"' .*? '"'; // not quite - //value_braces -> '{' .*? 
'"'; // not quite - (function(exports) { - - function BibtexParser() { - - this.months = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"]; - this.notKey = [',','{','}',' ','=']; - this.pos = 0; - this.input = ""; - this.entries = new Array(); - - this.currentEntry = ""; - - this.setInput = function(t) { - this.input = t; - }; - - this.getEntries = function() { - return this.entries; - }; - - this.isWhitespace = function(s) { - return (s == ' ' || s == '\r' || s == '\t' || s == '\n'); - }; - - this.match = function(s, canCommentOut) { - if (canCommentOut == undefined || canCommentOut == null) - canCommentOut = true; - this.skipWhitespace(canCommentOut); - if (this.input.substring(this.pos, this.pos + s.length) == s) { - this.pos += s.length; - } else { - throw "Token mismatch, expected " + s + ", found " - + this.input.substring(this.pos); - } this.skipWhitespace(canCommentOut); - }; - - this.tryMatch = function(s, canCommentOut) { - if (canCommentOut == undefined || canCommentOut == null) - canCommentOut = true; - this.skipWhitespace(canCommentOut); - if (this.input.substring(this.pos, this.pos + s.length) == s) { - return true; - } else { - return false; - } }; - - /* when search for a match all text can be ignored, not just white space */ - this.matchAt = function() { - while (this.input.length > this.pos && this.input[this.pos] != '@') { - this.pos++; - } - if (this.input[this.pos] == '@') { - return true; - } return false; - }; - - this.skipWhitespace = function(canCommentOut) { - while (this.isWhitespace(this.input[this.pos])) { - this.pos++; - } if (this.input[this.pos] == "%" && canCommentOut == true) { - while (this.input[this.pos] != "\n") { - this.pos++; - } this.skipWhitespace(canCommentOut); - } }; - - this.value_braces = function() { - var bracecount = 0; - this.match("{", false); - var start = this.pos; - var escaped = false; - while (true) { - if (!escaped) { - if (this.input[this.pos] == '}') { - if (bracecount > 
0) { - bracecount--; - } else { - var end = this.pos; - this.match("}", false); - return this.input.substring(start, end); - } } else if (this.input[this.pos] == '{') { - bracecount++; - } else if (this.pos >= this.input.length - 1) { - throw "Unterminated value"; - } } if (this.input[this.pos] == '\\' && escaped == false) - escaped = true; - else - escaped = false; - this.pos++; - } }; - - this.value_comment = function() { - var str = ''; - var brcktCnt = 0; - while (!(this.tryMatch("}", false) && brcktCnt == 0)) { - str = str + this.input[this.pos]; - if (this.input[this.pos] == '{') - brcktCnt++; - if (this.input[this.pos] == '}') - brcktCnt--; - if (this.pos >= this.input.length - 1) { - throw "Unterminated value:" + this.input.substring(start); - } this.pos++; - } return str; - }; - - this.value_quotes = function() { - this.match('"', false); - var start = this.pos; - var escaped = false; - while (true) { - if (!escaped) { - if (this.input[this.pos] == '"') { - var end = this.pos; - this.match('"', false); - return this.input.substring(start, end); - } else if (this.pos >= this.input.length - 1) { - throw "Unterminated value:" + this.input.substring(start); - } } - if (this.input[this.pos] == '\\' && escaped == false) - escaped = true; - else - escaped = false; - this.pos++; - } }; - - this.single_value = function() { - var start = this.pos; - if (this.tryMatch("{")) { - return this.value_braces(); - } else if (this.tryMatch('"')) { - return this.value_quotes(); - } else { - var k = this.key(); - if (k.match("^[0-9]+$")) - return k; - else if (this.months.indexOf(k.toLowerCase()) >= 0) - return k.toLowerCase(); - else - throw "Value expected:" + this.input.substring(start) + ' for key: ' + k; - - } }; - - this.value = function() { - var values = []; - values.push(this.single_value()); - while (this.tryMatch("#")) { - this.match("#"); - values.push(this.single_value()); - } return values.join(""); - }; - - this.key = function() { - var start = this.pos; - while 
(true) { - if (this.pos >= this.input.length) { - throw "Runaway key"; - } // а-яА-Я is Cyrillic - //console.log(this.input[this.pos]); - if (this.notKey.indexOf(this.input[this.pos]) >= 0) { - return this.input.substring(start, this.pos); - } else { - this.pos++; - - } } }; - - this.key_equals_value = function() { - var key = this.key(); - if (this.tryMatch("=")) { - this.match("="); - var val = this.value(); - return [ key, val ]; - } else { - throw "... = value expected, equals sign missing:" - + this.input.substring(this.pos); - } }; - - this.key_value_list = function() { - var kv = this.key_equals_value(); - this.currentEntry['entryTags'] = {}; - this.currentEntry['entryTags'][kv[0]] = kv[1]; - while (this.tryMatch(",")) { - this.match(","); - // fixes problems with commas at the end of a list - if (this.tryMatch("}")) { - break; - } - kv = this.key_equals_value(); - this.currentEntry['entryTags'][kv[0]] = kv[1]; - } }; - - this.entry_body = function(d) { - this.currentEntry = {}; - this.currentEntry['citationKey'] = this.key(); - this.currentEntry['entryType'] = d.substring(1); - this.match(","); - this.key_value_list(); - this.entries.push(this.currentEntry); - }; - - this.directive = function() { - this.match("@"); - return "@" + this.key(); - }; - - this.preamble = function() { - this.currentEntry = {}; - this.currentEntry['entryType'] = 'PREAMBLE'; - this.currentEntry['entry'] = this.value_comment(); - this.entries.push(this.currentEntry); - }; - - this.comment = function() { - this.currentEntry = {}; - this.currentEntry['entryType'] = 'COMMENT'; - this.currentEntry['entry'] = this.value_comment(); - this.entries.push(this.currentEntry); - }; - - this.entry = function(d) { - this.entry_body(d); - }; - - this.bibtex = function() { - while (this.matchAt()) { - var d = this.directive(); - this.match("{"); - if (d == "@STRING") { - this.string(); - } else if (d == "@PREAMBLE") { - this.preamble(); - } else if (d == "@COMMENT") { - this.comment(); - } else { - 
this.entry(d); - } - this.match("}"); - } }; - } - exports.toJSON = function(bibtex) { - var b = new BibtexParser(); - b.setInput(bibtex); - b.bibtex(); - return b.entries; - }; - - /* added during hackathon don't hate on me */ - exports.toBibtex = function(json) { - var out = ''; - for ( var i in json) { - out += "@" + json[i].entryType; - out += '{'; - if (json[i].citationKey) - out += json[i].citationKey + ', '; - if (json[i].entry) - out += json[i].entry ; - if (json[i].entryTags) { - var tags = ''; - for (var jdx in json[i].entryTags) { - if (tags.length != 0) - tags += ', '; - tags += jdx + '= {' + json[i].entryTags[jdx] + '}'; - } - out += tags; - } - out += '}\n\n'; - } - return out; - - }; - - })( exports); - - /* end bibtexParse */ - }); - - // Copyright 2018 The Distill Template Authors - - function normalizeTag(string) { - return string - .replace(/[\t\n ]+/g, ' ') - .replace(/{\\["^`.'acu~Hvs]( )?([a-zA-Z])}/g, (full, x, char) => char) - .replace(/{\\([a-zA-Z])}/g, (full, char) => char); - } - - function parseBibtex(bibtex) { - const bibliography = new Map(); - const parsedEntries = bibtexParse.toJSON(bibtex); - for (const entry of parsedEntries) { - // normalize tags; note entryTags is an object, not Map - for (const [key, value] of Object.entries(entry.entryTags)) { - entry.entryTags[key.toLowerCase()] = normalizeTag(value); - } - entry.entryTags.type = entry.entryType; - // add to bibliography - bibliography.set(entry.citationKey, entry.entryTags); - } - return bibliography; - } - - function serializeFrontmatterToBibtex(frontMatter) { - return `@article{${frontMatter.slug}, - author = {${frontMatter.bibtexAuthors}}, - title = {${frontMatter.title}}, - journal = {${frontMatter.journal.title}}, - year = {${frontMatter.publishedYear}}, - note = {${frontMatter.url}}, - doi = {${frontMatter.doi}} -}`; - } - - // Copyright 2018 The Distill Template Authors - - class Bibliography extends HTMLElement { - - static get is() { return 'd-bibliography'; } - - 
constructor() { - super(); - - // set up mutation observer - const options = {childList: true, characterData: true, subtree: true}; - const observer = new MutationObserver( (entries) => { - for (const entry of entries) { - if (entry.target.nodeName === 'SCRIPT' || entry.type === 'characterData') { - this.parseIfPossible(); - } - } - }); - observer.observe(this, options); - } - - connectedCallback() { - requestAnimationFrame(() => { - this.parseIfPossible(); - }); - } - - parseIfPossible() { - const scriptTag = this.querySelector('script'); - if (!scriptTag) return; - if (scriptTag.type == 'text/bibtex') { - const newBibtex = scriptTag.textContent; - if (this.bibtex !== newBibtex) { - this.bibtex = newBibtex; - const bibliography = parseBibtex(this.bibtex); - this.notify(bibliography); - } - } else if (scriptTag.type == 'text/json') { - const bibliography = new Map(JSON.parse(scriptTag.textContent)); - this.notify(bibliography); - } else { - console.warn('Unsupported bibliography script tag type: ' + scriptTag.type); - } - } - - notify(bibliography) { - const options = { detail: bibliography, bubbles: true }; - const event = new CustomEvent('onBibliographyChanged', options); - this.dispatchEvent(event); - } - - /* observe 'src' attribute */ - - static get observedAttributes() { - return ['src']; - } - - receivedBibtex(event) { - const bibliography = parseBibtex(event.target.response); - this.notify(bibliography); - } - - attributeChangedCallback(name, oldValue, newValue) { - var oReq = new XMLHttpRequest(); - oReq.onload = (e) => this.receivedBibtex(e); - oReq.onerror = () => console.warn(`Could not load Bibtex! (tried ${newValue})`); - oReq.responseType = 'text'; - oReq.open('GET', newValue, true); - oReq.send(); - } - - - } - - // Copyright 2018 The Distill Template Authors - // - // Licensed under the Apache License, Version 2.0 (the "License"); - // you may not use this file except in compliance with the License. 
- // You may obtain a copy of the License at - // - // http://www.apache.org/licenses/LICENSE-2.0 - // - // Unless required by applicable law or agreed to in writing, software - // distributed under the License is distributed on an "AS IS" BASIS, - // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - // See the License for the specific language governing permissions and - // limitations under the License. - - // import style from '../styles/d-byline.css'; - - function bylineTemplate(frontMatter) { - return ` -
-`; - } - - class Byline extends HTMLElement { - - static get is() { return 'd-byline'; } - - set frontMatter(frontMatter) { - this.innerHTML = bylineTemplate(frontMatter); - } - - } - - // Copyright 2018 The Distill Template Authors - - const T$3 = Template( - "d-cite", - ` - - -
-
-`);
-
- class Code extends Mutating(T$4(HTMLElement)) {
-
- renderContent() {
-
- // check if language can be highlighted
- this.languageName = this.getAttribute('language');
- if (!this.languageName) {
- console.warn('You need to provide a language attribute to your `; - if (frontMatter.githubCompareUpdatesUrl) { - html += `View all changes to this article since it was first published.`; - } - html += ` - If you see mistakes or want to suggest changes, please create an issue on GitHub.
- `; - } - - const journal = frontMatter.journal; - if (typeof journal !== 'undefined' && journal.title === 'Distill') { - html += ` -Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
- `; - } - - if (typeof frontMatter.publishedDate !== 'undefined') { - html += ` -For attribution in academic contexts, please cite this work as
-${frontMatter.concatenatedAuthors}, "${frontMatter.title}", Distill, ${frontMatter.publishedYear}.-
BibTeX citation
-${serializeFrontmatterToBibtex(frontMatter)}- `; - } - - return html; - } - - class DistillAppendix extends HTMLElement { - - static get is() { return 'distill-appendix'; } - - set frontMatter(frontMatter) { - this.innerHTML = appendixTemplate(frontMatter); - } - - } - - const footerTemplate = ` - - - - -`; - - // Copyright 2018 The Distill Template Authors - - const T$c = Template('distill-footer', footerTemplate); - - class DistillFooter extends T$c(HTMLElement) { - - } - - // Copyright 2018 The Distill Template Authors - - let templateIsLoading = false; - let runlevel = 0; - const initialize = function() { - if (window.distill.runlevel < 1) { - throw new Error("Insufficient Runlevel for Distill Template!"); - } - - /* 1. Flag that we're being loaded */ - if ("distill" in window && window.distill.templateIsLoading) { - throw new Error( - "Runlevel 1: Distill Template is getting loaded more than once, aborting!" - ); - } else { - window.distill.templateIsLoading = true; - console.debug("Runlevel 1: Distill Template has started loading."); - } - - /* 2. Add styles if they weren't added during prerendering */ - makeStyleTag(document); - console.debug("Runlevel 1: Static Distill styles have been added."); - console.debug("Runlevel 1->2."); - window.distill.runlevel += 1; - - /* 3. Register Controller listener functions */ - /* Needs to happen before components to their connected callbacks have a controller to talk to. */ - for (const [functionName, callback] of Object.entries(Controller.listeners)) { - if (typeof callback === "function") { - document.addEventListener(functionName, callback); - } else { - console.error("Runlevel 2: Controller listeners need to be functions!"); - } - } - console.debug("Runlevel 2: We can now listen to controller events."); - console.debug("Runlevel 2->3."); - window.distill.runlevel += 1; - - /* 4. 
Register components */ - const components = [ - Abstract, Appendix, Article, Bibliography, Byline, Cite, CitationList, Code, - Footnote, FootnoteList, FrontMatter$1, HoverBox, Title, DMath, References, TOC, Figure, - Slider, Interstitial - ]; - - const distillComponents = [DistillHeader, DistillAppendix, DistillFooter]; - - if (window.distill.runlevel < 2) { - throw new Error("Insufficient Runlevel for adding custom elements!"); - } - const allComponents = components.concat(distillComponents); - for (const component of allComponents) { - console.debug("Runlevel 2: Registering custom element: " + component.is); - customElements.define(component.is, component); - } - - console.debug( - "Runlevel 3: Distill Template finished registering custom elements." - ); - console.debug("Runlevel 3->4."); - window.distill.runlevel += 1; - - // If template was added after DOMContentLoaded we may have missed that event. - // Controller will check for that case, so trigger the event explicitly: - if (domContentLoaded()) { - Controller.listeners.DOMContentLoaded(); - } - - console.debug("Runlevel 4: Distill Template initialisation complete."); - window.distill.templateIsLoading = false; - window.distill.templateHasLoaded = true; - }; - - window.distill = { runlevel, initialize, templateIsLoading }; - - /* 0. 
Check browser feature support; synchronously polyfill if needed */ - if (Polyfills.browserSupportsAllFeatures()) { - console.debug("Runlevel 0: No need for polyfills."); - console.debug("Runlevel 0->1."); - window.distill.runlevel += 1; - window.distill.initialize(); - } else { - console.debug("Runlevel 0: Distill Template is loading polyfills."); - Polyfills.load(window.distill.initialize); - } - -}))); -//# sourceMappingURL=template.v2.js.map +!function(n){"function"==typeof define&&define.amd?define(n):n()}(function(){"use strict"; +// Copyright 2018 The Distill Template Authors +function n(n,t){n.title=t.title,t.published&&(t.published instanceof Date?n.publishedDate=t.published:t.published.constructor===String&&(n.publishedDate=new Date(t.published))),t.publishedDate&&(t.publishedDate instanceof Date?n.publishedDate=t.publishedDate:t.publishedDate.constructor===String?n.publishedDate=new Date(t.publishedDate):console.error("Don't know what to do with published date: "+t.publishedDate)),n.description=t.description,n.authors=t.authors.map(n=>new Nr(n)),n.katex=t.katex,n.password=t.password,t.doi&&(n.doi=t.doi)} +// Copyright 2018 The Distill Template Authors +function t(n=document){const t=new Set,e=n.querySelectorAll("d-cite");for(const n of e){const e=(n.getAttribute("key")||n.getAttribute("bibtex-key")).split(",").map(n=>n.trim());for(const n of e)t.add(n)}return[...t]}function e(n,t,e,i){if(null==n.author)return"";var r=n.author.split(" and ");let o=r.map(n=>{if(-1!=(n=n.trim()).indexOf(","))var e=n.split(",")[0].trim(),i=n.split(",")[1];else if(-1!=n.indexOf(" "))e=n.split(" ").slice(-1)[0].trim(),i=n.split(" ").slice(0,-1).join(" ");else e=n.trim();var r="";return i!=undefined&&(r=(r=i.trim().split(" ").map(n=>n.trim()[0])).join(".")+"."),t.replace("${F}",i).replace("${L}",e).replace("${I}",r).trim()});if(r.length>1){var a=o.slice(0,r.length-1).join(e);return a+=(i||e)+o[r.length-1]}return o[0]}function i(n){var t=n.journal||n.booktitle||"";if("volume"in 
n){var e=n.issue||n.number;e=e!=undefined?"("+e+")":"",t+=", Vol "+n.volume+e}return"pages"in n&&(t+=", pp. "+n.pages),""!=t&&(t+=". "),"publisher"in n&&"."!=(t+=n.publisher)[t.length-1]&&(t+="."),t}function r(n){if("url"in n){var t=n.url,e=/arxiv\.org\/abs\/([0-9\.]*)/.exec(t);if(null!=e&&(t=`http://arxiv.org/pdf/${e[1]}.pdf`),".pdf"==t.slice(-4))var i="PDF";else if(".html"==t.slice(-5))i="HTML";return` [${i||"link"}]`}return""}function o(n,t){return"doi"in n?`${t?"
',n.githubCompareUpdatesUrl&&(t+=`View all changes to this article since it was first published.`),t+=`\n If you see mistakes or want to suggest changes, please create an issue on GitHub.
\n `);const e=n.journal;return void 0!==e&&"Distill"===e.title&&(t+=`\nDiagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don\u2019t fall under this license and can be recognized by a note in their caption: \u201cFigure from \u2026\u201d.
\n `),"undefined"!=typeof n.publishedDate&&(t+=`\nFor attribution in academic contexts, please cite this work as
\n${n.concatenatedAuthors}, "${n.title}", Distill, ${n.publishedYear}.\n
BibTeX citation
\n${v(n)}\n `),t}const Mr=["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"],Tr=["Jan.","Feb.","March","April","May","June","July","Aug.","Sept.","Oct.","Nov.","Dec."],_r=n=>n<10?"0"+n:n,Cr=function(n){return`${Mr[n.getDay()].substring(0,3)}, ${_r(n.getDate())} ${Tr[n.getMonth()].substring(0,3)} ${n.getFullYear().toString()} ${n.getUTCHours().toString()}:${n.getUTCMinutes().toString()}:${n.getUTCSeconds().toString()} Z`},Ar=function(n){return Array.from(n).reduce((n,[t,e])=>Object.assign(n,{[t]:e}),{})},Er=function(n){const t=new Map;for(var e in n)n.hasOwnProperty(e)&&t.set(e,n[e]);return t};class Nr{constructor(n){this.name=n.author,this.personalURL=n.authorURL,this.affiliation=n.affiliation,this.affiliationURL=n.affiliationURL,this.affiliations=n.affiliations||[]}get firstName(){const n=this.name.split(" ");return n.slice(0,n.length-1).join(" ")}get lastName(){const n=this.name.split(" ");return n[n.length-1]}}class Lr{constructor(){this.title="unnamed article",this.description="",this.authors=[],this.bibliography=new Map,this.bibliographyParsed=!1,this.citations=[],this.citationsCollected=!1,this.journal={},this.katex={},this.doi=undefined,this.publishedDate=undefined}set url(n){this._url=n}get url(){return this._url?this._url:this.distillPath&&this.journal.url?this.journal.url+"/"+this.distillPath:this.journal.url?this.journal.url:void 0}get githubUrl(){return this.githubPath?"https://github.com/"+this.githubPath:undefined}set previewURL(n){this._previewURL=n}get previewURL(){return this._previewURL?this._previewURL:this.url+"/thumbnail.jpg"}get publishedDateRFC(){return Cr(this.publishedDate)}get updatedDateRFC(){return Cr(this.updatedDate)}get publishedYear(){return this.publishedDate.getFullYear()}get publishedMonth(){return Tr[this.publishedDate.getMonth()]}get publishedDay(){return this.publishedDate.getDate()}get publishedMonthPadded(){return _r(this.publishedDate.getMonth()+1)}get publishedDayPadded(){return 
_r(this.publishedDate.getDate())}get publishedISODateOnly(){return this.publishedDate.toISOString().split("T")[0]}get volume(){const n=this.publishedYear-2015;if(n<1)throw new Error("Invalid publish date detected during computing volume");return n}get issue(){return this.publishedDate.getMonth()+1}get concatenatedAuthors(){return this.authors.length>2?this.authors[0].lastName+", et al.":2===this.authors.length?this.authors[0].lastName+" & "+this.authors[1].lastName:1===this.authors.length?this.authors[0].lastName:void 0}get bibtexAuthors(){return this.authors.map(n=>n.lastName+", "+n.firstName).join(" and ")}get slug(){let n="";return this.authors.length&&(n+=this.authors[0].lastName.toLowerCase(),n+=this.publishedYear,n+=this.title.split(" ")[0].toLowerCase()),n||"Untitled"}get bibliographyEntries(){return new Map(this.citations.map(n=>{return[n,this.bibliography.get(n)]}))}set bibliography(n){n instanceof Map?this._bibliography=n:"object"==typeof n&&(this._bibliography=Er(n))}get bibliography(){return this._bibliography}static fromObject(n){const t=new Lr;return Object.assign(t,n),t}assignToObject(n){Object.assign(n,this),n.bibliography=Ar(this.bibliographyEntries),n.url=this.url,n.doi=this.doi,n.githubUrl=this.githubUrl,n.previewURL=this.previewURL,this.publishedDate&&(n.volume=this.volume,n.issue=this.issue,n.publishedDateRFC=this.publishedDateRFC,n.publishedYear=this.publishedYear,n.publishedMonth=this.publishedMonth,n.publishedDay=this.publishedDay,n.publishedMonthPadded=this.publishedMonthPadded,n.publishedDayPadded=this.publishedDayPadded),this.updatedDate&&(n.updatedDateRFC=this.updatedDateRFC),n.concatenatedAuthors=this.concatenatedAuthors,n.bibtexAuthors=this.bibtexAuthors,n.slug=this.slug}} +// Copyright 2018 The Distill Template Authors +const Dr=n=>(class extends n{constructor(){super();const n={childList:!0,characterData:!0,subtree:!0},t=new 
MutationObserver(()=>{t.disconnect(),this.renderIfPossible(),t.observe(this,n)});t.observe(this,n)}connectedCallback(){super.connectedCallback(),this.renderIfPossible()}renderIfPossible(){this.textContent&&this.root&&this.renderContent()}renderContent(){console.error(`Your class ${this.constructor.name} must provide a custom renderContent() method!`)}}),Or=(n,t,e=!0)=>i=>{const r=document.createElement("template");return r.innerHTML=t,e&&"ShadyCSS"in window&&ShadyCSS.prepareTemplate(r,n),class extends i{static get is(){return n}constructor(){super(),this.clone=document.importNode(r.content,!0),e&&(this.attachShadow({mode:"open"}),this.shadowRoot.appendChild(this.clone))}connectedCallback(){this.hasAttribute("distill-prerendered")||(e?"ShadyCSS"in window&&ShadyCSS.styleElement(this):this.insertBefore(this.clone,this.firstChild))}get root(){return e?this.shadowRoot:this}$(n){return this.root.querySelector(n)}$$(n){return this.root.querySelectorAll(n)}}}; +// Copyright 2018 The Distill Template Authors +var Ir='/*\n * Copyright 2018 The Distill Template Authors\n *\n * Licensed under the Apache License, Version 2.0 (the "License");\n * you may not use this file except in compliance with the License.\n * You may obtain a copy of the License at\n *\n * http://www.apache.org/licenses/LICENSE-2.0\n *\n * Unless required by applicable law or agreed to in writing, software\n * distributed under the License is distributed on an "AS IS" BASIS,\n * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n * See the License for the specific language governing permissions and\n * limitations under the License.\n */\n\nspan.katex-display {\n text-align: left;\n padding: 8px 0 8px 0;\n margin: 0.5em 0 0.5em 1em;\n}\n\nspan.katex {\n -webkit-font-smoothing: antialiased;\n color: rgba(0, 0, 0, 0.8);\n font-size: 1.18em;\n}\n'; +// Copyright 2018 The Distill Template Authors +const Fr=function(n,t,e){let i=e,r=0;const o=n.length;for(;i
tag. We found the following text: "+t);const e=document.createElement("span");e.innerHTML=n.nodeValue,n.parentNode.insertBefore(e,n),n.parentNode.removeChild(n)}}}}).observe(this,{childList:!0})}}var ro="undefined"!=typeof globalThis?globalThis:"undefined"!=typeof window?window:"undefined"!=typeof global?global:"undefined"!=typeof self?self:{},oo=m(function(n,t){!function(n){function t(){this.months=["jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"],this.notKey=[",","{","}"," ","="],this.pos=0,this.input="",this.entries=new Array,this.currentEntry="",this.setInput=function(n){this.input=n},this.getEntries=function(){return this.entries},this.isWhitespace=function(n){return" "==n||"\r"==n||"\t"==n||"\n"==n},this.match=function(n,t){if(t!=undefined&&null!=t||(t=!0),this.skipWhitespace(t),this.input.substring(this.pos,this.pos+n.length)!=n)throw"Token mismatch, expected "+n+", found "+this.input.substring(this.pos);this.pos+=n.length,this.skipWhitespace(t)},this.tryMatch=function(n,t){return t!=undefined&&null!=t||(t=!0),this.skipWhitespace(t),this.input.substring(this.pos,this.pos+n.length)==n},this.matchAt=function(){for(;this.input.length>this.pos&&"@"!=this.input[this.pos];)this.pos++;return"@"==this.input[this.pos]},this.skipWhitespace=function(n){for(;this.isWhitespace(this.input[this.pos]);)this.pos++;if("%"==this.input[this.pos]&&1==n){for(;"\n"!=this.input[this.pos];)this.pos++;this.skipWhitespace(n)}},this.value_braces=function(){var n=0;this.match("{",!1);for(var t=this.pos,e=!1;;){if(!e)if("}"==this.input[this.pos]){if(!(n>0)){var i=this.pos;return this.match("}",!1),this.input.substring(t,i)}n--}else if("{"==this.input[this.pos])n++;else if(this.pos>=this.input.length-1)throw"Unterminated value";e="\\"==this.input[this.pos]&&0==e,this.pos++}},this.value_comment=function(){for(var 
n="",t=0;!this.tryMatch("}",!1)||0!=t;){if(n+=this.input[this.pos],"{"==this.input[this.pos]&&t++,"}"==this.input[this.pos]&&t--,this.pos>=this.input.length-1)throw"Unterminated value:"+this.input.substring(start);this.pos++}return n},this.value_quotes=function(){this.match('"',!1);for(var n=this.pos,t=!1;;){if(!t){if('"'==this.input[this.pos]){var e=this.pos;return this.match('"',!1),this.input.substring(n,e)}if(this.pos>=this.input.length-1)throw"Unterminated value:"+this.input.substring(n)}t="\\"==this.input[this.pos]&&0==t,this.pos++}},this.single_value=function(){var n=this.pos;if(this.tryMatch("{"))return this.value_braces();if(this.tryMatch('"'))return this.value_quotes();var t=this.key();if(t.match("^[0-9]+$"))return t;if(this.months.indexOf(t.toLowerCase())>=0)return t.toLowerCase();throw"Value expected:"+this.input.substring(n)+" for key: "+t},this.value=function(){var n=[];for(n.push(this.single_value());this.tryMatch("#");)this.match("#"),n.push(this.single_value());return n.join("")},this.key=function(){for(var n=this.pos;;){if(this.pos>=this.input.length)throw"Runaway key";if(this.notKey.indexOf(this.input[this.pos])>=0)return this.input.substring(n,this.pos);this.pos++}},this.key_equals_value=function(){var n=this.key();if(this.tryMatch("="))return this.match("="),[n,this.value()];throw"... 
= value expected, equals sign missing:"+this.input.substring(this.pos)},this.key_value_list=function(){var n=this.key_equals_value();for(this.currentEntry.entryTags={},this.currentEntry.entryTags[n[0]]=n[1];this.tryMatch(",")&&(this.match(","),!this.tryMatch("}"));)n=this.key_equals_value(),this.currentEntry.entryTags[n[0]]=n[1]},this.entry_body=function(n){this.currentEntry={},this.currentEntry.citationKey=this.key(),this.currentEntry.entryType=n.substring(1),this.match(","),this.key_value_list(),this.entries.push(this.currentEntry)},this.directive=function(){return this.match("@"),"@"+this.key()},this.preamble=function(){this.currentEntry={},this.currentEntry.entryType="PREAMBLE",this.currentEntry.entry=this.value_comment(),this.entries.push(this.currentEntry)},this.comment=function(){this.currentEntry={},this.currentEntry.entryType="COMMENT",this.currentEntry.entry=this.value_comment(),this.entries.push(this.currentEntry)},this.entry=function(n){this.entry_body(n)},this.bibtex=function(){for(;this.matchAt();){var n=this.directive();this.match("{"),"@STRING"==n?this.string():"@PREAMBLE"==n?this.preamble():"@COMMENT"==n?this.comment():this.entry(n),this.match("}")}}}n.toJSON=function(n){var e=new t;return e.setInput(n),e.bibtex(),e.entries},n.toBibtex=function(n){var t="";for(var e in n){if(t+="@"+n[e].entryType,t+="{",n[e].citationKey&&(t+=n[e].citationKey+", "),n[e].entry&&(t+=n[e].entry),n[e].entryTags){var i="";for(var r in n[e].entryTags)0!=i.length&&(i+=", "),i+=r+"= {"+n[e].entryTags[r]+"}";t+=i}t+="}\n\n"}return t}}(t)});class ao extends HTMLElement{static get is(){return"d-bibliography"}constructor(){super();const n={childList:!0,characterData:!0,subtree:!0};new MutationObserver(n=>{for(const t of n)"SCRIPT"!==t.target.nodeName&&"characterData"!==t.type||this.parseIfPossible()}).observe(this,n)}connectedCallback(){requestAnimationFrame(()=>{this.parseIfPossible()})}parseIfPossible(){const n=this.querySelector("script");if(n)if("text/bibtex"==n.type){const 
t=n.textContent;if(this.bibtex!==t){this.bibtex=t;const n=y(this.bibtex);this.notify(n)}}else if("text/json"==n.type){const t=new Map(JSON.parse(n.textContent));this.notify(t)}else console.warn("Unsupported bibliography script tag type: "+n.type)}notify(n){const t=new CustomEvent("onBibliographyChanged",{detail:n,bubbles:!0});this.dispatchEvent(t)}static get observedAttributes(){return["src"]}receivedBibtex(n){const t=y(n.target.response);this.notify(t)}attributeChangedCallback(n,t,e){var i=new XMLHttpRequest;i.onload=(n=>this.receivedBibtex(n)),i.onerror=(()=>console.warn(`Could not load Bibtex! (tried ${e})`)),i.responseType="text",i.open("GET",e,!0),i.send()}}class so extends HTMLElement{static get is(){return"d-byline"}set frontMatter(n){this.innerHTML=w(n)}}
+// Copyright 2018 The Distill Template Authors
+const lo=Or("d-cite",'\n\n\n
\n\n`);class go extends(Dr(fo(HTMLElement))){renderContent(){if(this.languageName=this.getAttribute("language"),!this.languageName)return void console.warn('You need to provide a language attribute to your