
add prototype for upgrades #3

Draft · wants to merge 5 commits into master
Conversation

mileslucas
Contributor

Prototype of new design for PyMuellerMat (#2)

In this PR you can see two new files:

  • mm_functions.py - numba-accelerated fast Mueller-matrix implementations (see the sketch after this list)
  • components.py - the nice class-based component and system models
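
For a flavor of the numba pattern, here's a minimal sketch (the function name and exact signature are illustrative, not necessarily the mm_functions.py API):

import numpy as np
from numba import njit

@njit(cache=True)
def linear_polarizer_mm(theta=0.0):
    # Mueller matrix of an ideal linear polarizer rotated by theta (radians)
    c = np.cos(2.0 * theta)
    s = np.sin(2.0 * theta)
    M = np.zeros((4, 4))
    M[0, 0] = 1.0
    M[0, 1] = c
    M[1, 0] = c
    M[0, 2] = s
    M[2, 0] = s
    M[1, 1] = c * c
    M[1, 2] = c * s
    M[2, 1] = c * s
    M[2, 2] = s * s
    return 0.5 * M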

If you run

python pyMuellerMat/components.py

you will see a demo of the prototype.

Numba vs. NumPy
I performed some micro-benchmarks using the two-component system in components.py. Using timeit with 10,000 iterations, numba improved on numpy by a factor of ~2x (20 ms vs. 40 ms). More extensive benchmarks are still needed, preferably built into some automated tests.
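
A rough sketch of the kind of harness that could seed those tests (using the classes from components.py; absolute numbers will vary by machine):

import timeit

from pyMuellerMat.components import System, LinearPolarizer, GenericWaveplate

# Build the two-component demo system and time repeated evaluations
sys_model = System(components=dict(lp=LinearPolarizer(), hwp=GenericWaveplate()))

n_iter = 10_000
elapsed = timeit.timeit(sys_model, number=n_iter)
print(f"{elapsed / n_iter * 1e6:.2f} us per evaluation")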

File IO
By using pydantic we get really easy-to-use serialization of the component classes to dictionaries (which can be saved to TOML, JSON, YAML, etc.). I'm a fan of TOML, personally; this is the exact same system I've built the VAMPIRES DPP configuration on. If you run the components.py file you'll see an example of the dictionary and TOML outputs from the System class.

My thinking is that we can save these TOML files on Zenodo or Google Drive, and then it's really easy for me to download them into the DPP and load them back using pyMuellerMat.
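
A round-trip might look like this (a sketch assuming pydantic v2 and tomlkit; the exact System serialization API may differ):

import tomlkit

from pyMuellerMat.components import System, LinearPolarizer, GenericWaveplate

sys_model = System(components=dict(lp=LinearPolarizer(), hwp=GenericWaveplate()))

# pydantic model -> dict -> TOML text (this is what we'd host on Zenodo/Drive)
toml_text = tomlkit.dumps(sys_model.model_dump())

# TOML text -> dict -> validated pydantic model (this is the DPP load path)
restored = System.model_validate(tomlkit.loads(toml_text))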

@maxwellmb
Owner

Is it the actual matrix calculations that are the slow part of Rebecca's code?

@mileslucas
Contributor Author

Unsure; she said she would run some profiling benchmarks on the log-likelihood function. Faster is still faster, though, and the implementation is effortless (a one-line decorator).

@maxwellmb
Owner

A factor of 2 is good, but numba adds some package bloat (unless you're already using it), which may not be worth it if the matrix math isn't the main cause of the slowdown. My instinct is that there's weird overhead somewhere else; multiplying together 4x4 matrices should be blazing fast.

@mileslucas
Contributor Author

mileslucas commented Nov 22, 2023

Turns out I had the units wrong for the timing function; the example in components.py is actually ~2 µs.

I did a direct comparison against the existing SystemMuellerMatrix:

from pyMuellerMat.MuellerMat import SystemMuellerMatrix
from pyMuellerMat.common_mms import HWP, HorizontalPolarizer

sys = SystemMuellerMatrix((HorizontalPolarizer(), HWP()))
%timeit sys.evaluate()
11 µs ± 41.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

from pyMuellerMat.components import System, LinearPolarizer, GenericWaveplate

sys2 = System(components=dict(lp=LinearPolarizer(), hwp=GenericWaveplate()))
%timeit sys2()
2.71 µs ± 6.24 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Overall these numbers are meaninglessly small, so yeah, numba isn't a huge difference-maker here. What's really being improved is the dictionary/class interactions. The current SystemMuellerMatrix is purely dictionary-based and has no automatic utilities for serialization. The pydantic version lets us treat everything as classes (a bit easier than dictionaries, especially combined with code-introspection tools like those in VS Code), includes modern type hinting, and gives a clear path for serializing models.
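
For illustration, the pydantic style boils down to something like this (the field name is hypothetical, not the actual components.py definition):

from pydantic import BaseModel

class LinearPolarizer(BaseModel):
    # Typed, validated fields instead of free-form dictionary entries
    theta: float = 0.0  # rotation angle in radians

lp = LinearPolarizer(theta=0.1)
print(lp.model_dump())  # {'theta': 0.1}, trivially serializable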
