Clean up of code with pre-commit #37

Merged: 1 commit into drrelyea:master on Jul 27, 2024

Conversation

@mrava87 (Contributor) commented Jul 27, 2024

This PR is aimed at cleaning up the codebase and documentation:

  • introduces pre-commit (a minimal example config is sketched below)
  • performs linting of all files
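For reference, here is a minimal `.pre-commit-config.yaml` sketch of the kind of setup such a PR introduces; the hooks and pinned revisions below are assumptions, not necessarily what this PR actually adds:

```yaml
# Hypothetical pre-commit configuration; hooks and revs are illustrative,
# not taken from this PR.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace   # strip trailing spaces
      - id: end-of-file-fixer     # ensure files end with a newline
  - repo: https://github.com/psf/black
    rev: 24.4.2
    hooks:
      - id: black                 # format Python code
```

Running `pre-commit install` wires the hooks into git, and `pre-commit run --all-files` lints the whole codebase in one pass, which is what a clean-up PR like this typically does.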

@drrelyea (Owner)

This looks great. God, does this code look atrocious - funny what 15 years of experience does to your perspective. :)

We should seriously think about redoing this in pytorch, given the now-massive difference in speed between it and numpy. (Jax is also a contender but currently seems slower than pytorch on a CPU.) Would the Pylops maintainers be averse to that dependency?

@drrelyea merged commit deec018 into drrelyea:master on Jul 27, 2024 (2 checks passed)
@mrava87 (Contributor, Author) commented Jul 28, 2024

@drrelyea thanks!

Right, in principle we would not be averse, as we already have torch as an optional dependency for other operators. In practice, though, I am not sure this is the best way to go: from experience, torch is not any faster than numpy when they play on the same ground (CPU). Of course it is much faster if you run your code on a GPU, but then a more direct and easy route for spgl1 could be to do what we do in pylops: add a second backend that uses cupy and let the solver figure out which one to use based on the type of the input data (see the sketch below). If you are interested, I can point you to the parts of our code where we implemented this logic; it could easily be copy-pasted. The same can be done for JAX; we have a PR in progress for that. The benefit compared to torch is that the numpy/cupy/JAX APIs are 90%+ identical, so you don't need a full code rewrite to get the solver to run on a GPU :)
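A minimal, illustrative sketch of that dispatch logic (not the actual pylops code; the helper name here is just for illustration, though cupy itself ships `cupy.get_array_module` for the same purpose):

```python
# Illustrative sketch: choose the compute backend from the type of the
# input array, so the same solver code runs on CPU (numpy) or GPU (cupy).
import numpy as np

def get_array_module(x):
    """Return the array module (numpy or cupy) matching the input array."""
    try:
        import cupy as cp
        if isinstance(x, cp.ndarray):
            return cp
    except ImportError:
        pass  # cupy not installed: everything stays on numpy
    return np

def soft_threshold(x, tau):
    """Backend-agnostic soft-thresholding, a typical sparse-solver kernel."""
    ncp = get_array_module(x)
    return ncp.sign(x) * ncp.maximum(ncp.abs(x) - tau, 0.0)
```

Because numpy and cupy expose the same API for these calls, the solver never needs to know which device the data lives on.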

I personally will have no time to do this in the foreseeable future, but I am happy to facilitate it if you are interested.

PS: I am currently trying to fix the readthedocs build, as a lot of things have changed since we last built the docs. Once that is done, I will ask you to make a new release so that spgl1 becomes compatible with numpy 2 :D

@drrelyea (Owner) commented Jul 28, 2024

You would be very surprised what improvements have been made to torch in terms of their JIT. Matrix multiplication on a CPU is currently somewhere between 3 and 5 times faster in torch. I haven't tried numpy 2, so it's possible that would yield improvements, but that speed difference is with numpy compiled against OpenBLAS specifically for the chip. I have no idea what the actual under-the-hood difference is, but it's eye-opening. (This is all on an M1 Mac; I need to resurrect my Linux box to see what the gains are there, if any.)
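For anyone who wants to reproduce the comparison, here is a rough benchmark sketch; the matrix size, dtype, and repeat count are arbitrary choices, not drrelyea's actual setup:

```python
# Rough timing sketch comparing numpy and torch CPU matmul.
# Results depend heavily on BLAS backend, thread count, and hardware.
import time
import numpy as np
import torch

n = 4096
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)
at, bt = torch.from_numpy(a), torch.from_numpy(b)

def bench(fn, repeats=10):
    fn()  # warm-up run, excluded from timing
    t0 = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - t0) / repeats

print(f"numpy: {bench(lambda: a @ b):.4f} s")
print(f"torch: {bench(lambda: at @ bt):.4f} s")
```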

I'll take a stab at it at some point. I should add tests to this repo as well! That first, then improvements.

@mrava87 (Contributor, Author) commented Jul 28, 2024

Interesting :) Have you tried comparing with Intel MKL, which is much faster than OpenBLAS?
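As a quick sanity check before benchmarking, numpy's standard `show_config` call reports which BLAS/LAPACK it was built against:

```python
import numpy as np
np.show_config()  # prints the BLAS/LAPACK libraries numpy is linked against
```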

@mrava87 (Contributor, Author) commented Jul 28, 2024

And yes, tests would be a good addition before trying to do any code optimization :)

PS: Can you please make a release, so that I can then upload it to PyPI?
