Add NNCG to optimizers submodule #1661

Open
wants to merge 23 commits into master

Conversation

pratikrathore8

No description provided.

@lululxvi
Owner

Format the code using black https://github.com/psf/black

@pratikrathore8
Author

I've formatted the code with black in the latest commit!

@lululxvi
Owner

If this only supports pytorch, then it should be in https://github.com/lululxvi/deepxde/tree/master/deepxde/optimizers/pytorch

@pratikrathore8
Author

Moved to the right folder!

@pratikrathore8
Author

Made the formatting changes. I noticed there are other lines in the code with > 88 chars per line. Would you like for me to fix those?

@lululxvi
Owner

lululxvi commented Mar 3, 2024

Made the formatting changes. I noticed there are other lines in the code with > 88 chars per line. Would you like for me to fix those?

Yes.

Could you also fix the Codacy issues?

@pratikrathore8
Author

pratikrathore8 commented Mar 4, 2024

The formatting has been fixed.

There are still some Codacy issues, but they are mainly "too many local variables" warnings and complaints about the optimizer's __init__ function, which just follows PyTorch conventions.

What do we need to do next to further integrate NNCG into the codebase?

@lululxvi
Owner

lululxvi commented Mar 5, 2024

@pratikrathore8
Author

I've added NNCG to optimizers.py and NNCG_options to config.py
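
For reference, if NNCG_options follows the same pattern as the existing set_LBFGS_options helper in deepxde/optimizers/config.py, the intended usage would look roughly like the sketch below. The helper name and the parameter names are assumptions, not copied from the diff.

```python
import deepxde as dde

# Hypothetical setter, assumed to mirror dde.optimizers.config.set_LBFGS_options;
# the parameter names are placeholders -- see config.py in this PR for the actual
# NNCG_options keys.
dde.optimizers.config.set_NNCG_options(rank=50, mu=1e-1)
```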

@pratikrathore8
Author

Fixed!

@pratikrathore8
Author

Fixed!

@lululxvi
Owner

Have you tried using NNCG in some demo examples?

@pratikrathore8
Author

A quick update:

I've started thinking about how to make a demo example that uses NNCG. The plan is to modify one of the existing examples (e.g., Burgers.py) and add NNCG after Adam and L-BFGS, roughly as sketched below.

I'll have to make some modifications to model.py to fully integrate NNCG into the training pipeline. I'm aiming to have an example fully working by the end of next week!
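
For concreteness, the planned demo would follow the existing Burgers.py example and append an NNCG stage after the usual Adam and L-BFGS stages, roughly as in the sketch below. The "NNCG" optimizer key and its compile/train settings are assumptions pending the final Burgers_NNCG.py.

```python
import deepxde as dde
import numpy as np


def pde(x, y):
    dy_x = dde.grad.jacobian(y, x, i=0, j=0)
    dy_t = dde.grad.jacobian(y, x, i=0, j=1)
    dy_xx = dde.grad.hessian(y, x, i=0, j=0)
    return dy_t + y * dy_x - 0.01 / np.pi * dy_xx


geom = dde.geometry.Interval(-1, 1)
timedomain = dde.geometry.TimeDomain(0, 0.99)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

bc = dde.icbc.DirichletBC(geomtime, lambda x: 0, lambda _, on_boundary: on_boundary)
ic = dde.icbc.IC(
    geomtime, lambda x: -np.sin(np.pi * x[:, 0:1]), lambda _, on_initial: on_initial
)

data = dde.data.TimePDE(
    geomtime, pde, [bc, ic], num_domain=2540, num_boundary=80, num_initial=160
)
net = dde.nn.FNN([2] + [20] * 3 + [1], "tanh", "Glorot normal")
model = dde.Model(data, net)

# Standard two-stage training from the existing Burgers example.
model.compile("adam", lr=1e-3)
model.train(iterations=15000)
model.compile("L-BFGS")
model.train()

# Assumed third stage added by this PR; the optimizer key and iteration count
# are placeholders.
model.compile("NNCG")
losshistory, train_state = model.train(iterations=1000)
```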

@pratikrathore8
Author

pratikrathore8 commented Sep 17, 2024

@lululxvi Sorry for not responding earlier -- I was away from Stanford during the summer. My plan is to take a look at the code next week!

@pratikrathore8
Author

@lululxvi My apologies for taking so long. I've updated the code -- I moved most of the complexity from model.py into nncg.py.

The main thing to note is that I added an argument perform_backward to train_step for the PyTorch backend -- this argument is True by default and is set to False only in _train_pytorch_nncg. I added this argument because NNCG requires Hessian-vector products (hvps), which necessitate a slightly different implementation from other optimizers, as I have explained below:

We can obtain hvps either by using .backward() with create_graph=True followed by torch.autograd.grad, or by simply using torch.autograd.grad twice.

However, I find that the .backward() approach runs out of memory; the memory usage of NNCG grows with the number of iterations instead of remaining constant, which is undesirable. I believe this is because using create_graph=True with .backward() causes the computation graph to be retained across iterations instead of being freed at every iteration.

On the other hand, using torch.autograd.grad twice to compute hvps keeps memory usage constant across iterations.

The memory issues with .backward() might be fixable with a lot of debugging, but I don't think it's worth it -- even PyTorch throws a warning about possible memory leaks when using the .backward() approach to calculate hvps.
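
To make the contrast concrete, here is a minimal standalone sketch of the grad-twice approach (not the actual nncg.py code):

```python
import torch


def hvp(loss, params, vec):
    """Hessian-vector product via two calls to torch.autograd.grad."""
    # First call: gradient of the loss, with create_graph=True so the gradient
    # itself can be differentiated.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    # Second call: differentiate (grad . vec). No create_graph here, so the
    # graph is freed and memory stays constant across iterations.
    hv = torch.autograd.grad(flat_grad @ vec, params)
    return torch.cat([h.reshape(-1) for h in hv])


# Toy check: for loss = sum(w**2) the Hessian is 2*I, so hvp returns 2*vec.
w = torch.randn(3, requires_grad=True)
print(hvp((w ** 2).sum(), [w], torch.ones(3)))  # tensor([2., 2., 2.])
```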

Let me know if you'd like for me to make more changes -- I'd be interested in knowing if the tutorial Burgers_NNCG.py in examples works on your end.

The errors in the Codacy analysis are from nncg.py, but these are minor issues related to PyTorch optimizer conventions and too many local variables (which is difficult to avoid in the implementation).

@lululxvi
Owner

Could you fix the other Codacy issues of dict and super-with-arguments?
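
For reference, those two checks (presumably pylint's use-dict-literal and super-with-arguments rules, surfaced through Codacy) are typically resolved as in the skeleton below; this is illustrative only, not the PR's actual nncg.py.

```python
import torch


class NNCG(torch.optim.Optimizer):
    """Skeleton only; the constructor arguments here are placeholders."""

    def __init__(self, params, lr=1.0):
        # use-dict-literal: prefer a literal over dict(lr=lr).
        defaults = {"lr": lr}
        # super-with-arguments: drop the Python 2 style
        # super(NNCG, self).__init__(params, defaults).
        super().__init__(params, defaults)
```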

@pratikrathore8
Author

@lululxvi I've removed the unnecessary error checking and fixed the Codacy issues of dict and super-with-arguments.

@pratikrathore8
Author

@lululxvi Made the changes. Should we consider merging _train_pytorch_nncg into _train_sgd by adding an argument to _train_sgd that decides whether or not to perform a backward pass on the loss? Or would you prefer we leave the functions separate, despite the code duplication?

@pratikrathore8
Author

@lululxvi I added a train_step_nncg function as you suggested. NNCG now runs using _train_sgd.

Thanks for your patience with the refactoring steps so far!

@pratikrathore8
Author

@lululxvi Done!

@lululxvi
Owner

lululxvi commented Nov 5, 2024

Add a documentation page for the example, similar to this one: https://deepxde.readthedocs.io/en/latest/demos/pinn_forward/burgers.html

@pratikrathore8
Author

pratikrathore8 commented Nov 5, 2024

@lululxvi I added the demo. It's basically the same as the one you linked, except I added the part with NNCG to the end.

Is there a way to check if it's been properly integrated into the docs?

@lululxvi
Owner

lululxvi commented Nov 7, 2024

Please resolve the conflict.

You can look at the generated doc by clicking "Details" to the right of "docs/readthedocs.org:deepxde" in the check list.

@pratikrathore8
Author

@lululxvi I've fixed the merge conflict!

I took a look at the doc for the demo, and it seems good to me.
