Change basic example to backward two losses. #64

Merged (2 commits) on Jun 27, 2024

31 changes: 16 additions & 15 deletions docs/source/examples/basic_usage.rst
@@ -1,12 +1,14 @@
Basic Usage
===========

This example shows how to use TorchJD to perform an iteration of Jacobian descent on a regression
model with two objectives. In this example, a batch of inputs is forwarded through the model and two
corresponding batches of labels are used to compute two losses. These losses are then backwarded
through the model. The obtained Jacobian matrix, consisting of the gradients of the two losses with
respect to the parameters, is then aggregated using :doc:`UPGrad <../docs/aggregation/upgrad>`, and
the parameters are updated using the resulting aggregation.


This example shows how to use TorchJD to perform an iteration of Jacobian Descent on a regression
model. In this example, a batch of inputs is forwarded through the model and the corresponding batch
of labels is used to compute a batch of losses. These losses are then backwarded through the model.
The obtained Jacobian matrix, consisting of the gradients of the losses, is then aggregated using
UPGrad, and the parameters are updated using the resulting aggregation.
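To make the Jacobian matrix mentioned in this paragraph concrete, here is a small standalone illustration (editorial, not part of the diff; it mirrors the model, targets and losses defined further down in the example) showing that the matrix has one row per loss and one column per scalar parameter:

import torch
from torch.nn import Linear, MSELoss, ReLU, Sequential

# Same toy setup as in the example below.
model = Sequential(Linear(10, 5), ReLU(), Linear(5, 2))
params = [p for p in model.parameters() if p.requires_grad]

input = torch.randn(16, 10)
target1 = torch.randn(16)
target2 = torch.randn(16)

loss_fn = MSELoss()
output = model(input)
loss1 = loss_fn(output[:, 0], target1)
loss2 = loss_fn(output[:, 1], target2)

# One row per loss, one column per scalar parameter.
rows = []
for loss in (loss1, loss2):
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    rows.append(torch.cat([g.flatten() for g in grads]))
jacobian = torch.stack(rows)  # shape: (2, total number of parameters)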

Import several classes from ``torch`` and ``torchjd``:

@@ -19,7 +21,7 @@ Import several classes from ``torch`` and ``torchjd``:

Define the model and the optimizer, as usual:

>>> model = Sequential(Linear(10, 5), ReLU(), Linear(5, 1))
>>> model = Sequential(Linear(10, 5), ReLU(), Linear(5, 2))
>>> optimizer = SGD(model.parameters(), lr=0.1)

Define the aggregator that will be used to combine the Jacobian matrix:
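The aggregator definition itself falls in a collapsed part of this hunk; judging from the test file further down in this PR, it presumably stays:

from torchjd.aggregation import UPGrad

A = UPGrad()  # combines the rows of the Jacobian into a single update direction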
@@ -33,28 +35,27 @@ negatively affected by the update.
Now that everything is defined, we can train the model. Define the input and the associated target:

>>> input = torch.randn(16, 10) # Batch of 16 input random vectors of length 10
>>> target = input.sum(dim=1, keepdim=True) # Batch of 16 targets
>>> target1 = torch.randn(16) # First batch of 16 targets
>>> target2 = torch.randn(16) # Second batch of 16 targets

Here, we generate fake data in which each target is equal to the sum of its corresponding input
vector, for the sake of the example.
Here, we generate fake inputs and labels for the sake of the example.

We can now compute the losses associated to each element of the batch.

>>> loss_fn = MSELoss(reduction='none')
>>> loss_fn = MSELoss()
>>> output = model(input)
>>> losses = loss_fn(output, target)

Note that setting ``reduction='none'`` is necessary to obtain the element-wise loss vector.
>>> loss1 = loss_fn(output[:, 0], target1)
>>> loss2 = loss_fn(output[:, 1], target2)

The last steps are similar to gradient descent-based optimization.
The last steps are similar to gradient descent-based optimization, but using the two losses.

Reset the ``.grad`` field of each model parameter:

>>> optimizer.zero_grad()

Perform the Jacobian descent backward pass:

>>> torchjd.backward(losses, model.parameters(), A)
>>> torchjd.backward([loss1, loss2], model.parameters(), A)

This will populate the ``.grad`` field of each model parameter with the corresponding aggregated
Jacobian matrix.
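Putting the pieces of the updated example together, a minimal end-to-end sketch could look as follows (editorial; the training loop and the number of iterations are assumptions, since the diff only demonstrates a single iteration):

import torch
import torchjd
from torch.nn import Linear, MSELoss, ReLU, Sequential
from torch.optim import SGD
from torchjd.aggregation import UPGrad

model = Sequential(Linear(10, 5), ReLU(), Linear(5, 2))
optimizer = SGD(model.parameters(), lr=0.1)
A = UPGrad()
loss_fn = MSELoss()

input = torch.randn(16, 10)  # Batch of 16 random input vectors of length 10
target1 = torch.randn(16)    # First batch of 16 targets
target2 = torch.randn(16)    # Second batch of 16 targets

for _ in range(100):  # Repeat the single iteration shown in the example
    output = model(input)
    loss1 = loss_fn(output[:, 0], target1)
    loss2 = loss_fn(output[:, 1], target2)

    optimizer.zero_grad()
    torchjd.backward([loss1, loss2], model.parameters(), A)
    optimizer.step()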
13 changes: 7 additions & 6 deletions tests/doc/test_rst.py
@@ -6,20 +6,21 @@ def test_basic_usage():
import torchjd
from torchjd.aggregation import UPGrad

model = Sequential(Linear(10, 5), ReLU(), Linear(5, 1))
model = Sequential(Linear(10, 5), ReLU(), Linear(5, 2))
optimizer = SGD(model.parameters(), lr=0.1)

A = UPGrad()

input = torch.randn(16, 10) # Batch of 16 input random vectors of length 10
target = input.sum(dim=1, keepdim=True) # Batch of 16 targets
target1 = torch.randn(16) # First batch of 16 targets
target2 = torch.randn(16) # Second batch of 16 targets

loss_fn = MSELoss(reduction="none")
loss_fn = MSELoss()
output = model(input)
losses = loss_fn(output, target)
loss1 = loss_fn(output[:, 0], target1)
loss2 = loss_fn(output[:, 1], target2)

optimizer.zero_grad()
torchjd.backward(losses, model.parameters(), A)
torchjd.backward([loss1, loss2], model.parameters(), A)
optimizer.step()
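As a purely hypothetical extension (not part of the PR), one could add a check inside test_basic_usage, right after the torchjd.backward call, to verify that the aggregated gradient was indeed written to each parameter:

# Hypothetical extra assertions, to be placed after torchjd.backward(...):
for param in model.parameters():
    assert param.grad is not None           # .grad was populated by the backward pass
    assert param.grad.shape == param.shape  # one aggregated gradient per parameter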

