First of all, I'm grateful for the training code you provided! Recently, I tried to replicate ootd-dc and ootd-hd on data collected from shopping websites. With both the dc and hd models, the loss curve fluctuates repeatedly around 0.02, and the results keep getting worse.
Results after 4 epochs:
We trained the model from the ootd-dc checkpoint with:
1. mixed_precision: float32
2. Resolution: 1024×768
3. batch_size: 8 on multiple GPUs
4. epochs: 4 for the experiment
5. 15,000 additional paired data samples
We tried fp16 for faster training and a larger batch_size, but the run fails with "Attempting to unscale FP16 gradients" at optimizer.step().
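For reference, that error is raised by PyTorch's `GradScaler` when the optimizer holds fp16 parameters: the scaler refuses to unscale gradients stored in half precision. A minimal sketch of the usual workaround (this assumes a generic AMP training loop, not the repo's actual code, and uses a stand-in `Linear` layer for the UNet): keep the trainable weights in fp32 and let `torch.autocast` do the low-precision compute.

```python
import torch

# GradScaler.unscale_ raises "Attempting to unscale FP16 gradients" when the
# optimizer's parameters are fp16. Keep trainable params fp32; autocast
# handles the half-precision forward pass instead.
device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = torch.nn.Linear(8, 8).to(device)       # stand-in for the trained UNet
model.to(dtype=torch.float32)                  # trainable weights stay fp32
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(4, 8, device=device)
with torch.autocast(device, dtype=amp_dtype):
    loss = model(x).pow(2).mean()              # low-precision forward via autocast

scaler.scale(loss).backward()
scaler.unscale_(optimizer)                     # ok: gradients belong to fp32 params
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
scaler.step(optimizer)
scaler.update()
optimizer.zero_grad()
```

In practice this means only the frozen modules (e.g. VAE, text encoder) should be cast to fp16 with `.half()`; the module being trained must not be.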
Besides, I wonder what a normal loss curve looks like, and how many epochs we should set for replication if the fluctuation around 0.02 is normal.