Hi, thanks for releasing this awesome code! I am currently working on reproducing the Cityscapes results from the paper. The paper's description of the MTL update says that the weights of the task-specific subnetworks should be updated with the original learning rate, and that the shared weights of the network are then updated with the MGDA algorithm. However, I couldn't find a corresponding implementation in the code: both the shared weights and the task-specific weights appear to be updated the same way, by multiplying each task's loss by a weight factor determined by MGDA. Am I missing something here, or is this an implementation trick?
Hi @ozansener, I'm also trying to reproduce and use the method, and the above confuses me a bit as well. Algorithm 2, line 2 shows that the task-specific parameters are updated without any scaling factor; line 4 would then be replaced with the solver using your approximation, with the alphas computed from the gradients of Lt with respect to Z; and in line 5, only the shared parameters are updated with the alpha-weighted sum of losses. A minimal sketch of that reading follows below.
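For concreteness, here is a minimal PyTorch sketch of how I understand Algorithm 2 for two tasks with plain SGD. The names (`mgda_step`, `min_norm_2`) and the module interfaces are my own placeholders rather than this repository's actual API, and the two-task closed-form min-norm solution stands in for the general Frank-Wolfe solver.

```python
import torch

def min_norm_2(g1, g2):
    # Closed-form minimizer of || a*g1 + (1-a)*g2 ||^2 over a in [0, 1]
    # (two-task special case of the min-norm / Frank-Wolfe solver).
    diff = g1 - g2
    a = torch.dot(g2 - g1, g2) / (torch.dot(diff, diff) + 1e-12)
    a = a.clamp(0.0, 1.0)
    return [a, 1.0 - a]

def mgda_step(shared, heads, loss_fns, x, ys, lr=1e-2):
    """One training step following my reading of Algorithm 2 (two tasks)."""
    z = shared(x)  # shared representation Z

    # Line 2: update each task-specific head with its own, unscaled gradient,
    # and record the gradient of L_t with respect to Z for the solver.
    grads_z = []
    for head, loss_fn, y in zip(heads, loss_fns, ys):
        loss = loss_fn(head(z), y)
        grads = torch.autograd.grad(
            loss, [z] + list(head.parameters()), retain_graph=True)
        grads_z.append(grads[0].detach().flatten())
        with torch.no_grad():
            for p, g in zip(head.parameters(), grads[1:]):
                p -= lr * g  # no alpha scaling here

    # Line 4: the min-norm problem over the Z-gradients gives the alphas.
    alphas = min_norm_2(grads_z[0], grads_z[1])

    # Line 5: only the shared parameters see the alpha-weighted loss.
    z = shared(x)
    total = sum(a * loss_fn(head(z), y)
                for a, head, loss_fn, y in zip(alphas, heads, loss_fns, ys))
    grads_shared = torch.autograd.grad(total, list(shared.parameters()))
    with torch.no_grad():
        for p, g in zip(shared.parameters(), grads_shared):
            p -= lr * g
```

Is that the intended update, or should the task-specific parameters also be scaled by the alphas as the released code seems to do?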
@bsaint and @milos-popovic Thanks for raising the issue. You are right, there is a discrepancy between the paper and the code. We used this code to produce all the results, so please use the codebase. I will run some experiments and update the paper if necessary.