First of all, thank you for your excellent work on this repository. I have some questions regarding the training process of GIM-dkm, specifically with respect to the dataset and loss functions.
The original DKM trained separate indoor and outdoor models. For GIM, my understanding is that training combined the indoor model's data with YouTube data. Could you confirm whether this is correct? Specifically, does GIM train on a combination of the MegaDepth, ScanNet, and YouTube datasets?
In the original model, the depth loss and confidence loss were computed from the dataset's depth information. However, YouTube videos provide only image pairs with matched correspondences as ground truth, with no depth, so could you clarify what loss function is used for training in this case?
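To make my question concrete, here is a rough sketch of how I imagine sparse propagated video matches could stand in for the depth-derived supervision, with regression and certainty terms mirroring DKM's structure. This is entirely my own guess, not taken from the GIM code, and all names and shapes are hypothetical:

```python
import torch
import torch.nn.functional as F

def correspondence_loss(pred_flow, pred_certainty, gt_matches, valid_mask):
    """Hypothetical sketch of supervising a dense warp with sparse
    propagated matches instead of depth-derived ground truth.

    pred_flow:      (B, 2, H, W) predicted warp: for each pixel of image A,
                    the corresponding coords in image B, normalized to [-1, 1]
    pred_certainty: (B, 1, H, W) predicted match-confidence logits
    gt_matches:     (B, N, 4) pseudo ground-truth correspondences
                    (x_a, y_a, x_b, y_b), normalized to [-1, 1]
    valid_mask:     (B, N) float, 1 where a propagated match exists, else 0
    """
    src = gt_matches[..., :2]   # sample locations in image A
    tgt = gt_matches[..., 2:]   # target locations in image B

    # Sample the dense predictions at the sparse match locations.
    grid = src.unsqueeze(2)                                            # (B, N, 1, 2)
    flow_at_src = F.grid_sample(pred_flow, grid, align_corners=False)  # (B, 2, N, 1)
    flow_at_src = flow_at_src.squeeze(-1).permute(0, 2, 1)             # (B, N, 2)
    cert_at_src = F.grid_sample(pred_certainty, grid, align_corners=False)
    cert_at_src = cert_at_src.squeeze(-1).squeeze(1)                   # (B, N)

    # Regression term: endpoint error at pixels that have a pseudo GT match.
    epe = (flow_at_src - tgt).norm(dim=-1)                             # (B, N)
    reg_loss = (epe * valid_mask).sum() / valid_mask.sum().clamp(min=1)

    # Certainty term: predict whether a pixel has a reliable match.
    cert_loss = F.binary_cross_entropy_with_logits(cert_at_src, valid_mask)

    return reg_loss + cert_loss
```

Is the actual GIM loss something along these lines, or does it differ substantially?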
I would greatly appreciate any clarification on these points.
@gyj155
Sorry for the long wait. I have now released the code for video data generation and for training the network (gim-loftr); please refer to the README for details.
The training code for gim-dkm will be released later.