Make Zingg More Usable - Part 2 #928

sonalgoyal · 2024-10-28T06:27:09Z

The current findTrainingData phase can be optimized to show more positive samples than negative ones, so that we converge to the correct model faster. Currently, we show 10 pos and 10 neg. In the earlier rounds, even the pairs we sample as pos are mostly neg, which leads to longer training cycles of running findTrainingData (ftd) and label. What if we changed this to 15 pos and 5 neg?

Let us introduce a new phase, findtrainingDataV2, and see whether it helps build models faster. If it works better based on our own testing and user feedback, we can make it the default going forward.
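To make the proposed split concrete, here is a minimal sketch of choosing a labeling batch from scored candidate pairs. It is not Zingg's actual selection logic: the function name `select_pairs_for_labeling`, the `(pair_id, match_probability)` input shape, and the "top scores as likely pos, bottom scores as likely neg" rule are all assumptions made for illustration.

```python
def select_pairs_for_labeling(scored_pairs, n_pos=10, n_neg=10):
    """Pick a labeling batch from (pair_id, match_probability) tuples.

    Hypothetical helper, not part of Zingg: it takes the n_pos highest-scoring
    pairs as likely matches and the n_neg lowest-scoring of the rest as
    likely non-matches.
    """
    ranked = sorted(scored_pairs, key=lambda p: p[1], reverse=True)
    likely_pos = ranked[:n_pos]
    remaining = ranked[n_pos:]
    # Guard n_neg == 0, because remaining[-0:] would return the whole list.
    likely_neg = remaining[-n_neg:] if n_neg > 0 else []
    return likely_pos + likely_neg


# Synthetic scores for 100 candidate pairs, for illustration only.
candidates = [(i, i / 100) for i in range(100)]

current_batch = select_pairs_for_labeling(candidates, n_pos=10, n_neg=10)
proposed_batch = select_pairs_for_labeling(candidates, n_pos=15, n_neg=5)
```

Both batches contain 20 pairs; only the mix changes. Whether the extra likely-pos pairs turn out to be real matches still depends on how good the model's scores are in the early rounds.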

sania-16 · 2024-11-06T09:06:47Z

running FTD - 10 pos, 10 neg on febrl120k on a new model
first round of ftd and label - start with 0 matches and 22 pairs for labeling
second round of ftd and label - start with 0 matches and 20 pairs for labeling
third round of ftd and label - start with 20 matches and 20 pairs for labeling
fourth round of ftd and label - start with 33 matches and 20 pairs for labeling
fifth round of ftd and label - start with 39 matches and 20 pairs for labeling

In the fifth round, we get 40 matching pairs.
Trained the model on 41 pos and 41 neg pairs.
cc converged in 3 iterations.
Match took 5:05 mins to run.

sania-16 · 2024-11-06T15:17:06Z

running FTD - 15 pos, 5 neg on febrl120k on a new model
first round of ftd and label - start with 0 matches and 18 pairs for labeling
second round of ftd and label - start with 0 matches and 10 pairs for labeling
third round of ftd and label - start with 10 matches and 20 pairs for labeling
fourth round of ftd and label - start with 26 matches and 20 pairs for labeling
fifth round of ftd and label - start with 30 matches and 20 pairs for labeling

In the fifth round, we get 40 matching pairs.
Trained the model on 41 pos and 47 neg pairs.
cc converged in 3 iterations.
Match took 5:07 mins to run.
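For a side-by-side view of the two runs, here is a small sketch that tabulates the per-round numbers reported above. The figures are copied from the two comments; the dictionary layout and key names are only illustrative, and "pairs for labeling" is read here as the number of pairs shown in that round, which is an assumption.

```python
# Per-round figures copied from the two runs reported in this thread.
runs = {
    "10 pos / 10 neg": {
        "pairs_for_labeling": [22, 20, 20, 20, 20],
        "matches_at_start": [0, 0, 20, 33, 39],
    },
    "15 pos / 5 neg": {
        "pairs_for_labeling": [18, 10, 20, 20, 20],
        "matches_at_start": [0, 0, 10, 26, 30],
    },
}

# Total pairs presented over five rounds, and matches held going into round 5.
summary = {
    name: {
        "total_pairs": sum(r["pairs_for_labeling"]),
        "matches_before_round_5": r["matches_at_start"][-1],
    }
    for name, r in runs.items()
}

for name, s in summary.items():
    print(f"{name}: {s['total_pairs']} pairs over 5 rounds, "
          f"{s['matches_before_round_5']} matches before round 5")
```

On these numbers, the 15/5 run presented 88 pairs in total against 102 for the 10/10 run, and held 30 matches going into round 5 against 39.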

sonalgoyal assigned Nitish1814 Oct 28, 2024

sonalgoyal added this to the 0.5.0 milestone Oct 28, 2024

sonalgoyal assigned sania-16 and unassigned Nitish1814 Nov 5, 2024

sonalgoyal moved this to Todo in 0.5.0 Nov 5, 2024

sonalgoyal added this to 0.5.0 Nov 5, 2024
