Compute pairwise distances only once instead of each turn. #68

AlexandreAbraham · 2020-01-29T09:48:10Z

This is a proposed optimization for the ranked batch method.
In the ranked batch method, the distances between all unlabeled samples are recomputed at each iteration, which is computationally prohibitive on large datasets. This modification simply updates the minimum distance vector with each newly selected sample. Note that I do not remove the newly labelled sample from the vector, to avoid memory reallocation and unnecessary complexity. However, a sample will not be selected twice, as its distance to itself is 0.
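
To illustrate the idea, here is a minimal sketch and not the code from this PR. It assumes a distance-only selection rule (the real method also weighs in uncertainty), and `select_batch` is a hypothetical helper name written in plain Python for readability:

```python
import math


def select_batch(labeled, unlabeled, n_instances):
    """Greedy farthest-first selection with an incrementally updated
    minimum-distance vector (hypothetical helper, not the library's API)."""
    # One-off cost: distance from each unlabeled point to its nearest labeled point.
    min_dist = [min(math.dist(u, l) for l in labeled) for u in unlabeled]

    selected = []
    for _ in range(n_instances):
        # Pick the point farthest from everything labeled or selected so far.
        best = max(range(len(unlabeled)), key=min_dist.__getitem__)
        selected.append(best)
        # Incremental update: only distances to the new point are computed.
        # The selected point stays in the vector; its own entry drops to 0.
        for i, u in enumerate(unlabeled):
            min_dist[i] = min(min_dist[i], math.dist(u, unlabeled[best]))
    return selected, min_dist


labeled = [(0.0, 0.0)]
unlabeled = [(1.0, 0.0), (5.0, 0.0), (0.0, 3.0), (4.0, 4.0)]
picked, min_dist = select_batch(labeled, unlabeled, n_instances=3)
print(picked)  # indices into `unlabeled`, in selection order
```

Each turn costs one pass over the unlabeled pool instead of a full pairwise computation, and a selected sample cannot be picked again as long as some other sample still has a non-zero minimum distance.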

I observe a difference in the performance on the ranked_batch example though.

Here is the performance history on master:

[0.3333333333333333, 0.8933333333333333, 0.84, 0.9266666666666666, 0.9333333333333333, 0.9466666666666667, 0.9533333333333334]

Here is the performance history on my branch:

[0.3333333333333333, 0.8733333333333333, 0.8733333333333333, 0.9333333333333333, 0.9333333333333333, 0.9466666666666667, 0.9733333333333334]

I am still investigating this. Any insight is welcome.

cosmic-cortex · 2020-02-02T09:48:52Z

Hi! Thanks for the PR! I'll try to remove it soon and get back to you.

cosmic-cortex · 2020-02-09T14:20:42Z

Just a quick update: I am still reviewing it. I am not sure I understand everything correctly, so I need to spend more time with it.

I have also noticed that in my previous comment, I wrote "I'll try to remove it soon" :D I actually meant to say I'll try to review it soon. Sorry for the potential confusion :)

AlexandreAbraham · 2020-02-10T10:58:46Z

Hey,
No problem, let me know if I can add comments to make it more readable.

Compute pairwise distances only once instead of each turn.

a3b1785
