
Fix IsBinaryRel() check #9007

Closed
wants to merge 1 commit into from

Conversation

cbalint13

@cbalint13 cbalint13 commented Apr 3, 2023

This small PR fixes wrong binary checks for labels.

  • The behaviour bug was introduced by "Rework the MAP metric." #8931
  • The check now accepts labels in the range 1.0 + eps > label > 0.0 - eps, with eps = 1e-6 (see the sketch below)
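
A minimal Python sketch of the range check proposed here (my own illustration with a hypothetical is_binary_rel helper, not the actual C++ code in ranking_utils.h):

 import numpy as np

 EPS = 1e-6

 def is_binary_rel(labels):
     # Range check proposed in this PR: accept any label strictly inside
     # the interval (0.0 - eps, 1.0 + eps).
     return bool(np.all((labels > 0.0 - EPS) & (labels < 1.0 + EPS)))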

The error was caught during a TVM autotune process:

File "/usr/lib64/python3.11/site-packages/tvm/autotvm/tuner/xgboost_cost_model.py"
, line 538, in after_iteration
    bst_eval = model.eval_set(self.evals, epoch, feval)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/site-packages/xgboost/core.py", line 1995, in eval_set
    _check_call(
File "/usr/lib64/python3.11/site-packages/xgboost/core.py", line 270, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [15:59:59] /builddir/build/BUILD/xgboost/src/common/ranking_utils.h:378: 
Check failed: is_binary: MAP can only be used with binary labels.

Thank you,
~Cristian.

Cc: @trivialfis, @RAMitchell, please help with the review.

Signed-off-by: Balint Cristian <[email protected]>
@trivialfis
Member

trivialfis commented Apr 3, 2023

Hi, the check was for y \in {0, 1}, i.e. y == 0.0f || y == 1.0f; it's written this way to be compatible with floating point.
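
For contrast, a minimal Python sketch of that exact check (again my own illustration, not xgboost's C++ implementation):

 import numpy as np

 def is_binary_rel(labels):
     # Exact binary check: every label must compare equal to 0.0 or 1.0.
     # Both values are exactly representable as floats, so == is well defined here.
     return bool(np.all((labels == 0.0) | (labels == 1.0)))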

@cbalint13
Author

Hi, the check was for y \in {0, 1}, i.e. y == 0.0f || y == 1.0f; it's written this way to be compatible with floating point.

@trivialfis

  • So you mean y can only be 0.0 (within a small eps margin) or only 1.0 (within a small eps margin)?

If so, I should reconsider the use case.

@trivialfis
Member

That's the definition of average precision: https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Average_precision

where $\operatorname{rel}(k)$ is an indicator function equaling 1 if the item at rank $k$ is a relevant document, and zero otherwise.
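
For reference, the average precision defined in that article is

$$\mathrm{AP} = \frac{1}{\text{number of relevant documents}} \sum_{k=1}^{n} P(k)\,\operatorname{rel}(k),$$

so $\operatorname{rel}(k)$, and therefore the label, must be strictly binary.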

@trivialfis
Member

trivialfis commented Apr 3, 2023

If you have a special use case where the label is soft and fits into the MAP framework, please share it and I would love to learn more.

In that case, we can threshold the label at 0.5, but it's not as rigorous as one might prefer.

@cbalint13
Author

cbalint13 commented Apr 3, 2023

In that case, we can threshold the label at 0.5, but it's not as rigorous as one might prefer.

I'll do so, and will see whether the resulting mAP impacts the behaviour of TVM's internal model evaluation.
To close the discussion, I leave here a way of doing in-place binarization as per your suggestion.

 import xgboost as xgb

 dtrain = xgb.DMatrix(data, label=label)
 lbl = dtrain.get_label()
 lbl[lbl >= 0.5] = 1.0  # labels at or above the threshold become 1.0
 lbl[lbl < 0.5] = 0.0   # the rest become 0.0
 dtrain.set_label(lbl)

Thanks @trivialfis!
~Cristian.

@cbalint13 cbalint13 closed this Apr 3, 2023
@cbalint13 cbalint13 changed the title Fix IsBinaryRel() check ~~Fix~~ IsBinaryRel() check Apr 3, 2023
@cbalint13 cbalint13 changed the title ~~Fix~~ IsBinaryRel() check Fix IsBinaryRel() check Apr 3, 2023
@tqchen
Member

tqchen commented Apr 4, 2023

@cbalint13 perhaps a better way would be to move away from the MAP metric, since we likely need continuous regression ranking. I assume pairwise or other metrics could still work for such cases.

@trivialfis can you confirm whether we still support other metrics that work with continuous rank values?

@trivialfis
Member

In the last couple of PRs, I have changed NDCG to integer only, meaning it accepts only discrete labels, with an additional option for the user to specify a custom gain function, which can be continuous. The custom gain function still needs a few more PRs to be merged.

Other than this, MAP has been changed to binary. For ranking where a query group is available, AUC has supported a pairwise definition for a few recent releases (the curve is constructed from correct/incorrect pairs instead of samples).
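
A minimal sketch (mine, not from this thread, with made-up data) of pairwise ranking with query groups and the AUC metric mentioned above; exact label requirements may differ between releases:

 import numpy as np
 import xgboost as xgb

 X = np.random.rand(8, 4)
 y = np.array([2, 1, 0, 0, 1, 2, 0, 1], dtype=float)  # graded relevance per item
 dtrain = xgb.DMatrix(X, label=y)
 dtrain.set_group([4, 4])  # two query groups of four items each

 params = {"objective": "rank:pairwise", "eval_metric": "auc"}
 bst = xgb.train(params, dtrain, num_boost_round=10, evals=[(dtrain, "train")])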

No additional change is currently planned. Feel free to share your use case and suggestions. :-)

@tqchen
Member

tqchen commented Apr 4, 2023

It would still be useful to support continuous rank values. The particular use case in TVM is cost-model prediction, where we only care about the relative rank of the costs, not their exact values.

The ability to specify the ground-truth rank per sample, where a rank score is provided rather than explicit pairs, is useful, as there can be many such pairs in total.

While the strict definition of MAP indeed follows the binary scheme, it is useful to think about the use cases where a rank score is provided, as that is usually more compact in terms of samples compared to explicit construction of pairs.

@cbalint13
Author

@trivialfis ,

In the last couple of PRs, I have changed NDCG to integer only, meaning it accepts only discrete labels, with an additional option for the user to specify a custom gain function, which can be continuous. The custom gain function still needs a few more PRs to be merged.

I am interested in tracking it; is there a tracker for the upcoming custom gain function (as a PR or issue)?

@trivialfis
Member

#8822 is the WIP PR; I'm extracting small parts of it for review and merge.

For continuous rank values, NDCG with a custom gain function seems to be the right choice. I will keep you updated on the progress.
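
Until the custom gain lands, a minimal sketch (mine, with made-up data) of quantizing a continuous score into discrete relevance grades for rank:ndcg:

 import numpy as np
 import xgboost as xgb

 X = np.random.rand(8, 4)
 cost = np.random.rand(8)                                # continuous cost-model output
 y = np.digitize(cost, np.quantile(cost, [0.33, 0.66]))  # quantize into grades 0..2
 dtrain = xgb.DMatrix(X, label=y)
 dtrain.set_group([4, 4])                                # two query groups of four items

 params = {"objective": "rank:ndcg", "eval_metric": "ndcg"}
 bst = xgb.train(params, dtrain, num_boost_round=10)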
