Rework the precision metric. #9222
Conversation
- Rework the precision metric for both CPU and GPU.
- Mention it in the documentation.
- Clean up old support code for the GPU ranking metric.
- Add proper support for binary classification.
@@ -372,12 +372,37 @@ bool IsBinaryRel(linalg::VectorView<float const> label, AllOf all_of) {
  * both CPU and GPU.
  */
 template <typename AllOf>
-void CheckMapLabels(linalg::VectorView<float const> label, AllOf all_of) {
+void CheckPreLabels(StringView name, linalg::VectorView<float const> label, AllOf all_of) {
Update the docstring description: `\brief Validate label for the Precision metric.`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
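For reference, a possible shape of the updated Doxygen comment on the renamed function. The signature is taken verbatim from the diff above; `StringView` and `linalg::VectorView` are existing XGBoost types, so this is a documentation sketch rather than a standalone compilable snippet, and the exact wording is an assumption:

```cpp
/**
 * \brief Validate label for the Precision metric. Shared by both CPU and GPU.
 */
template <typename AllOf>
void CheckPreLabels(StringView name, linalg::VectorView<float const> label, AllOf all_of);
```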
// When device ordinal is present, we would want to build the metrics on the GPU. It is *not*
// possible for a valid device ordinal to be present for non GPU builds. However, it is possible
// for an invalid device ordinal to be specified in GPU builds - to train/predict and/or compute
// the metrics on CPU. To accommodate these scenarios, the following is done for the metrics
// accelerated on the GPU.
// - An internal GPU registry holds all the GPU metric types (defined in the .cu file)
// - An instance of the appropriate GPU metric type is created when a device ordinal is present
//   - If the creation is successful, the metric computation is done on the device
//   - else, it falls back on the CPU
// - The GPU metric types are *only* registered when xgboost is built for GPUs
//
// This is done for 2 reasons:
// - Clear separation of CPU and GPU logic
// - Sorting datasets containing large number of rows is (much) faster when parallel sort
//   semantics is used on the CPU. The __gnu_parallel/concurrency primitives needed to perform
//   this cannot be used when the translation unit is compiled using the 'nvcc' compiler (as the
//   corresponding headers that brings in those function declaration can't be included with CUDA).
//   This precludes the CPU and GPU logic to coexist inside a .cu file
This comment is no longer valid?
The GPU registry is removed in this PR, as the precision metric is the last one being rewritten.
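For context, here is a minimal sketch of the "GPU registry with CPU fallback" dispatch that the removed comment describes. The names (`GPURegistry`, `CreateMetric`, `MetricFactory`) are illustrative stand-ins, not XGBoost's actual internals:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// A metric interface and a registry of GPU metric factories. In this sketch the
// registry is populated only when the library is built with GPU support.
struct Metric {
  virtual ~Metric() = default;
  virtual double Eval() const = 0;
};

using MetricFactory = std::function<std::unique_ptr<Metric>()>;

std::map<std::string, MetricFactory>& GPURegistry() {
  static std::map<std::string, MetricFactory> registry;
  return registry;
}

// Create a metric: consult the GPU registry when a device ordinal is given;
// otherwise, or when no GPU type is registered (e.g. a CPU-only build), fall
// back to the CPU factory. This mirrors the behaviour the removed comment documents.
std::unique_ptr<Metric> CreateMetric(std::string const& name, int device_ordinal,
                                     MetricFactory cpu_factory) {
  if (device_ordinal >= 0) {
    auto it = GPURegistry().find(name);
    if (it != GPURegistry().end()) {
      return it->second();  // metric computation happens on the device
    }
  }
  return cpu_factory();  // CPU fallback
}
```

With the precision metric rewritten in this PR, no ranking metric relies on that indirection any more, which is why the registry and its comment are removed.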
src/metric/rank_metric.cc (Outdated)
<< "Invalid size of weight. For a binary classification task, it's size should be equal " | ||
"to the number of samples. For a learning to rank task, it's size should be equal to " | ||
"the number of query groups."; |
I thought the Precision metric is only used for the learning-to-rank task? If so, we should not mention the binary classification task in this error message.
Removed. That comment didn't get cleaned up. I tried to support binary classification at one point but gave up, as the existing implementations are feature-rich.
I just realized this was not a documented feature before the PR.
The new implementation supports only learning to rank; please consider one of the existing library implementations, such as sklearn, for classification tasks. The previous implementation claimed support for binary classification in a comment, but that claim was false.
It is extracted from #8822.
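To make the remaining check concrete, here is a hedged sketch of a weight-size validation restricted to learning to rank, assuming a CSR-style group pointer as used for ranking data; the function and variable names are hypothetical, not the code added by this PR:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical validation: with binary classification dropped, a non-empty weight
// vector must carry exactly one weight per query group, where the number of groups
// is group_ptr.size() - 1 for a CSR-style group pointer.
void CheckGroupWeights(std::vector<float> const& weight,
                       std::vector<std::uint64_t> const& group_ptr) {
  if (weight.empty()) {
    return;  // unweighted queries are allowed
  }
  std::size_t n_groups = group_ptr.empty() ? 0 : group_ptr.size() - 1;
  if (weight.size() != n_groups) {
    throw std::invalid_argument(
        "Invalid size of weight. For a learning-to-rank task, its size should be "
        "equal to the number of query groups.");
  }
}
```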