This repository has been archived by the owner on Jul 16, 2021. It is now read-only.
[QST] test_numpy() fail with "rabit::Init is already called in this thread" #47
Comments
I've never really understood the issue unfortunately. I tried to fix this
upstream in xgboost, but didn't get too far with it:
dmlc/xgboost#2796
On Mon, Jul 1, 2019 at 9:30 AM ksangeek wrote:
I am using dask-xgboost 0.1.7 with xgboost 0.82.
test_core.py::test_numpy was failing for me, so I looked into the
failure, and this is my understanding so far. I am a bit amused, as these tests
were passing for me last week, AFAIR with the same package versions!
I need some help to understand what is going on here.
1. test_core.py::test_numpy failed with "rabit::Init is already called
in this thread". These are the details from pdb:
$ pytest test_core.py::test_numpy
====================================== test session starts =======================================
platform linux -- Python 3.6.8, pytest-4.6.2, py-1.8.0, pluggy-0.12.0
rootdir: ./tests
plugins: cov-2.7.1, forked-1.0.2, xdist-1.28.0
collected 1 item
test_core.py
>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB set_trace (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> ./tests/test_core.py(200)test_numpy()
-> dX = da.from_array(X, chunks=(2, 2))
(Pdb) n
> ./tests/test_core.py(201)test_numpy()
-> dy = da.from_array(y, chunks=(2,))
(Pdb)
> ./tests/test_core.py(202)test_numpy()
-> dbst = yield dxgb.train(c, param, dX, dy)
(Pdb)
[08:42:34] Tree method is automatically selected to be 'approx' for distributed training.
[08:42:34] Tree method is automatically selected to be 'approx' for distributed training.
> ./tests/test_core.py(203)test_numpy()
-> dbst = yield dxgb.train(c, param, dX, dy) # we can do this twice
(Pdb)
[08:42:38] Tree method is automatically selected to be 'approx' for distributed training.
[08:42:38] Tree method is automatically selected to be 'approx' for distributed training.
> ./tests/test_core.py(205)test_numpy()
-> predictions = dxgb.predict(c, dbst, dX)
(Pdb)
rabit::Init is already called in this thread
2. On seeing the comment # workaround for "Doing rabit call
after Finalize" in the test case, I attempted to fix it with:
@@ -179,6 +179,7 @@ def test_dmatrix_kwargs(c, s, a, b):
 def _test_container(dbst, predictions, X_type):
+    xgb.rabit.init()  # workaround for "Doing rabit call after Finalize"
     dtrain = xgb.DMatrix(X_type(X), label=y)
     bst = xgb.train(param, dtrain)
@@ -195,7 +196,6 @@ def _test_container(dbst, predictions, X_type):
 @gen_cluster(client=True, timeout=None, check_new_threads=False)
 def test_numpy(c, s, a, b):
-    xgb.rabit.init()  # workaround for "Doing rabit call after Finalize"
     dX = da.from_array(X, chunks=(2, 2))
     dy = da.from_array(y, chunks=(2,))
     dbst = yield dxgb.train(c, param, dX, dy)
With this change, this particular test case worked fine, but it does not fix
the failure when I run the whole test script. That still fails like this:
$ pytest
======================================================================================== test session starts =========================================================================================
platform linux -- Python 3.6.8, pytest-4.6.2, py-1.8.0, pluggy-0.12.0 -- ./anaconda3/envs/test-dask-xgb/bin/python
cachedir: .pytest_cache
rootdir: ./sandbox/dask-xgboost, inifile: setup.cfg
plugins: cov-2.7.1, forked-1.0.2, xdist-1.28.0
[gw0] linux Python 3.6.8 cwd: ./sandbox/dask-xgboost/dask_xgboost/tests
[gw0] Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:34:02) -- [GCC 7.3.0]
gw0 [12]
scheduling tests via LoadScheduling
[gw0] [ 8%] PASSED test_core.py::test_basic
[gw0] [ 16%] PASSED test_core.py::test_dmatrix_kwargs
[gw0] [ 25%] FAILED test_core.py::test_numpy
[gw0] [ 33%] FAILED test_core.py::test_scipy_sparse
[gw0] [ 41%] FAILED test_core.py::test_sparse
[gw0] [ 50%] PASSED test_core.py::test_errors
[gw0] [ 58%] FAILED test_core.py::test_classifier
[gw0] [ 66%] FAILED test_core.py::test_multiclass_classifier
[gw0] [ 75%] FAILED test_core.py::test_classifier_multi[array]
[gw0] [ 83%] FAILED test_core.py::test_classifier_multi[dataframe]
[gw0] [ 91%] FAILED test_core.py::test_regressor
[gw0] [100%] FAILED test_core.py::test_synchronous_api
./anaconda3/envs/test-dask-xgb/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
..
@TomAugspurger Thanks for the link to your attempt.
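For anyone landing here later: the two messages quoted in this thread ("rabit::Init is already called in this thread" and "Doing rabit call after Finalize") look like two sides of the same per-thread state check. The sketch below is only a toy model of that idea in plain Python. It assumes rabit keeps one engine per thread and rejects a second init while that engine is live; none of the names (`init`, `finalize`, `rabit_call`, `ensure_init`, `RabitError`) are the real xgboost or rabit API, and this is not a confirmed explanation of the failure reported above.

```python
import threading

# Toy stand-in for per-thread engine state. Hypothetical: NOT the real rabit API.
_state = threading.local()


class RabitError(RuntimeError):
    """Raised by the toy model where the real library would abort with a message."""


def init():
    # A second init on a thread that still has a live engine is rejected.
    if getattr(_state, "engine", None) is not None:
        raise RabitError("rabit::Init is already called in this thread")
    _state.engine = object()
    _state.initialized = True


def finalize():
    # Drop the engine but remember that this thread was initialised once.
    _state.engine = None


def rabit_call():
    # Any collective call needs a live engine on the current thread.
    if getattr(_state, "engine", None) is None:
        if getattr(_state, "initialized", False):
            raise RabitError("Doing rabit call after Finalize")
        return "default engine"
    return "ok"


def ensure_init():
    # Hypothetical guard: initialise only if this thread has no live engine.
    if getattr(_state, "engine", None) is None:
        init()


def demo():
    messages = []
    init()
    finalize()
    try:
        rabit_call()  # engine is gone, so the "after Finalize" check fires
    except RabitError as exc:
        messages.append(str(exc))
    init()  # the unconditional workaround from the test file
    try:
        init()  # the same workaround run again while the engine is live
    except RabitError as exc:
        messages.append(str(exc))
    ensure_init()  # the guarded variant is a no-op here
    finalize()
    return messages


if __name__ == "__main__":
    for message in demo():
        print(message)
```

In this model, an unconditional init fixes the "after Finalize" case but triggers the "already called" case whenever an earlier step on the same thread left an engine live, which would be consistent with a test passing alone and failing as part of the full run. Whether that is what actually happens inside dask-xgboost here is an open question.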