joblib threading issue with esig #116

terrylyons opened this issue Jun 12, 2021 · 2 comments
terrylyons commented Jun 12, 2021

The workaround
```python
# returns a list of signatures using parallel processing
import esig, joblib, numpy as np

def parallel_stream2sig(paths, depth):
    return joblib.Parallel(n_jobs=-1)(
        joblib.delayed(esig.stream2sig)(x, depth) for x in paths
    )

paths = [np.random.rand(20, 2) for _ in range(10)]

print(parallel_stream2sig(paths, 4))
```

works, but
```python
# returns a list of signatures using parallel processing
import esig, joblib, numpy as np

def parallel_stream2sig(paths, depth):
    return joblib.Parallel(n_jobs=-1, prefer="threads")(
        joblib.delayed(esig.stream2sig)(x, depth) for x in paths
    )

paths = [np.random.rand(20, 2) for _ in range(10)]

print(parallel_stream2sig(paths, 4))
```

causes the following crash:

```
Traceback (most recent call last):
  File "C:\Users\t\source\repos\esig_batch\esig_batch\esig_batch.py", line 9, in <module>
    print(parallel_stream2sig(paths,4))
  File "C:\Users\t\source\repos\esig_batch\esig_batch\esig_batch.py", line 5, in parallel_stream2sig
    return joblib.Parallel(n_jobs=-1, prefer="threads")(joblib.delayed(esig.stream2sig)(x,depth) for x in paths)
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\joblib\parallel.py", line 1054, in __call__
    self.retrieve()
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\joblib\parallel.py", line 933, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\users\t\source\Python37_64\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
  File "C:\users\t\source\Python37_64\lib\multiprocessing\pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\joblib\_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\joblib\parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\joblib\parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\esig\__init__.py", line 133, in wrapper
    return func(as_array, *args, **kwargs)
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\esig\__init__.py", line 151, in stream2sig
    backend = get_backend()
  File "C:\Users\t\source\repos\esig_batch\esig_batch\env\lib\site-packages\esig\backends.py", line 35, in get_backend
    return _BACKEND_CONTAINER.context
AttributeError: '_thread._local' object has no attribute 'context'
```
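
The AttributeError at the bottom of the trace appears to be a thread-local issue rather than a GIL issue: an attribute set on a `threading.local()` object in the thread that imported esig does not exist in the threads joblib spawns. A minimal sketch of that pattern (illustrative only, not esig's actual code) reproduces the same message:

```python
# Minimal sketch of the failure pattern: an attribute set on a
# threading.local() in the main thread does not exist in other threads.
import threading

_BACKEND_CONTAINER = threading.local()
_BACKEND_CONTAINER.context = "backend set in the main thread"

def get_backend():
    # Raises AttributeError in any thread that never set .context itself.
    return _BACKEND_CONTAINER.context

print(get_backend())  # fine in the main thread

worker = threading.Thread(target=get_backend)
worker.start()
worker.join()
# The worker thread raises (and prints to stderr):
# AttributeError: '_thread._local' object has no attribute 'context'
```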


terrylyons commented Jun 12, 2021

The joblib README says:

> When you know that the function you are calling is based on a compiled extension that releases the Python Global Interpreter Lock (GIL) during most of its computation then it is more efficient to use threads instead of Python processes as concurrent workers. For instance this is the case if you write the CPU intensive part of your code inside a with nogil block of a Cython function.

I don't know if the esig extension releases the GIL. So perhaps this is an issue too.
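One rough way to check, sketched below with a hypothetical stand-in `work` function rather than esig itself (calling esig.stream2sig from threads currently hits the AttributeError above), is to time the same batch of calls serially and again across a few threads; if the threaded run is not noticeably faster, the call is holding the GIL for most of its computation.

```python
# Hedged, generic GIL check: substitute the compiled call under test for `work`.
import time
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # Placeholder workload; pure Python, so it holds the GIL and threads give
    # no speed-up. A compiled call that releases the GIL would scale instead.
    return sum(i * i for i in range(x))

args = [200_000] * 8

t0 = time.perf_counter()
for a in args:
    work(a)
serial = time.perf_counter() - t0

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(work, args))
threaded = time.perf_counter() - t0

print(f"serial {serial:.2f}s vs 4 threads {threaded:.2f}s")
```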

@inakleinbottle
Contributor

Esig does not release the GIL at present. We can easily add this feature, but we also need some tests to make sure it works correctly.
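
A hedged sketch of the kind of test that might accompany that work (hypothetical, not part of esig's test suite; on the current release the threaded half would of course hit the AttributeError above): signatures computed from worker threads should match the serial results exactly.

```python
# Hypothetical thread-safety test for esig.stream2sig.
from concurrent.futures import ThreadPoolExecutor

import esig
import numpy as np

def test_stream2sig_thread_safe():
    rng = np.random.default_rng(0)
    paths = [rng.random((20, 2)) for _ in range(10)]
    depth = 4

    # Reference results computed serially in the main thread.
    serial = [esig.stream2sig(p, depth) for p in paths]

    # Same computation spread over worker threads.
    with ThreadPoolExecutor(max_workers=4) as pool:
        threaded = list(pool.map(lambda p: esig.stream2sig(p, depth), paths))

    for expected, got in zip(serial, threaded):
        np.testing.assert_allclose(got, expected)

if __name__ == "__main__":
    test_stream2sig_thread_safe()
```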
