[BUG] - parallel_bulk does not work in AWS lambda #94
Comments
I'm also seeing this error with Python 3.8
|
@jasongilman Did you get this error in a lambda or elsewhere? |
@wbeckler It was in a lambda. |
@jasongilman Yes it was in aws lambda. |
Is anyone up for contributing a patch that addresses this issue when /dev/shm isn't available? There's a potential drop in replacement for the multiprocessing library: https://pypi.org/project/lambda-multiprocessing/ |
At a high level, is this issue about adding Python 3.9 support (starting with CI)? |
@Aarif1430 @jasongilman Is the bug still persisting? |
CI with Python 3.9 was added in #336 and it currently passes. We need a test that reproduces this problem. |
I'm able to reproduce the issue. Create a lambda with python3.9:
It gives the error:
|
Looking at https://pypi.org/project/lambda-thread-pool/: "You cannot use "multiprocessing.Queue" or "multiprocessing.Pool" within a Python Lambda environment because the Python Lambda execution environment does not support shared memory for processes." This means we need to get rid of, or be able to swap out, the code at opensearch-py/opensearchpy/helpers/actions.py, line 470 in da436cb.
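To make the "swap" idea concrete, here is a rough sketch of a thread-based fan-out built on `concurrent.futures.ThreadPoolExecutor`, which relies on `threading` primitives rather than `multiprocessing` ones. This is not the opensearch-py implementation: `threaded_bulk`, `chunked`, and the `send_chunk` callable are hypothetical names used only for illustration, and I'm assuming the caller supplies a function that sends one chunk of actions (for example by wrapping the client's bulk call).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def threaded_bulk(
    send_chunk: Callable[[List[T]], R],  # hypothetical: sends one chunk, returns its result
    actions: Iterable[T],
    chunk_size: int = 500,
    thread_count: int = 4,
) -> Iterator[R]:
    """Send chunks of actions from worker threads, yielding results in input order."""
    with ThreadPoolExecutor(max_workers=thread_count) as executor:
        # Note: executor.map submits every chunk up front, so all chunks are
        # materialized in memory before the first result is yielded.
        yield from executor.map(send_chunk, chunked(actions, chunk_size))
```

Usage would look something like `for result in threaded_bulk(my_send_function, actions): ...`, where `my_send_function` is whatever you use to index a single batch. Because all chunks are submitted eagerly, this is only a reasonable fit when the full set of actions fits in memory.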
For an immediate workaround you can copy-paste the |
I renamed this to "parallel_bulk doesn't work in AWS lambda", is there anything else that doesn't? |
Thank you, in my case the ThreadPool is used by some sdk and it wouldn't be ideal to change. We started getting the issue when upgrading from python3.7 to 3.9. We might just find an alternative solution instead of using the sdk. |
OSError: [Errno 38] Function not implemented. I started seeing this error after upgrading to python3.9. The reason is that the opensearch bulk function uses the multiprocessing module internally, and Python's multiprocessing.pool.ThreadPool is breaking.
It looks like:
synchronize.Lock doesn't work in lambda for any version of Python (lambda has no /dev/shm, and no write access to /dev in lambda; see: https://aws.amazon.com/blogs/compute/parallel-processing-in-python-with-aws-lambda)
ThreadPool is now using synchronize.Lock from version 3.9
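If it helps anyone confirm this in their own environment, a small probe along these lines should tell you whether the lock primitive can be created at all. This is a sketch under the assumption that the failure surfaces as an `OSError` (as in the Errno 38 above) or as an `ImportError` on platforms without semaphore support; `semlock_available` is a hypothetical helper name, not part of any library.

```python
import multiprocessing


def semlock_available() -> bool:
    """Return True if a multiprocessing lock can be created in this environment."""
    try:
        # Creating the lock is what needs OS-level semaphore support.
        multiprocessing.Lock()
    except (OSError, ImportError):
        return False
    return True


if __name__ == "__main__":
    print("multiprocessing locks available:", semlock_available())
```

Calling this at the top of a handler and logging the result would show whether the runtime supports the primitive before any pool is constructed.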
To Reproduce
Steps to reproduce the behavior:
Deploy opensearch-py==1.0.0 to aws lambda
Expected behavior
The opensearch client should work as it did with python3.6.
Plugins
opensearch-py==1.0.0
Screenshots
![image](https://user-images.githubusercontent.com/19341315/143569162-781c1f21-52f8-4229-8c19-e9afe457b42a.png)
![image](https://user-images.githubusercontent.com/19341315/143569241-a16882d9-63b4-4383-bbae-7cdf1a5621cd.png)
Error screenshots