
[BUG] opensearch-py doesn't support chunked encoding with compression enabled with sigv4 AuthN/AuthZ #176

Open
kumjiten opened this issue Jun 25, 2022 · 3 comments
Labels
bug, good first issue, performance

Comments

@kumjiten

What is the bug?
The OpenSearch Python client always sends a Content-Length header and does not support chunked transfer encoding when compression is enabled.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create an OpenSearch domain in AWS, which supports IAM-based AuthN/AuthZ.
  2. Send a signed request to the OpenSearch cluster using the Python REST client (https://docs.aws.amazon.com/opensearch-service/latest/developerguide/request-signing.html#request-signing-python).
  3. Create the client with compression enabled:
search = OpenSearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    http_compress = True, # enables gzip compression for request bodies <---------
    connection_class = RequestsHttpConnection
)
  4. Observe that it sends a Content-Length header by default:
python3  client.py
-----------START-----------
PUT https://xxxxxxxx:443/movies/_doc/1?refresh=true
content-type: application/json
user-agent: opensearch-py/2.0.0 (Python 3.8.9)
accept-encoding: gzip,deflate
content-encoding: gzip
Content-Length: 78 <--------------
x-amz-date: 20220625T131237Z
x-amz-content-sha256: 70ced8b1d2572d31b43dcf4ad0c58867d4f23bbbdb3bb24d7cb0059a87465816
Authorization: AWS4-HMAC-SHA256 Credential=AKIAV7BDGZUCRKUTEG7B/20220625/eu-west-1/es/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=5e8d252a9bd11728ec2e3305a74f2cc2eeddb29e69ae102cc815ed90bcb27d34

repro code:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3
import pdb

host = '' # e.g. my-test-domain.us-east-1.es.amazonaws.com
region = 'eu-west-1' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Create the client.
search = OpenSearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    http_compress = True, # enables gzip compression for request bodies
    connection_class = RequestsHttpConnection
)

document = {
  "title": "Moneyball",
  "director": "Bennett Miller",
  "year": "2011"
}

# Send the request.
print(search.index(index='movies', id='1', body=document, refresh=True))

This causes the call to pass, but what if the content is too large and one wants to use chunked transfer encoding with compression?
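
For context, this is roughly what happens under the hood with http_compress=True (a minimal sketch, not the library's exact code): the whole serialized body is gzip-compressed in memory, so its final size is known up front and a Content-Length header is sent instead of chunked Transfer-Encoding.

import gzip

# Sketch of the current http_compress behavior (an assumption about
# the mechanism, not the library's exact code): the whole body is
# compressed in one shot, so its length is known and a Content-Length
# header can be sent. No chunking is ever needed.
body = b'{"title": "Moneyball", "director": "Bennett Miller", "year": "2011"}'
compressed = gzip.compress(body)
headers = {
    'content-type': 'application/json',
    'content-encoding': 'gzip',
    'content-length': str(len(compressed)),  # known up front, so no chunking
}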

What is the expected behavior?
It should support chunked transfer encoding with SigV4 so that large payloads work.

similar issue: opensearch-project/OpenSearch#3640

What is your host/environment?

  • OS: [e.g. iOS]
  • Version [e.g. 22]
  • Plugins

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.

Do you have any additional context?
opensearch-project/OpenSearch#3640

@harshavamsi
Collaborator

@jiten1551 Are you saying that this is a bug or a feature that you might want? The default RequestsHttpConnection does not support chunked encoding; it would take a new flag in the connection class to allow for that. But just to separate things: SigV4 already works with compressed requests using http_compress. What you're asking for is compressing and chunking, which could be a new feature?
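
For illustration only, one hypothetical shape such a flag could take. ChunkedHttpConnection below is an untested sketch, not part of opensearch-py; it relies on requests switching to chunked Transfer-Encoding when the body is a generator, and it assumes http_compress is left off so the client doesn't try to gzip the generator as if it were bytes.

from opensearchpy import RequestsHttpConnection

# Hypothetical, untested sketch; not part of opensearch-py.
# Wrapping the body in a generator makes the underlying requests
# session send it with chunked Transfer-Encoding instead of a
# Content-Length header. Assumes http_compress=False.
class ChunkedHttpConnection(RequestsHttpConnection):
    def perform_request(self, method, url, params=None, body=None,
                        timeout=None, ignore=(), headers=None):
        if body is not None:
            body = iter([body])  # generator body => chunked encoding
        return super().perform_request(method, url, params=params,
                                       body=body, timeout=timeout,
                                       ignore=ignore, headers=headers)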

@dblock
Member

dblock commented Sep 30, 2022

I think it's a feature request: enable chunked transfer encoding (and ensure it works with SigV4). A similar problem in the Java client was that setting compression would also automatically turn on chunked transfer encoding, which would work, except with SigV4.

@wbeckler removed the untriaged label Nov 3, 2022
@fabioasdias

Python requests does chunked transfer encoding automatically if a generator is passed. In fact, one could arguably bypass the API and go straight to connection.perform_request with a generator, as long as http_compress is disabled (so gzip.compress doesn't run) and the body argument is just passed along to requests...
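
To make that concrete, a minimal sketch of the technique with plain requests (the endpoint, file name, and chunk size below are made up; note this is exactly where AWS4Auth breaks, since SigV4 signing wants a hash of the full payload up front):

import zlib
import requests

def gzip_stream(chunks):
    # Incrementally gzip an iterable of byte chunks;
    # wbits=31 selects the gzip container format.
    comp = zlib.compressobj(wbits=31)
    for chunk in chunks:
        data = comp.compress(chunk)
        if data:
            yield data
    yield comp.flush()

def read_chunks(path, size=64 * 1024):
    with open(path, 'rb') as f:
        while chunk := f.read(size):
            yield chunk

# requests switches to chunked Transfer-Encoding whenever the body
# is a generator, because the total length is unknown up front.
response = requests.post(
    'https://localhost:9200/_bulk',              # placeholder endpoint
    data=gzip_stream(read_chunks('bulk.json')),  # placeholder file
    headers={'content-type': 'application/x-ndjson',
             'content-encoding': 'gzip'},
)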

@wbeckler added the good first issue label Sep 19, 2023
@dblock added the performance label Dec 5, 2023