
added the aws and gcp connectors #187

Open

Ris-code wants to merge 5 commits into main

Conversation

@Ris-code commented Aug 22, 2024

Fixes #173

@KaranrajM (Contributor):

Hi @Ris-code, can you add these env variables to docker-compose.yaml under the language service? Also, can you please add the necessary GCP and AWS packages to the pyproject.toml file using the poetry add command?

I hope the packages have been tested with keys and are working. The rest of the code looks good.

@Ris-code (Author):

> Hi @Ris-code can you add these env variables in docker-compose.yaml under language service? Also can you please add these necessary GCP and AWS packages to pyproject.toml file by using poetry add command.
>
> I hope the packages are tested with keys and its working. Rest of the code looks good.

Ok, will do it by today.

…mpose.yml and added the required packages in pyproject.toml
@Ris-code (Author):

Hi @KaranrajM, I have added the env variables and updated pyproject.toml. Please review it once.

@KaranrajM (Contributor):

Hi @Ris-code, I have left some new reviews on the latest commit. Please take a look and make the required changes when you can.

@Ris-code (Author):

> Hi @Ris-code I have left some new reviews for the latest commit. Please take a look and make the required changes when you can.

Sure.

@Ris-code (Author):

> Hi @Ris-code I have left some new reviews for the latest commit. Please take a look and make the required changes when you can.

I could not find any review. Could you please point me to where it is?

@KaranrajM (Contributor):

> Hi @Ris-code I have left some new reviews for the latest commit. Please take a look and make the required changes when you can.
>
> I could not find any review. Could you please assist me where it is?

You can see them at the top when you open this PR conversation. Let me tag you in those comments.

@@ -34,7 +34,7 @@ services:
     volumes:
       - "kafka_data:/bitnami"
       - ./scripts/create-all-topics.sh:/usr/bin/create-topics.sh
-    command: "sh -c './run.sh & /usr/bin/create-topics.sh && tail -f /dev/null'"
+    command: "sh -c './run.sh && tail -f /dev/null'"
@sameersegal (Collaborator) commented Aug 28, 2024:

Why did we remove the call to create-topics.sh? It creates the Kafka topics if they haven't been created before. It's an important step in the "single-command docker run", which makes things easy for first-time users.
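For context, the removed startup sequence can be reconstructed from the diff above; a sketch of the Kafka service (other fields omitted):

```yaml
# Sketch reconstructed from the diff above; remaining service fields omitted.
services:
  kafka:
    volumes:
      - "kafka_data:/bitnami"
      - ./scripts/create-all-topics.sh:/usr/bin/create-topics.sh
    # run.sh starts the broker in the background, create-topics.sh creates
    # the Kafka topics once the broker is up, and tail keeps the container alive.
    command: "sh -c './run.sh & /usr/bin/create-topics.sh && tail -f /dev/null'"
```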

@sameersegal (Collaborator):

@shreypandey please review the changes here

language/src/speech_processor.py (resolved)
@@ -34,7 +34,7 @@ services:
     volumes:
       - "kafka_data:/bitnami"
       - ./scripts/create-all-topics.sh:/usr/bin/create-topics.sh
-    command: "sh -c './run.sh & /usr/bin/create-topics.sh && tail -f /dev/null'"
Contributor:

Can you revert this change? It seems to be a local change and is not needed for this PR.


# AWS_DEFAULT_REGION=

# Create the S3 bucket
# BUCKET_NAME=
Contributor:

Can we change BUCKET_NAME to S3_BUCKET_NAME everywhere, to be more specific?
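One way the rename could be made backward compatible in the Python code is a small lookup helper; a sketch only, and the helper name is illustrative, not from the PR:

```python
import os


def s3_bucket_name() -> str:
    """Resolve the bucket name, preferring the more specific S3_BUCKET_NAME."""
    # Falling back to the old BUCKET_NAME keeps existing .env files working
    # while the rename suggested in review propagates everywhere.
    name = os.environ.get("S3_BUCKET_NAME") or os.environ.get("BUCKET_NAME")
    if not name:
        raise RuntimeError("S3_BUCKET_NAME is not set")
    return name
```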

language/pyproject.toml (resolved)
- AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
- AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
- AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
- BUCKET_NAME=${BUCKET_NAME}
Contributor:

Same here: change BUCKET_NAME to S3_BUCKET_NAME.


@Ris-code (Author):

@KaranrajM I have made the changes. Please review them once.

@VedantKhairnar:

Hello @Ris-code,
Will you please check the PR-Issue linking guidelines and update the PR description accordingly (the issue number needs to be mentioned)?
This helps with efficient progress tracking. Thanks.

.env.template Outdated
# Set AWS keys if using AWS instances
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
Contributor:

Can you please move these three keys up, so they sit below the Azure speech keys? Also add a comment before them saying these are AWS speech/translation keys.

Author:

Should I also move the Google application credentials below the AWS keys?

Contributor:

GCP credentials are used for both storage and language purposes, right? They can stay there. Also, have you confirmed that the packages are working?

Author:

Yes, I have tested every package and they are working fine.

Author:

@KaranrajM I have moved the AWS credentials up. Please review.


Hello @KaranrajM,
Can we please get the PR reviewed? We need it to be merged by 15th Sept for it to be considered for our ongoing challenge. Thanks.

Contributor:

Yes, sure @VedantKhairnar. I will review and try to merge it by today if everything looks good.

@KaranrajM (Contributor) left a review:

Hi @Ris-code, I have reviewed the code and some cosmetic changes are needed. Once these are done, I can merge it.

@@ -38,6 +43,12 @@ PUBLIC_URL_PREFIX= # Set Tunnel URL if using local storage
# AZURE_STORAGE_ACCOUNT_KEY=
# AZURE_STORAGE_CONTAINER=

# Create the S3 bucket
Contributor:

Can you please change this sentence to "Set S3 Bucket name when using AWS speech services" in both the .env.template and .env-dev.template files? Also, you can move S3_BUCKET_NAME below the AWS Speech/Translation keys, as it is only used there and nowhere else.

@@ -78,6 +78,11 @@ services:
- AZURE_STORAGE_ACCOUNT_KEY=${AZURE_STORAGE_ACCOUNT_KEY}
- AZURE_STORAGE_CONTAINER=${AZURE_STORAGE_CONTAINER}
- PUBLIC_URL_PREFIX=${PUBLIC_URL_PREFIX}
- AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
@KaranrajM (Contributor) commented Sep 13, 2024:

Can you move the three AWS speech/translation credentials, along with S3_BUCKET_NAME, up above the STORAGE_TYPE=${STORAGE_TYPE} line?

import requests
from botocore.exceptions import BotoCoreError, ClientError
import asyncio
import requests
Contributor:

Remove one of the requests imports; there are two present.

@@ -391,27 +398,198 @@ async def text_to_speech(
# )
return new_audio_content

Contributor:

Add a blank line here: whenever a new top-level class begins, there should be two blank lines before it.
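The convention being requested is PEP 8's rule of two blank lines around top-level definitions; a minimal sketch with placeholder class names (not the PR's actual classes):

```python
# PEP 8: top-level functions and classes are separated by two blank lines.
# Class names here are illustrative placeholders, not the PR's real classes.

class AzureSpeechProcessor:
    def name(self) -> str:
        return "azure"


class AWSSpeechProcessor:  # note the two blank lines above this class
    def name(self) -> str:
        return "aws"
```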


# raise ExceptionGroup("CompositeSpeechProcessor text to speech failed", excs)
raise ExceptionGroup("CompositeSpeechProcessor text to speech failed", excs)
Contributor:

Can you please add a new line at the end of the file?

source_language=source_language_code,
target_language=destination_language_code,
)
return response['translatedText']

Contributor:

Add a blank line before every class so that there are two blank lines in total.

- AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
- AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
- S3_BUCKET_NAME=${S3_BUCKET_NAME}
- GOOGLE_APPLICATION_CREDENTIALS=${GOOGLE_APPLICATION_CREDENTIALS}
@KaranrajM (Contributor) commented Sep 13, 2024:

We may need to mount the GOOGLE_APPLICATION_CREDENTIALS JSON file at runtime, as local files cannot be accessed in a Docker environment without either embedding them in the Docker image or mounting them at runtime. Since the GCP credentials file is highly sensitive, we can safely mount it at runtime. We can try something like this:

environment:
  - GOOGLE_APPLICATION_CREDENTIALS=/app/credentials/google-credentials.json
volumes:
  - ${GOOGLE_APPLICATION_CREDENTIALS}:/app/credentials/google-credentials.json:ro

Let me know what you think; any other approach is also welcome. Once this change is made, please test it by building the language image and running the services through it.

@Ris-code (Author):

@KaranrajM I have made all the mentioned changes and tested after mounting the GCP credentials at runtime. Please review all the changes.

@Lekhanrao linked an issue Sep 17, 2024 that may be closed by this pull request
@KaranrajM (Contributor) commented Oct 10, 2024:

Hi @Ris-code, I tried to build a Docker image of the language service using this code, and it throws an error when GOOGLE_APPLICATION_CREDENTIALS is empty in .env-dev. Can you handle this case as well? People mostly use one of the three services (Azure, AWS, or GCP), not all of them, so if GCP is not being used and GOOGLE_APPLICATION_CREDENTIALS is not set, the volume should not be mounted in docker-compose.yml. The rest of the code looks good. Please fix the docker-compose.yml change so this code can be merged. Let me know if you need any help.
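One possible way to handle the unset-variable case is a Compose default value for the bind source, so nothing sensitive is mounted when GCP is not in use. This is a sketch only: the /dev/null fallback is an assumption, not something from this PR, and a separate Compose override file enabled only for GCP users would work too.

```yaml
# Sketch: if GOOGLE_APPLICATION_CREDENTIALS is unset, bind /dev/null instead,
# so the container path exists but carries no credentials.
services:
  language:
    environment:
      - GOOGLE_APPLICATION_CREDENTIALS=/app/credentials/google-credentials.json
    volumes:
      - ${GOOGLE_APPLICATION_CREDENTIALS:-/dev/null}:/app/credentials/google-credentials.json:ro
```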

Successfully merging this pull request may close these issues.

Issue: Support for cloud speech & translation connectors
4 participants