Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI Shortening #332

Merged
merged 2 commits into from
Apr 3, 2023
Merged

CI Shortening #332

merged 2 commits into from
Apr 3, 2023

Conversation

neelasha23
Copy link

@neelasha23 neelasha23 commented Mar 29, 2023

Describe your changes

This PR aims to shorten the docs building time in every pull request.

Issue number

Closes #X

Checklist before requesting a review


📚 Documentation preview 📚: https://jupysql--332.org.readthedocs.build/en/332/

@neelasha23 neelasha23 marked this pull request as draft March 29, 2023 12:27
@neelasha23
Copy link
Author

neelasha23 commented Mar 29, 2023

  1. I'm testing out the pre-build and post-build configs in readthedocs.yml file. You meant this file right? Or we need to to add in the CI rtd.yml file through Githubs actions?
    Currently this is failing with error
Completed 3.5 MiB/3.5 MiB (0 Bytes/s) with 1 file(s) remaining
upload failed: doc/user-guide/tables-columns.md to s3://jupysql-build-docs/user-guide/tables-columns.md Unable to locate credentials

We would need to configure AWS Access key and secret key in Readthedocs Advanced settings/Environment Variables for the project ? I do not have access to this account

@edublancas

@idomic
Copy link

idomic commented Mar 29, 2023

Added a role to readthedocs, you need to assume it and use it to access the files via S3.
Also seems the CI is failing (I'm assuming due to the missing notebooks but if not please fix)

@neelasha23 neelasha23 force-pushed the docs_ci branch 4 times, most recently from 2663135 to 909eabb Compare March 31, 2023 11:02
@neelasha23
Copy link
Author

neelasha23 commented Mar 31, 2023

Added the pre_build and post_build configurations. Currently this will upload the cache files to the S3 path s3://jupysql-build-docs/Pull-request-number.

The access keys for IAM user readthedocs are hardcoded for testing purpose only. The user has permissions only for this bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::jupysql-build-docs",
                "arn:aws:s3:::jupysql-build-docs/*"
            ]
        }
    ]
}

As of now, I haven't been able to figure out how to setup the access keys in environment variables without making them public. Posted a stackoverflow question

@idomic @edublancas

@neelasha23 neelasha23 marked this pull request as ready for review March 31, 2023 11:43
@neelasha23 neelasha23 requested review from edublancas and idomic March 31, 2023 11:43
@idomic
Copy link

idomic commented Mar 31, 2023

Does it work fine with those keys directly?
I'd also try the official repo and see if they have some sort of community.

You can also try another route, maybe there's a better way of doing it without setting the env vars public.

  • When you set an environment variable in Readthedocs, it is only accessible to your Readthedocs project and its associated builds. It is not visible to the public, and it is not included in any publicly accessible files or URLs.

So essentially, setting an environment variable like AWS_SECRET_ACCESS_KEY in the Readthedocs project's "Environment variables" section does not make it public.

We can also try this (it'll allow you using assume-role via python and boot):

from s3cache import S3Cache

S3CACHE_ENABLED = True
S3CACHE_BACKEND = S3Cache(
    bucket_name=os.environ.get('S3_CACHE_BUCKET'),
    aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
)

You can try starting a side readthedocs sample account, it might be easier/faster

Copy link

@edublancas edublancas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the runtime went down from 321s to 270s, and the notebook cache is working:

/home/docs/checkouts/readthedocs.org/user_builds/jupysql/checkouts/332/doc/howto/autocompletion.md: Using cached notebook: ID=9 [mystnb]
reading sources... [ 40%] howto/csv

Installing awscli from pip is taking 15s so let's move it to the doc/environment.lock.yml file and see if that reduces runtime.

@neelasha23
Copy link
Author

Does it work fine with those keys directly?

yes it does. but that's not necessarily needed if we move those to environment variables as public keys.

You can also try another route, maybe there's a better way of doing it without setting the env vars public.

  • When you set an environment variable in Readthedocs, it is only accessible to your Readthedocs project and its associated builds. It is not visible to the public, and it is not included in any publicly accessible files or URLs.

So essentially, setting an environment variable like AWS_SECRET_ACCESS_KEY in the Readthedocs project's "Environment variables" section does not make it public.

Setting as env vars without public is not accessible in PRs like we saw yesterday. Quoting from readthedocs documentation: Check the Public option if you want to expose this environment variable to builds from pull requests.

@idomic

@neelasha23
Copy link
Author

Installing awscli from pip is taking 15s so let's move it to the doc/environment.lock.yml file and see if that reduces runtime.

This is causing some dependency conflicts: https://readthedocs.org/projects/jupysql/builds/19973640/
@edublancas

@edublancas
Copy link

ok, let's revert the change and keep awscli via a pip install as you had it before. I'm still ironing out the details of these new way of installing documentation dependencies.

@neelasha23
Copy link
Author

Have removed the hardcoded access keys and deactivated this access key as well, while I figure out what approach can be taken for configuring the keys.
@idomic @edublancas

@neelasha23
Copy link
Author

I was trying to find other relevant approaches but haven't been able to find any yet. I have received a reply from readthedocs community:

Unfortunately, we don't have a good suggestion for your use case. In the past, we've talked about implementing a "cache" config key in our YAML file for these cases: readthedocs/readthedocs.org#9183
I'd recommend you to explain there your use case and subscribe to it. It will help us to prioritize this work in the future.
One thing that I just realize that could help: you can create an S3 user only for this purpose with pretty restricted permissions (only upload/download) to a special S3 bucket only used for this purpose. This is not perfect, but at least it does not expose any risk, I guess.

We already have setup an IAM user with upload/download restrictions to one bucket only.
Can we check with AWS support if it's fine to setup the access keys as public environment variables so that we don't get notice again?

@idomic @edublancas

actions role

aws install

aws install

aws install

aws install

aws install

aws install

aws install

aws install

added assumerole

assumerole

Revert ci changes

Empty commit

cache upload

cache upload

cache upload

cache upload

folder

ls

ls

changed dir

prebuild

Empty commit

git id

git id

git id

version

Empty commit

removed logs

Empty commit

awscli dep

lock file

lock file

revert

revert

Removed access keys;

testing github actions

testing github actions

testing github actions

testing github actions

testing github actions

testing github actions

testing github actions

revert ci changes

revert ci changes
@idomic
Copy link

idomic commented Apr 3, 2023

We don’t need AWS support, set the new role and we can set up readthedocs. Make sure you don't hardcode it into git or output it into the console.

@idomic
Copy link

idomic commented Apr 3, 2023

Also seems the CI is failing?

@idomic idomic merged commit fcbac41 into ploomber:master Apr 3, 2023
@idomic
Copy link

idomic commented Apr 3, 2023

Relates to this issue: ploomber/contributing#28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants