adding AWS Setup instructions
brifordwylie committed Dec 29, 2023
1 parent b2489a9 commit 1450d17
Showing 5 changed files with 260 additions and 2 deletions.
151 changes: 151 additions & 0 deletions docs/aws_setup/aws_tips_and_tricks.md
@@ -0,0 +1,151 @@
# AWS Tips and Tricks
!!! tip inline end "Need AWS Help?"
    The SuperCowPowers team is happy to give any assistance needed when setting up AWS and SageWorks. Please contact us at [[email protected]](mailto:[email protected]) or chat with us on [Discord](https://discord.gg/WHAJuz8sw8).

This page gives guidance on setting up AWS Accounts, Users, and Groups. AWS can be tricky to set up the first time. Feel free to use any material in this guide; we're also more than happy to help clients get their AWS setup ready to go for FREE. Below are guides for creating a new AWS account for SageWorks and for setting up SSO Users and Groups within AWS.

## New AWS Account (with AWS Organizations: easy)
- If you already have an AWS Account you can activate the AWS Identity Center/Organization functionality.
- Now go to the AWS Organizations page and hit the 'Add an AWS Account' button
- Add a new User with permissions that **allow AWS Stack creation**

!!! note inline end
    If you need a 'new' email address, add a plus sign '+' and a short suffix to the part of your existing address before the '@'. With mail providers that support plus addressing, any email sent to that address is delivered to the existing mailbox, e.g. `[email protected]`
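
As a small illustration of the plus-sign trick, the sketch below builds such an alias from an existing address. The function name `plus_alias` and the `user@example.com` address are made up for this example, and it assumes your mail provider supports plus addressing (many do, but not all).

```python
def plus_alias(email: str, tag: str) -> str:
    """Return `email` with '+tag' inserted just before the '@'."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {email!r}")
    return f"{local}+{tag}@{domain}"


# Hypothetical address, for illustration only
print(plus_alias("user@example.com", "aws-sageworks"))
```

The alias can then be used as the root email for the new AWS account while mail still lands in the original inbox.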


## New AWS Account (without AWS Organizations: a bit harder)
- Go to https://aws.amazon.com/free and hit the big button 'Create a Free Account'
- Enter email and the account name you'd like (anything is fine)
- You'll get a validation email and go through the rest of the Account setup procedure
- Add a new User with permissions that **allow AWS Stack creation**


# SSO Users and Groups
AWS SSO (Single Sign-On), now called AWS IAM Identity Center, is a cloud-based service for managing access to multiple AWS accounts and business applications with a single set of credentials. Users log in once and reach all the applications and accounts they need. IT administrators get a single point of control for user access, permissions, and policies across AWS resources, which reduces the risk of unauthorized access.

## Setting up SSO Users
* Log in to your AWS account and go to the AWS Identity Center console.
* Click on the "Users" tab and then click on the "Add user" button.

The 'Add User' setup is fairly straightforward, but here are some screenshots:

On the first panel you can fill in the user's information.

<img width="800" alt="Screenshot 2023-05-03 at 9 31 30 AM" src="https://user-images.githubusercontent.com/4806709/235965493-eaa5f879-df04-473b-b98d-03d422db7272.png">

## Groups
On the second panel we suggest that you have at LEAST two groups:
- Admin group
- DataScientists group

### Setting up Groups
This allows you to put most users into the DataScientists group, which carries AWS policies matched to their job role. IAM Identity Center uses 'permission sets': you attach AWS Policies to a permission set and then assign the permission set to a group. This makes it easy to give a group of users the policies relevant to their tasks.

Our standard setup is to have two permission sets with the following policies:

- IAM Identity Center --> Permission sets --> DataScientist
    - Add Policy: arn:aws:iam::aws:policy/job-function/DataScientist

- IAM Identity Center --> Permission sets --> AdministratorAccess
    - Add Policy: arn:aws:iam::aws:policy/AdministratorAccess

See: [Permission Sets](https://docs.aws.amazon.com/singlesignon/latest/userguide/permissionsetsconcept.html) for more details and instructions.
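
For teams that prefer to script this step, the two permission sets above can be expressed as a small mapping and attached through the `sso-admin` API. This is a sketch, not part of SageWorks: `PERMISSION_SET_POLICIES` and `attach_policies` are names invented here, and the caller is assumed to pass in a boto3 `sso-admin` client, the Identity Center instance ARN, and the ARNs of permission sets that already exist.

```python
# Hypothetical helper: the names below are illustrative, not a SageWorks API.
PERMISSION_SET_POLICIES = {
    "DataScientist": ["arn:aws:iam::aws:policy/job-function/DataScientist"],
    "AdministratorAccess": ["arn:aws:iam::aws:policy/AdministratorAccess"],
}


def attach_policies(sso_admin, instance_arn, permission_set_arns):
    """Attach each AWS managed policy to its (already created) permission set.

    sso_admin:           a boto3 client, e.g. boto3.client("sso-admin")
    instance_arn:        ARN of your IAM Identity Center instance
    permission_set_arns: dict mapping permission set name -> permission set ARN
    """
    attached = []
    for name, policy_arns in PERMISSION_SET_POLICIES.items():
        for policy_arn in policy_arns:
            sso_admin.attach_managed_policy_to_permission_set(
                InstanceArn=instance_arn,
                PermissionSetArn=permission_set_arns[name],
                ManagedPolicyArn=policy_arn,
            )
            attached.append((name, policy_arn))
    return attached
```

Passing the client in as a parameter keeps the function easy to dry-run with a stand-in object before pointing it at a real account.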

Another benefit of creating groups is that you can include a group in the 'Trust Policy (assume_role)' of the SageWorks-ExecutionRole (which is deployed as part of the SageWorks AWS Stack). What SageWorks can do/see/read/write is then managed entirely through the SageWorks-ExecutionRole.
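
To make the trust-policy idea concrete, here is a sketch of what such a policy document can look like. It is an illustration under stated assumptions, not the policy the SageWorks stack actually generates: the account ID is a placeholder, `build_trust_policy` is a name invented here, and the `AWSReservedSSO_<permission set>_*` pattern follows the naming IAM Identity Center uses for the roles it provisions in an account (members of a group reach the account through the role created for their permission set). The role path may differ depending on the Identity Center region, so the pattern may need adjusting.

```python
import json


def build_trust_policy(account_id: str, permission_set: str) -> dict:
    """Trust policy letting roles provisioned for one permission set assume a role."""
    sso_role_pattern = (
        f"arn:aws:iam::{account_id}:role/aws-reserved/sso.amazonaws.com/"
        f"*/AWSReservedSSO_{permission_set}_*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"ArnLike": {"aws:PrincipalArn": sso_role_pattern}},
            }
        ],
    }


# 123456789012 is a placeholder account ID
print(json.dumps(build_trust_policy("123456789012", "DataScientist"), indent=2))
```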

## Back to Adding User
Okay, now that we have our groups set up, we can go back to our original goal of adding a user. Here's the second panel with the groups selected; now we can hit 'Next'.

<img width="800" alt="Screenshot 2023-05-03 at 9 31 49 AM" src="https://user-images.githubusercontent.com/4806709/235965818-fa44bb58-6e58-49df-93df-ba582148b3f4.png">

On the third panel just review the details and hit the 'Add User' button at the bottom. The user will get an email giving them instructions on how to log on to their AWS account.

<img width="600" alt="Screenshot 2023-05-03 at 9 32 28 AM" src="https://user-images.githubusercontent.com/4806709/235967585-d772d2f9-13ac-4795-aca3-429fbb1b7311.png">

### AWS Console
Now when the user logs on to the AWS Console they should see something like this:
<img width="800" alt="Screenshot 2023-05-03 at 9 21 27 AM" src="https://user-images.githubusercontent.com/4806709/235970829-d1fdf1a8-84a2-46ca-a20e-143664715531.png">

### SSO Setup for Command Line/Python Usage
For full instructions see [SSO Command Line/Python Configure](https://docs.aws.amazon.com/cli/latest/userguide/sso-configure-profile-token.html). Here's a quick summary:
#### Get some information
- Go to your AWS Identity Center in the AWS Console
- On the right side there will be two important pieces of information
    - Region
    - Start URL
#### Install AWS CLI
- Mac: `brew install awscli`
- Linux: TBD
- Windows: TBD

#### Running the SSO Configuration
**Note:** You only need to do this once!
```
# <the name of the new profile> can be something like bob_sso
aws configure sso --profile <the name of the new profile>
SSO session name (Recommended): my-sso
SSO start URL []: <the Start URL from info above>
SSO region []: <the Region from info above>
SSO registration scopes [sso:account:access]:
```

At this point a browser window will open (or redirect) so you can authorize the login. You will then get a list of available accounts, something like the one below. Just pick the correct account.

```
There are 2 AWS accounts available to you.
> SCP_Sandbox, [email protected] (XXXX40646YYY)
SCP_Main, [email protected] (XXX576391YYY)
```

Now pick the role that you're going to use

```
There are 2 roles available to you.
> DataScientist
AdministratorAccess
```
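
Once these prompts are answered, the AWS CLI writes the profile to `~/.aws/config`. The sketch below parses an example of that file with Python's `configparser` to show which values end up where. The account ID, start URL, and region are placeholders, `describe_profile` is a name invented here, and the exact keys can vary with CLI version, so treat the sample as illustrative rather than authoritative.

```python
import configparser

# Illustrative ~/.aws/config contents; all values are placeholders.
SAMPLE_CONFIG = """
[profile bob_sso]
sso_session = my-sso
sso_account_id = 123456789012
sso_role_name = DataScientist
region = us-west-2

[sso-session my-sso]
sso_start_url = https://example.awsapps.com/start
sso_region = us-west-2
sso_registration_scopes = sso:account:access
"""


def describe_profile(config_text: str, profile: str) -> dict:
    """Return the settings for `profile`, merged with its sso-session section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    settings = dict(parser[f"profile {profile}"])
    session_name = settings.get("sso_session")
    if session_name:
        settings.update(parser[f"sso-session {session_name}"])
    return settings


print(describe_profile(SAMPLE_CONFIG, "bob_sso"))
```

Looking at the file this way makes it clear that the account and role you picked are stored per profile, while the Start URL and Region live in the shared `sso-session` section.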

## Setting up some aliases for bash/zsh
Edit your `~/.bashrc` or `~/.zshrc` and add these aliases/helpers:

```
# AWS Aliases
alias bob_sso='export AWS_PROFILE=bob_sso'
# Default AWS Profile
export AWS_PROFILE=bob_sso
```

## Testing your new AWS Profile
Make sure your profile is active/set
```
env | grep AWS
AWS_PROFILE=<bob_sso or whatever>
```
Now you can list the S3 buckets in the AWS Account
```
aws s3 ls
```
If you get some message like this...

```The SSO session associated with this profile has expired or is otherwise invalid. To refresh this SSO session run aws sso login with the corresponding profile.```

This is fine: run `aws sso login --profile <your profile>` as the message says, and a browser will open so you can refresh your SSO token.

After that you should get a listing of the S3 buckets without needing to refresh your token.

```
❯ aws s3 ls
2023-03-20 20:06:53 aws-athena-query-results-XXXYYY-us-west-2
2023-03-30 13:22:28 sagemaker-studio-XXXYYY-dbgyvq8ruka
2023-03-24 22:05:55 sagemaker-us-west-2-XXXYYY
2023-04-30 13:43:29 scp-sageworks-artifacts
```
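
The same check can be scripted. The sketch below shells out to the AWS CLI from Python and, if the command fails (for example because the SSO token has expired), includes the login command to run in the error. `active_profile`, `s3_ls_command`, `sso_login_command`, and `list_buckets` are names made up for this example; it assumes the `aws` executable is on your `PATH`.

```python
import os
import subprocess


def active_profile(env=os.environ) -> str:
    """Profile the AWS CLI will use: AWS_PROFILE if set, else 'default'."""
    return env.get("AWS_PROFILE", "default")


def s3_ls_command(profile: str) -> list:
    return ["aws", "s3", "ls", "--profile", profile]


def sso_login_command(profile: str) -> list:
    return ["aws", "sso", "login", "--profile", profile]


def list_buckets(profile: str) -> str:
    """Run `aws s3 ls`; on failure, hint at refreshing the SSO session."""
    result = subprocess.run(s3_ls_command(profile), capture_output=True, text=True)
    if result.returncode != 0:
        hint = " ".join(sso_login_command(profile))
        raise RuntimeError(f"{result.stderr.strip()}\nTry: {hint}")
    return result.stdout


# Usage (requires the AWS CLI and a configured profile):
#   print(list_buckets(active_profile()))
```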


## AWS Resources
- [AWS Identity Center](https://docs.aws.amazon.com/singlesignon/latest/userguide/what-is.html)
- [Users and Groups](https://docs.aws.amazon.com/singlesignon/latest/userguide/users-groups-provisioning.html)
- [Permission Sets](https://docs.aws.amazon.com/singlesignon/latest/userguide/permissionsetsconcept.html)
- [SSO Command Line/Python Configure](https://docs.aws.amazon.com/cli/latest/userguide/sso-configure-profile-token.html)


105 changes: 105 additions & 0 deletions docs/aws_setup/core_stack.md
@@ -0,0 +1,105 @@
# Initial AWS Setup
Welcome to the SageWorks AWS Setup Guide. SageWorks is deployed into AWS as an AWS **Stack**, following AWS well-architected practices.

!!! warning "AWS Setup can be a bit complex"
    Setting up SageWorks with AWS can be a bit complex, but you only have to do it ONCE and SageWorks tries to make it straightforward. If you have any trouble at all, feel free to contact us at [[email protected]](mailto:[email protected]) or on [Discord](https://discord.gg/WHAJuz8sw8) and we're happy to help you with AWS for FREE.

## Two main options when using SageWorks
1. Spin up a new AWS Account for the SageWorks Stacks ([Make a New Account](aws_tips_and_tricks.md))
2. Deploy SageWorks Stacks into your existing AWS Account

Either of these options is fully supported, but we highly suggest a NEW account as it gives the following benefits:

- **AWS Data Isolation:** Data Scientists will feel empowered to play in the sandbox without impacting production services.
- **AWS Cost Accounting:** Monitor and Track all those new ML Pipelines that your team creates with SageWorks :)

## Setting up Users and Groups
If your AWS Account already has users and groups set up you can skip this, but here are our recommendations on setting up [SSO Users and Groups](aws_tips_and_tricks.md).

## Onboarding SageWorks to your AWS Account

Pull down the SageWorks repo:
```
git clone https://github.com/SuperCowPowers/sageworks.git
```

## SageWorks uses AWS Python CDK for Deployments into AWS
If you don't already have the AWS CDK installed, follow these steps:

Mac

```
brew install node
npm install -g aws-cdk
```
Linux

```
# npm is a separate package in the default Ubuntu/Debian repositories
sudo apt install nodejs npm
sudo npm install -g aws-cdk
```
For more information on Linux installs see [Digital Ocean NodeJS](https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-20-04)
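
Before moving on, it can help to confirm the tools are actually on your `PATH`. This is a small sketch using only the Python standard library; `missing_tools` is a name invented for this example, and the tool list simply mirrors the commands used above.

```python
import shutil


def missing_tools(tools=("node", "npm", "cdk"), which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("node, npm, and cdk are all installed")
```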

## Create an S3 Bucket for SageWorks
SageWorks pushes and pulls data from AWS and will use this S3 Bucket for storage and processing. You should create a **NEW** S3 Bucket; we suggest a name like `<company_name/url>-sageworks`.
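
S3 bucket names have stricter rules than the `<company_name/url>-sageworks` template might suggest: no uppercase letters, underscores, or slashes. The sketch below turns a company name into a candidate and checks it against a subset of the S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starts and ends with a letter or digit). `suggest_bucket_name` and `is_valid_bucket_name` are hypothetical helpers, and bucket names must also be globally unique, which this code cannot check.

```python
import re

# Subset of the S3 general-purpose bucket naming rules
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    return bool(_BUCKET_RE.fullmatch(name)) and ".." not in name


def suggest_bucket_name(company: str) -> str:
    """Lowercase `company`, replace runs of other characters with '-', add suffix."""
    slug = re.sub(r"[^a-z0-9]+", "-", company.lower()).strip("-")
    name = f"{slug}-sageworks"
    if not is_valid_bucket_name(name):
        raise ValueError(f"cannot derive a valid bucket name from {company!r}")
    return name


print(suggest_bucket_name("Example_Company.com"))  # example-company-com-sageworks
```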

## Deploying the SageWorks Core Stack

!!! note "AWS Stuff"
    Activate the AWS Account that's used for SageWorks deployment. For this one-time install you should use an Admin Account (or an account that has permissions to create/update AWS Stacks).

```bash
cd sageworks/aws_setup/sageworks_core
export AWS_PROFILE=<aws_admin_account>
export SAGEWORKS_BUCKET=<name of your S3 bucket>
# (optional)
export SAGEWORKS_SSO_GROUP=<your SSO group>
pip install -r requirements.txt
cdk bootstrap
cdk deploy
```
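
A misspelled variable name in an `export` line is silently accepted by the shell, so a quick pre-flight check before `cdk deploy` can save a failed deployment. This sketch is not part of the SageWorks repo; `missing_settings` is a name invented here, and the required/optional split is an assumption based on the variables this guide uses.

```python
import os

REQUIRED = ("AWS_PROFILE", "SAGEWORKS_BUCKET")
OPTIONAL = ("SAGEWORKS_SSO_GROUP",)


def missing_settings(env=os.environ):
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_settings()
if missing:
    print("Set these before running cdk deploy:", ", ".join(missing))
else:
    for name in REQUIRED + OPTIONAL:
        print(f"{name}={os.environ.get(name, '<not set>')}")
```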

## AWS Account Setup Check
After setting up the SageWorks config/AWS Account you can run this test/checking script. If the output ends with `INFO AWS Account Clamp: AOK!` you're in good shape. If not, feel free to contact us on [Discord](https://discord.gg/WHAJuz8sw8) and we'll get it straightened out for you :)

```bash
pip install sageworks
cd sageworks/aws_setup
python aws_account_check.py
<lot of print outs for various checks>
2023-04-12 11:17:09 (aws_account_check.py:48) INFO AWS Account Clamp: AOK!
```
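
If you want to automate this check (for example in a setup script), the pass/fail decision is just a test on the last line of output. A minimal sketch, assuming the success marker is exactly the `AWS Account Clamp: AOK!` text shown above; `account_check_passed` is a hypothetical helper, not part of SageWorks.

```python
SUCCESS_MARKER = "AWS Account Clamp: AOK!"


def account_check_passed(output: str) -> bool:
    """True if the last non-empty line of `output` ends with the success marker."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return bool(lines) and lines[-1].endswith(SUCCESS_MARKER)


sample = "2023-04-12 11:17:09 (aws_account_check.py:48) INFO AWS Account Clamp: AOK!\n"
print(account_check_passed(sample))  # True
```

When feeding it real output, capture both stdout and stderr, since log lines may be written to either stream.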

## Building our first ML Pipeline
Okay, now for the more significant test: we're going to build an entire AWS ML Pipeline. The script `build_ml_pipeline.py` uses the SageWorks API to build an AWS Modeling Pipeline:

- DataLoader(abalone.csv) --> DataSource
- DataToFeatureSet Transform --> FeatureSet
- FeatureSetToModel Transform --> Model
- ModelToEndpoint Transform --> Endpoint

This script will take a LONG TIME to run; most of the time is spent waiting on AWS to finalize FeatureGroup population.

```
❯ python build_ml_pipeline.py
<lot of building ML pipeline outputs>
```
After the script completes you will see that it's built out an AWS ML Pipeline and testing artifacts.

## How to Start the SageWorks Dashboard (Locally)

!!! tip inline end "Running Dashboard AWS Stack"
    For testing it's nice to run the Dashboard locally, but the SageWorks Dashboard should be deployed as an AWS Stack so that everyone in the company can use and interact with the AWS ML Pipeline Artifacts (see [AWS Dashboard Stack](dashboard_stack.md)).

```
cd sageworks/application/aws_dashboard
./dashboard
```
**Open browser to http://localhost:8080**

<figure>
<img alt="sageworks_new_light" src="https://github.com/SuperCowPowers/sageworks/assets/4806709/5f8b32a2-ed72-45f2-bd96-91b7bbbccff4">
<figcaption>SageWorks Dashboard: AWS Pipelines in a Whole New Light!</figcaption>
</figure>


## Congratulations: SageWorks is now deployed to your AWS Account
If you ran into any issues with this procedure please contact us via [Discord](https://discord.gg/WHAJuz8sw8) or email [[email protected]](mailto:[email protected]) and the SCP team will provide **free** setup and support for new SageWorks users.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/index.md
@@ -21,7 +21,7 @@ The SageWorks framework makes AWS® both easier to use and more powerful. SageWo


## Getting Started
-- Setting up SageWorks on your AWS Account: [AWS Setup](aws_setup/initial_setup.md)
+- Setting up SageWorks on your AWS Account: [AWS Setup](aws_setup/core_stack.md)
- Using SageWorks for ML Pipelines: [SageWorks API Classes](api_classes/overview.md)

## Additional Resources
4 changes: 3 additions & 1 deletion mkdocs.yml
@@ -18,7 +18,9 @@ nav:
- Model to Endpoint: core_classes/transforms/model_to_endpoint.md
- Pandas Transforms: core_classes/transforms/pandas_transforms.md
- AWS Setup:
-- Initial Setup: aws_setup/initial_setup.md
+- Initial Setup: aws_setup/core_stack.md
+- Dashboard Setup: aws_setup/dashboard_stack.md
+- AWS Tips and Tricks: aws_setup/aws_tips_and_tricks.md
- Admin:
- PyPI Release: admin/pypi_release.md
- Docker Push: admin/docker_push.md