- Initial Snowpark Overview presentation/value proposal (whiteboard), including Snowpark 100-level demos
- Meeting to set up the environment for the Snowpark Hands-on Lab
- Snowpark Hands-on Lab Workshop to walk through the 200-level demo and share the 300-level overview (leave-behind example)
- Follow-up 7 days after the HOL to discuss Snowpark use cases and pitch the Snowflake PS Data Science offering
Detailed instructions for setting up and running the 200-level Snowpark hands-on lab.
Snowpark Python Syntax Ref Doc
What You'll Need
You will need the following things before beginning:
- A Snowflake Account
- A Snowflake user created with ACCOUNTADMIN permissions. This user will be used to set things up in Snowflake.
- Anaconda Terms & Conditions accepted. See Getting Started section in Third-Party Packages. https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-packages.html#getting-started
- A Python Environment and Python IDE or Code Editor. We recommend Visual Studio Code
- Access to Git so you can fork the Snowpark_Hands-on_Lab repository and clone it locally
We recommend scheduling a call several days before the Snowpark Hands-on Lab to review the Snowflake account setup and Python environment setup and ensure all pre-work is completed. This allows participants to focus solely on running through the three (3) hands-on lab notebooks. We also recommend using your own Snowflake account for this Hands-on Lab to confirm Snowpark runs effectively within your account.
Snowflake Environment Setup
- Log into your Snowflake account and switch to ACCOUNTADMIN role
- Click on Admin and then Billing & Terms in the left side panel
- On the Billing & Terms page, read and accept the terms to continue with the workshop
- Create a new worksheet and run the Snowpark_Hands-on_Lab_SF_setup code. This code creates the required Snowflake database roles, database, schema, and warehouse and grants the required permissions. Step through and run each line to ensure all code runs without error.
- Ensure each Snowpark Hands-on Lab participant has been granted the snowpark_workshop_role role (line 12 of the setup script):
- GRANT ROLE snowpark_workshop_role to USER <user_name>;
Python Environment Setup
- Create and activate a Conda environment (or use any other Python environment with Python 3.8).
conda create --name snowpark -c https://repo.anaconda.com/pkgs/snowflake python=3.8
conda activate snowpark
- Install Snowpark for Python, Streamlit, scikit-learn, XGBoost, and other libraries in the Conda environment
conda install -c https://repo.anaconda.com/pkgs/snowflake snowflake-snowpark-python streamlit pandas notebook scikit-learn xgboost cachetools
- Update connection.json with your Snowflake account details and credentials
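Before the lab, it can help to confirm that the environment and credentials work end to end. Below is a minimal connectivity check; it is not part of the lab materials and assumes connection.json has been filled in with the keys listed in Step 3.

# Quick sanity check of the Conda environment and connection.json (run from the repo folder)
import json
from snowflake.snowpark import Session

with open("connection.json") as f:
    connection_parameters = json.load(f)  # account, user, password, role, warehouse, database, schema

session = Session.builder.configs(connection_parameters).create()
print(session.sql("SELECT CURRENT_ROLE(), CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA()").collect())
session.close()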
TIP: We suggest installing Visual Studio Code on your computer; see the Visual Studio Code homepage for a link to the download page. In the Extensions pane in VS Code, search for and install the "Python" extension (from Microsoft) and the "Snowflake" extension (from Snowflake). We also recommend installing GitHub Desktop on your computer.
Log into Snowflake from Visual Studio Code (via the Snowflake extension) using the following context:
- ROLE: snowpark_workshop_role
- DATABASE: snowpark_workshop
- SCHEMA: campaign_demo
- WAREHOUSE: snowparkws_wh
Step 1 - Walk through Snowpark Overview Deck
Snowpark Hands-on Lab Overview PPT
Step 2 - Ensure participants can access hands-on lab instructions, Snowflake, and Visual Studio Code (or other Python IDE)
Step 3 - Walk through and run the 200-level Snowpark_For_Python notebook setup
- Navigate to the GitHub repo for this lab
- Click the green Code button to get the files in GitHub Desktop
- Select the Open in Visual Studio Code button
- Go to the connection.json file and update as follows:
- account: update with your Snowflake account
- user: update with your user
- password: update with your password
- role: snowpark_workshop_role
- warehouse: snowparkws_wh
- database: snowpark_workshop
- schema: campaign_demo
- Ensure you save the updated connection.json file
TIP: Trying to determine your Snowflake account name? Log into Snowflake, click your account in the bottom left corner, and select the account to expose its details. Click to copy the account identifier, then replace the "." with a "-". For example, NXAAXGQ.LRB86899 becomes NXAAXGQ-LRB86899.
- Ensure your environment is configured to leverage Snowpark (Python 3.8.13).
- Go to the Snowpark_For_Python.ipynb file
- Ensure you have selected snowpark (Python 3.8.13) as your Python kernel (see the quick check below)
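If you are unsure whether the right kernel is active, an optional sanity check like the one below (a sketch, not part of the notebook) can be run in the first cell.

# Optional first-cell check: confirm the Python version and that Snowpark is importable
import sys
import importlib.metadata

print(sys.version)  # expect Python 3.8.x from the snowpark Conda environment
print(importlib.metadata.version("snowflake-snowpark-python"))  # installed Snowpark library version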
Step 4 - Hands-on Lab Time!
- Read through the Objective and Instructions for the Snowpark_For_Python.ipynb notebook
- Execute the code to import the required Snowpark, json, pandas, and logging libraries. Ensure you get a green check.
- Execute the code to use your connection.json file to create a Snowpark session (a minimal sketch of this pattern appears after this list)
- Walk through the notebook and execute each cell, ensuring you have no errors.
- When you encounter "YOUR TURN", update the cell as needed and execute.
TIP: Check the Snowpark_Hands-on-Lab_Solution if you need assistance.
- As you execute code in Snowpark, you can easily track how Snowflake processes Snowpark DataFrame functions from both the Python IDE and Snowflake's Query History.
- Once completed, this asset is designed to be educational and to demonstrate the architecture of Snowpark. Step 6 is a notebook designed to be a framework for training and deploying a model.
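For reference, the session-creation cell and the lazy DataFrame behavior look roughly like the sketch below. This is a minimal illustration, not the notebook's exact code; the table name CAMPAIGN_SPEND and the column TOTAL_COST are placeholders.

import json
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Create a session from the credentials saved in connection.json
with open("connection.json") as f:
    session = Session.builder.configs(json.load(f)).create()

# Build a DataFrame lazily; nothing runs in Snowflake until an action is called
df = session.table("CAMPAIGN_SPEND").filter(col("TOTAL_COST") > 1000)

# .queries shows the SQL Snowpark pushes down; the same statement appears in Query History
print(df.queries)
df.show()

The df.queries output is an easy way to correlate a notebook cell with the corresponding entry in Query History.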
Step 5 - Streamlit App
Streamlit in Snowflake (SiS) is not yet available to most Snowflake accounts. As a result, we recommend running the Streamlit_Local_App.py version for this Hands-on Lab.
- Go to the Streamlit_Local_App.py file to review the code. You do not need to run the app from the editor.
- Open a new terminal within your IDE (VS Code), or from the desktop, and cd into the folder containing Streamlit_Local_App.py
- Type "streamlit run Streamlit_Local_App.py" and press Enter.
- If this is your first time running Streamlit, it will ask you for an email address. Feel free to leave this blank and just press Enter.
- If you get an error message when running the above command, try typing "conda deactivate" and then run the "streamlit run Streamlit_Local_App.py" command again.
- A new browser window should open. At this point you can interact with the Streamlit application.
- Note that within this application, the user provides inputs, and the UDF/model stored in a stage is called to return predictions based on the data presented by the app (a minimal sketch of this pattern follows this list).
- When you are finished, make sure to press Ctrl+C (or Cmd+C) in the terminal to end the application before closing the browser tab. Otherwise the terminal will lock up and will need to be closed and reopened.
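The core pattern the app follows looks roughly like the sketch below. It is illustrative only and not the lab's actual code; the UDF name predict_roi and the input field are hypothetical stand-ins for the model deployed to a stage in the lab.

import json
import streamlit as st
from snowflake.snowpark import Session

# Connect to Snowflake with the same connection.json used by the notebooks
with open("connection.json") as f:
    session = Session.builder.configs(json.load(f)).create()

st.title("Prediction Demo")
budget = st.slider("Ad budget", min_value=0, max_value=10000, value=1000)

if st.button("Predict"):
    # predict_roi is a hypothetical Python UDF wrapping the model file staged in Snowflake
    result = session.sql(f"SELECT predict_roi({budget}) AS PREDICTION").collect()
    st.write(f"Predicted value: {result[0]['PREDICTION']}")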
Step 6 - 300-level Snowpark Leave-behind Modeling-at-Scale Example
- Go to the Snowpark_Model_At_Scale_XGBoost file. This notebook can be used as an example to follow when you want to train and deploy your first end-to-end model with Snowpark. It features a large dataset, available in your Snowflake account, and predicts the lifetime value of a customer using XGBoost regression.
- Please pay attention to when and how feature sets and predictions are written back to tables in Snowflake. These steps may look different depending on your business needs (archiving training feature sets, where the predictions should be stored, etc.). The code in the notebook is just an example of how these would be written to tables within Snowflake (a minimal sketch of this write-back pattern appears at the end of this page).
- When you have completed the Snowpark Hands-on Lab, run the Snowpark_Hands-on_Lab_SF_Cleanup script:
- Log into your Snowflake account and switch to the ACCOUNTADMIN role
- Create a new worksheet and run the Snowpark_Hands-on_Lab_SF_Cleanup code. This code drops the Snowflake database, role, and warehouse used in the hands-on lab.
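For reference, the write-back pattern called out in Step 6 looks roughly like the sketch below. It is a minimal illustration under assumed names; CUSTOMER_ORDERS, CUSTOMER_FEATURES, CUSTOMER_LTV_PREDICTIONS, and the UDF predict_ltv are placeholders, not objects created by the lab.

import json
from snowflake.snowpark import Session

with open("connection.json") as f:
    session = Session.builder.configs(json.load(f)).create()

# Build a feature set (here, a trivial aggregation) and persist it to a table
features_df = session.table("CUSTOMER_ORDERS").group_by("CUSTOMER_ID").count()
features_df.write.mode("overwrite").save_as_table("CUSTOMER_FEATURES")

# Score the features with a (hypothetical) deployed UDF and persist the predictions
predictions_df = session.sql(
    "SELECT CUSTOMER_ID, predict_ltv(COUNT) AS PREDICTED_LTV FROM CUSTOMER_FEATURES"
)
predictions_df.write.mode("overwrite").save_as_table("CUSTOMER_LTV_PREDICTIONS")

Whether you overwrite or append, and whether predictions live alongside the features or in their own table, is the business decision the notebook leaves to you.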