- For this CodeAlong, we will be working with the Yelp API.
- You will use the Yelp API to search your home town for a cuisine type of your choice.
- Next class, we will use Plotly Express with the Mapbox API to create a map that visualizes the results.
- Part 1:
    - Yelp API:
        - Getting Started:
    - YelpAPI python package: https://github.com/gfairchild/yelpapi
- Part 2:
    - Plotly Express: https://plotly.com/python/getting-started/
        - With Mapbox API: https://www.mapbox.com/
        - px.scatter_mapbox
Documentation:
- Plotly Express: https://plotly.com/python/getting-started/
- Efficient API Calls Lesson Link: https://login.codingdojo.com/m/376/12529/88078
# Standard Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Additional Imports
import os, json, math, time
from yelpapi import YelpAPI
from tqdm.notebook import tqdm_notebook
Check the official API documentation to see which arguments we can use in a search: https://www.yelp.com/developers/documentation/v3/business_search
# Load API Credentials
# Instantiate YelpAPI Variable
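One way these two steps could look, as a sketch only. It assumes the credentials are stored in a JSON file with an `api-key` entry; the file path in the usage comment and the key name are hypothetical, so substitute whatever you set up on your own machine. The helper names `load_credentials` and `make_yelp_client` are made up for this illustration.

```python
import json
import os


def load_credentials(path):
    """Read API credentials from a JSON file and return them as a dict."""
    with open(os.path.expanduser(path)) as f:
        return json.load(f)


def make_yelp_client(credentials, key_name="api-key"):
    """Build a YelpAPI client from a credentials dict.

    key_name is an assumption about how the key is stored in the file.
    """
    # Imported here so load_credentials works even without yelpapi installed
    from yelpapi import YelpAPI
    return YelpAPI(credentials[key_name], timeout_s=5.0)


# Example usage (hypothetical path; adjust to your own machine):
# login = load_credentials("~/.secret/yelp_api.json")
# yelp_api = make_yelp_client(login)
```

Keeping the key in a file outside the project folder means it never ends up in the notebook or in a public repository.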
# set our API call parameters and filename before the first call
## Specify folder for saving data
# Specifying JSON_FILE filename (can include a folder)
JSON_FILE = None
## Check if JSON_FILE exists
## If it does not exist:
## CREATE ANY NEEDED FOLDERS
# Get the Folder Name only
## If JSON_FILE included a folder:
# create the folder
## INFORM USER AND SAVE EMPTY LIST
## save the first page of results
## If it exists, inform user
## Load previous results and use len of results for offset
## set offset based on previous results
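The outline above could be filled in roughly as follows. This is a minimal sketch using only `os` and `json`; the function name `prepare_results_file` is hypothetical, and it saves an empty list when the file is missing so that later steps can always load and extend it.

```python
import json
import os


def prepare_results_file(json_file):
    """Create json_file (holding an empty list) if needed and return the offset.

    The offset is the number of results already saved, so a restarted
    run can pick up where the previous one stopped.
    """
    if not os.path.isfile(json_file):
        # Create any needed folders (dirname is "" when no folder is given)
        folder = os.path.dirname(json_file)
        if folder:
            os.makedirs(folder, exist_ok=True)

        print(f"[i] {json_file} not found. Saving empty list to file.")
        with open(json_file, "w") as f:
            json.dump([], f)
        return 0

    print(f"[i] {json_file} already exists.")
    with open(json_file) as f:
        previous_results = json.load(f)
    return len(previous_results)
```

Calling `n_results = prepare_results_file(JSON_FILE)` would then give the offset for the first API call: 0 for a fresh file, or the number of businesses already saved.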
- We will use this first result to check:
    - How many total results are there?
    - Where is the actual data we want to save?
    - How many results do we get at a time?
# use our yelp_api variable's search_query method to perform our API call
## How many results total?
- Where is the actual data we want to save?
## How many did we get the details for?
results_per_page = None
results_per_page
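To illustrate the three checks without spending an API call, here is a sketch that runs on a hand-made stand-in for a search response. The `businesses`, `total`, and `region` keys follow the Yelp business search response format; the values are invented. With the real client, the response would come from something like `yelp_api.search_query(location=LOCATION, term=TERM)`, where `LOCATION` and `TERM` are hypothetical variable names for your own search parameters.

```python
def summarize_response(results):
    """Return (total_results, results_per_page) for one search response."""
    total_results = results["total"]                # how many matches exist overall
    results_per_page = len(results["businesses"])   # how many came back this time
    return total_results, results_per_page


# Hand-made stand-in for a real response, for illustration only
fake_results = {
    "businesses": [
        {"id": "a1", "name": "Example One"},
        {"id": "b2", "name": "Example Two"},
    ],
    "total": 5,
    "region": {"center": {"latitude": 0.0, "longitude": 0.0}},
}

print(fake_results.keys())
total_results, results_per_page = summarize_response(fake_results)
print(total_results, results_per_page)
```

The list under `businesses` is the data worth saving; `total` only tells us how many pages we still need to request.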
- Calculate how many pages of results needed to cover the total_results
# Use math.ceil to round up for the total number of pages of results.
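As a worked example of the page calculation, here is a small sketch. The function name `pages_remaining` is hypothetical; `n_results` is the number of results already saved, as used for the offset elsewhere in this notebook.

```python
import math


def pages_remaining(total_results, n_results, results_per_page):
    """Number of API calls still needed to cover all results."""
    # Round up: a partly filled last page still costs one call
    return math.ceil((total_results - n_results) / results_per_page)


# 105 results at 20 per page need 6 calls; the last page holds only 5
n_pages = pages_remaining(total_results=105, n_results=0, results_per_page=20)
print(n_pages)
```

Rounding down instead would silently drop the final, partly filled page.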
for i in tqdm_notebook( range(1,n_pages+1)):
## The block of code we want to TRY to run
## Read in results in progress file and check the length
## save number of results to use as offset
## use n_results as the OFFSET
## append new results and save to file
## What to do if we get an error/exception.
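The loop outlined above might look like the sketch below. To keep it runnable without an API key, the search call is passed in as `search_fn`, a hypothetical callable that takes an offset and returns a response dict; with the real client it could wrap `yelp_api.search_query(..., offset=offset)`. The progress bar from `tqdm_notebook` is left out here, and the function name `collect_pages` and the `pause_s` parameter are assumptions made for the illustration.

```python
import json
import time


def collect_pages(search_fn, json_file, n_pages, pause_s=0.2):
    """Append one page of results to json_file per iteration.

    search_fn(offset) must return a dict with a "businesses" list.
    """
    for _ in range(n_pages):
        try:
            time.sleep(pause_s)  # short pause between calls

            # Read the results saved so far; their count is the next offset
            with open(json_file) as f:
                previous_results = json.load(f)
            n_results = len(previous_results)

            results = search_fn(n_results)

            # Append the new page and write everything back to the file
            previous_results.extend(results["businesses"])
            with open(json_file, "w") as f:
                json.dump(previous_results, f)

        except Exception as e:
            # Stop on the first failure; results saved so far stay on disk
            print(f"[!] ERROR: {e}")
            break
```

Because the file is rewritten after every page, an error or a dropped connection costs at most one page of work, and rerunning the loop resumes from the saved offset.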
df = None
## convert the filename to a .csv.gz
csv_file = JSON_FILE.replace('.json','.csv.gz')
csv_file
## Save it as a compressed csv (to save space)
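A sketch of the load-and-save step, assuming the JSON file holds a list of business records that each have an `id` key, as Yelp business results do. The function name `json_to_csv_gz` is hypothetical. Deduplicating on `id` only is a deliberate choice: some columns hold dicts or lists, which a plain `drop_duplicates()` cannot hash.

```python
import json

import pandas as pd


def json_to_csv_gz(json_file):
    """Load saved results, drop repeated businesses, and save as .csv.gz."""
    with open(json_file) as f:
        records = json.load(f)
    df = pd.DataFrame(records)

    # Nested columns are unhashable, so compare rows on the "id" column only
    df = df.drop_duplicates(subset="id")

    # Same name as the JSON file, with a compressed-CSV extension
    csv_file = json_file.replace(".json", ".csv.gz")
    df.to_csv(csv_file, compression="gzip", index=False)
    return df, csv_file
```

Usage would be `df, csv_file = json_to_csv_gz(JSON_FILE)`; `pd.read_csv(csv_file)` can read the result back, since pandas infers gzip compression from the `.gz` extension.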
size_json = os.path.getsize(JSON_FILE)
size_csv_gz = os.path.getsize(csv_file)
print(f'JSON FILE: {size_json:,} Bytes')
print(f'CSV.GZ FILE: {size_csv_gz:,} Bytes')
print(f'The csv.gz is {size_json/size_csv_gz:.2f} times smaller!')