To solve these tasks, you must use both Python and Pandas, but otherwise feel free to pip install
whatever library you would like. These tasks are designed to test your ability to:
- Call REST APIs
- Import and export data as CSVs
- Clean messy data
- Perform Pandas operations such as grouping and joining
- Write Python
Once you have completed the assignment, please email the Python files to [email protected].
- This task is not restricted in any way, so feel free to use Google, Stack Overflow, etc. The only thing you cannot do is ask a friend for help!
- This task should not take you longer than 3 hours. It took the last developer about 1 hour to complete it fully. If you exceed three hours, please stop. This test is not designed to be an all-day assignment. We made several of the questions bonus specifically so you don't have to go over time.
- Using a GET request, pull down appointments.csv and people.csv from these links and convert them into DataFrames:
- Replace \n in the address column with a `" "`.
- Extract the `zipcode` into a new column called `zipcode` using regex. For this test, you can assume that the `zipcode` will always be a five-digit number. Hint: Pandas has a method called `str.extract`.
- Bonus: Replace all phone numbers not in the format `XXX-XXX-XXXX` with the string "INVALID". For example, 779-477-6793 is valid while 779-477-6793-121 is not.
- Bonus: Replace `weight_in_lbs` values below 90 or above 250 with `np.nan`.
- Bonus: Create a `bmi` column: formula here.
- Bonus: Create a `metabolic_syndrome` column which is `True` if all of the following hold:
  - The participant has a `bmi` > 30
  - `has_hypertension` is true
  - `has_diabetes` is also true
- Bonus: Create another column, `BMI_by_sex`, which contains the average BMI of each sex. Hint: This will require you to `INNER JOIN` the datasets. Be sure to remove any duplicate `participant_id` from both the person and appointments DataFrames before you do!
- Bonus: Export the combined dataset as `combined.csv`.
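
The cleaning and joining steps above could be sketched roughly as follows. This is only one possible approach, and it rests on several assumptions the task text does not confirm: people.csv is assumed to hold `participant_id`, `address`, `sex`, and a column named `phone`; appointments.csv is assumed to hold `participant_id`, `weight_in_lbs`, boolean `has_hypertension` and `has_diabetes` columns, and a hypothetical `height_in_inches` column; the zipcode is assumed to be the last thing in the address; and the BMI formula is assumed to be the standard imperial one (703 × lbs / in²), since the linked formula is not reproduced here. The function names `fetch_csv`, `clean_people`, `clean_appointments`, and `combine` are illustrative, and `urllib.request` stands in for whichever HTTP library you prefer.

```python
import io
import urllib.request

import numpy as np
import pandas as pd


def fetch_csv(url: str) -> pd.DataFrame:
    """Download a CSV with a GET request and parse it into a DataFrame."""
    with urllib.request.urlopen(url) as response:  # urlopen sends a GET by default
        return pd.read_csv(io.StringIO(response.read().decode("utf-8")))


def clean_people(people: pd.DataFrame) -> pd.DataFrame:
    """Address, zipcode, and phone steps. `phone` is an assumed column name."""
    out = people.copy()
    out["address"] = out["address"].str.replace("\n", " ", regex=False)
    # Assumption: the five-digit zipcode closes the address string.
    out["zipcode"] = out["address"].str.extract(r"\b(\d{5})\s*$", expand=False)
    # fullmatch rejects anything longer than XXX-XXX-XXXX, e.g. a trailing "-121".
    is_valid = out["phone"].astype(str).str.fullmatch(r"\d{3}-\d{3}-\d{4}")
    out["phone"] = out["phone"].where(is_valid, "INVALID")
    return out


def clean_appointments(appts: pd.DataFrame) -> pd.DataFrame:
    """Weight, bmi, and metabolic_syndrome steps."""
    out = appts.copy()
    weight = out["weight_in_lbs"]
    out["weight_in_lbs"] = weight.mask((weight < 90) | (weight > 250), np.nan)
    # Assumption: imperial BMI formula and a hypothetical `height_in_inches` column.
    out["bmi"] = 703 * out["weight_in_lbs"] / out["height_in_inches"] ** 2
    # Assumption: the two flag columns already hold booleans.
    out["metabolic_syndrome"] = (
        (out["bmi"] > 30) & out["has_hypertension"] & out["has_diabetes"]
    )
    return out


def combine(people: pd.DataFrame, appts: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate on participant_id, inner join, and add BMI_by_sex."""
    people = people.drop_duplicates(subset="participant_id")
    appts = appts.drop_duplicates(subset="participant_id")
    combined = people.merge(appts, on="participant_id", how="inner")
    # transform("mean") broadcasts each group's mean back onto its rows.
    combined["BMI_by_sex"] = combined.groupby("sex")["bmi"].transform("mean")
    return combined
```

Under these assumptions, `combine(clean_people(people), clean_appointments(appointments)).to_csv("combined.csv", index=False)` would write the final file.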
Write a simple Request class whose constructor takes a single parameter, `path`. A path consists of exactly two elements, `resource` and `id`, in the form `resource_name/id`.
- Validate that the path that was passed is a string. If it is not, raise a `TypeError` with the message "Path should be a string with two elements".
- Split the path on the "/" delimiter. You do not need to handle the case where a wrong delimiter was passed in.
- Check that the first element of the path is a word. Hint.
- Assign it to an instance variable `resource` if this is true, and raise `ValueError` if it is not.
- Check that the second element is an integral number.
- Assign it to an instance variable `id` if this is true, and raise `ValueError` if it is not.
- If these assignments worked out, go ahead and set a boolean instance variable `valid`.
- All the code you write will be in the constructor of a class `Request` found in a module `server`.
- Unit tests are provided for you that reiterate the above specification of tasks.
- To test your code, run the following from your project root directory: `python -m unittest`
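
For orientation, the constructor steps listed above might look something like the sketch below. It is not checked against the provided unit tests and makes a few assumptions the specification leaves open: "a word" is read as alphabetic characters only (so an underscore would be rejected), "an integral number" is read as ASCII digits and stored as an `int`, a path without exactly two elements raises `ValueError`, and `valid` starts as `False`. The `ValueError` messages are made up for illustration.

```python
class Request:
    """Parse a path of the form resource_name/id."""

    def __init__(self, path):
        self.valid = False  # assumption: starts False until every check passes

        if not isinstance(path, str):
            raise TypeError("Path should be a string with two elements")

        parts = path.split("/")
        # Assumption: anything other than exactly two elements is a ValueError.
        if len(parts) != 2:
            raise ValueError("Path should have exactly two elements")
        resource, raw_id = parts

        # Assumption: "a word" means alphabetic characters only.
        if not resource.isalpha():
            raise ValueError("Resource should be a word")
        self.resource = resource

        # isascii() guards against digit-like characters that int() cannot parse.
        if not (raw_id.isascii() and raw_id.isdigit()):
            raise ValueError("Id should be an integral number")
        self.id = int(raw_id)

        self.valid = True
```

With this sketch, `Request("users/42")` would end up with `resource == "users"`, `id == 42`, and `valid == True`, while `Request(42)` raises `TypeError` and `Request("users/abc")` raises `ValueError`.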