chore: Add introduction to NumPy arrays and vectorized operations
marc committed Aug 9, 2024
1 parent 8203d0b commit 420241c
Showing 25 changed files with 8,405 additions and 341 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
Code_Examples/Day 4/Data/Pancreas
.vscode
.vscode
Assignments/*.ipynb
4,111 changes: 4,111 additions & 0 deletions Assignments/Data/CB_Reptor2_agonists.csv

Large diffs are not rendered by default.

183 changes: 183 additions & 0 deletions Assignments/Py02-Assignments.md
@@ -0,0 +1,183 @@
# Assignments (Py02)

## Problem 1: Patient Monitoring and Alert System

### Hard

**Scenario:**
A hospital has a patient monitoring system that tracks the vital signs of patients in the intensive care unit (ICU). Each patient’s heart rate is recorded every minute and stored in an array. The array for each patient contains the heart rate data for the last 24 hours (1440 minutes). The hospital wants to implement an alert system that detects potential issues based on abnormal heart rate patterns.

**Problem Statement:**
Write a program that takes in the heart rate data for a patient as an array of 1440 integers and checks for the following:
1. **Tachycardia Alert:** If the heart rate exceeds 100 bpm for 15 consecutive minutes or more, raise a Tachycardia alert.
2. **Bradycardia Alert:** If the heart rate drops below 60 bpm for 10 consecutive minutes or more, raise a Bradycardia alert.

The program should output the times (in minutes from the start of the 24-hour period) at which alerts were triggered. The example below reports each alert as a `(start, end, length)` tuple, grouped by alert type.

**Array Usage:**

- The heart rate data is stored in an array of integers.
- The program uses the array to check for consecutive abnormal values and raises alerts accordingly.

```python
# Example usage (dummy data)
import numpy as np

# Fix the random seed so the dummy data is reproducible
np.random.seed(42)
# Generate dummy heart rate data
high_phase_data = np.random.randint(95, 130, size=500)
normal_phase_data = np.random.randint(50, 120, size=440)
low_phase_data = np.random.randint(35, 70, size=500)
heart_rate_data = np.concatenate([high_phase_data, normal_phase_data, low_phase_data])
# Check for alerts
alerts = check_heart_rate(heart_rate_data)
print(alerts) # Print the detected alerts

# {'Tachycardia': [(47, 65, 19), (234, 248, 15), (315, 329, 15), (404, 439, 36)], 'Bradycardia': [(990, 1001, 12), (1090, 1100, 11), (1166, 1179, 14), (1374, 1388, 15)]}

```

*If you stop here and code this function by yourself, you will learn a lot. (HARD)*

---
---
---

### Medium

To solve the problem more efficiently using array functions and reduce the number of iterations, follow these steps:

#### 1. **Identify Consecutive Segments**

- Use a **sliding window** or **filtering technique** to find segments of the array where the heart rate is consistently above 100 bpm or below 60 bpm.
- For each alert type, you can create a binary mask (an array of 0s and 1s) with one entry per value in the array, where:
  - 1 indicates the heart rate exceeds 100 bpm (Tachycardia mask) or drops below 60 bpm (Bradycardia mask).
  - 0 indicates the condition for that alert is not met.
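
The masking step can be sketched on a tiny array. This is only an illustration: the readings are made up, and the names `sample`, `high_bits` and `low_bits` are not part of the assignment.

```python
import numpy as np

# Hypothetical toy readings, not real patient data
sample = np.array([72, 104, 110, 58, 55, 99])

# Comparisons on an array are applied element-wise and return booleans
high_mask = sample > 100   # True where the reading exceeds 100 bpm
low_mask = sample < 60     # True where the reading is below 60 bpm

# Cast to integers to get the 0/1 form described above
high_bits = high_mask.astype(int)
low_bits = low_mask.astype(int)

print(high_bits)  # [0 1 1 0 0 0]
print(low_bits)   # [0 0 0 1 1 0]
```

No explicit loop is needed: the comparison is evaluated for all elements at once.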

#### 2. **Find Runs of 1s**

- Use **array operations** to identify consecutive runs of 1s in the binary mask.
- This can be achieved by using functions that help detect the start and end of each run, such as **`diff`**, **`cumsum`**, or by combining conditions.
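
One possible way to locate runs with `np.diff` is sketched below on a short, made-up 0/1 mask. The padding with a zero on each side is an assumption of this sketch (it makes runs at the array edges visible), and the variable names are illustrative.

```python
import numpy as np

# Hypothetical 0/1 mask with two runs of 1s: indices 1-3 and 6-7
bits = np.array([0, 1, 1, 1, 0, 0, 1, 1])

# Pad with a 0 on each side so runs touching either edge still produce
# a 0->1 and a 1->0 transition
padded = np.concatenate(([0], bits, [0]))
steps = np.diff(padded)

run_starts = np.where(steps == 1)[0]      # index of the first 1 in each run
run_ends = np.where(steps == -1)[0] - 1   # index of the last 1 in each run

print(run_starts)  # [1 6]
print(run_ends)    # [3 7]
```

A value of `1` in `steps` marks a switch from 0 to 1, and `-1` marks a switch from 1 to 0, so every run contributes exactly one start and one end.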

#### 3. **Filter Based on Length**

- Once you have the segments where the heart rate is consistently abnormal, filter these segments by their length:
  - For Tachycardia, only keep segments where the length is 15 minutes or more.
  - For Bradycardia, only keep segments where the length is 10 minutes or more.
- The length of a segment is determined by counting the consecutive 1s.
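
As a sketch of the filtering step, assume the run boundaries are already available as two arrays of inclusive indices. The values below are invented for illustration, and boolean indexing is just one way to apply the length criterion.

```python
import numpy as np

# Hypothetical run boundaries (inclusive indices), e.g. from a diff-based step
run_starts = np.array([3, 40, 200])
run_ends = np.array([9, 58, 214])
min_length = 15  # Tachycardia threshold from the problem statement

# Inclusive boundaries, so the length is end - start + 1
run_lengths = run_ends - run_starts + 1

# Boolean indexing keeps only the runs that are long enough
keep = run_lengths >= min_length
long_runs = list(zip(run_starts[keep].tolist(),
                     run_ends[keep].tolist(),
                     run_lengths[keep].tolist()))

print(long_runs)  # [(40, 58, 19), (200, 214, 15)]
```

The first run is only 7 minutes long and is dropped. A run of exactly 15 minutes is kept because the criterion is "15 minutes or more".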

#### 4. **Extract Start and End Times**

- For each valid segment that meets the length criteria, calculate the start and end times.
- Use the indices of the start and end of the segments to determine the exact times (in minutes from the start of the 24-hour period).
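
Because there is one reading per minute, an array index already is the time in minutes. If a clock-style label is also wanted, the index can be split with `divmod`. The helper name `minute_to_clock` is made up for this sketch, and it assumes minute 0 is the start of the recording rather than a particular time of day.

```python
def minute_to_clock(minute):
    """Format a minute offset (0-1439) as HH:MM since the start of recording."""
    hours, minutes = divmod(int(minute), 60)
    return f"{hours:02d}:{minutes:02d}"

# Hypothetical alert given as (start, end) minute indices
start, end = 47, 65
start_label = minute_to_clock(start)
end_label = minute_to_clock(end)
print(start_label, end_label)  # 00:47 01:05
```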

#### 5. **Output the Alerts**

- Compile the start and end times of all valid Tachycardia and Bradycardia alerts into a list.
- Return or print this list.

#### Efficiency Considerations

- By converting the heart rate data into binary masks and then using array operations to detect and filter segments, you minimize the need for explicit loops.
- This approach leverages vectorized operations, which are typically faster than manual iteration in Python. The gain is modest for a single 1440-element array but grows with the length of the recording or the number of patients.
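
To make the contrast concrete, the sketch below counts abnormal readings once with an explicit loop and once with a vectorized expression, and checks that both agree. The function names and the random test data are assumptions for illustration. Timings are not shown because they depend on the machine; the standard-library `timeit` module can be used to measure them.

```python
import numpy as np

def count_high_loop(values, limit=100):
    """Count readings above the limit with an explicit Python loop."""
    count = 0
    for v in values:
        if v > limit:
            count += 1
    return count

def count_high_vectorized(values, limit=100):
    """Same count, computed with one array comparison and a sum."""
    return int(np.sum(values > limit))

rng = np.random.default_rng(0)               # seeded only for reproducibility
readings = rng.integers(40, 140, size=1440)  # hypothetical 24 hours of data

loop_count = count_high_loop(readings)
vector_count = count_high_vectorized(readings)
print(loop_count == vector_count)  # True
```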

#### Summary of Steps Using Array Functions

1. **Create binary masks** for abnormal heart rates.
2. **Identify consecutive runs** of abnormal values using array operations.
3. **Filter** these runs by duration to determine valid alerts.
4. **Calculate and store** the times of these alerts.
5. **Return the results** in a structured format.

*If you stop here and code this function with the help of the instructions above, you will still learn a lot. (MEDIUM)*

---
---
---

### Easy

Fill in the TODOs in the code below to complete the function.

Here is a code skeleton that outlines the process using array functions to minimize iterations:

```python
import numpy as np

def check_heart_rate(heart_rate_data):
    # Step 1: Create binary masks
    tachy_mask = ...  # TODO: mask for Tachycardia
    brady_mask = ...  # TODO: mask for Bradycardia

    # Step 2: Identify consecutive runs of abnormal values
    # Calculate differences to find the start and end of runs
    tachy_diff = ...  # TODO: calculate diff for Tachycardia
    brady_diff = ...  # TODO: calculate diff for Bradycardia

    # Find the indices where runs start and end
    tachy_start_indices = np.where(tachy_diff == 1)[0]
    tachy_end_indices = np.where(tachy_diff == -1)[0] - 1

    brady_start_indices = ...  # TODO
    brady_end_indices = ...  # TODO

    # Step 3: Filter by length of runs
    tachy_alerts = []
    brady_alerts = []

    for start, end in zip(tachy_start_indices, tachy_end_indices):
        length = end - start + 1
        if length >= 15:  # Check if the run is 15 minutes or more
            tachy_alerts.append((start, end, length))

    # TODO: same loop for brady_alerts (minimum length 10)

    # Step 4: Combine the alerts
    alerts = {
        "Tachycardia": tachy_alerts,
        "Bradycardia": brady_alerts
    }

    return alerts

```

#### Breakdown of the Python Code

1. **Binary Masks:**
   - `tachy_mask` and `brady_mask` are arrays where each element is `True` if the corresponding heart rate meets the criteria for Tachycardia or Bradycardia, respectively.

2. **Identifying Consecutive Runs:**
   - `np.diff` is used to find where the mask switches from 0 to 1 (a difference of `1`, indicating the start of a run) and from 1 to 0 (a difference of `-1`, indicating the end of a run).
   - Padding the mask with a `0` at both ends before calling `np.diff` ensures that runs touching the beginning or end of the array are detected. The mask must hold integers at this point, because `np.diff` on a boolean array returns booleans and cannot distinguish a start from an end.

3. **Filtering Runs:**
   - The code checks each run to see if it meets the minimum duration required for an alert.
   - Only runs that are long enough are stored in the `tachy_alerts` or `brady_alerts` lists.

4. **Combining Alerts:**
   - The alerts are stored in a dictionary and returned.
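
The following small sketch illustrates why the data type passed to `np.diff` matters. The mask values are invented, and concatenating with integer zeros is one possible way to obtain signed differences, not necessarily the intended solution.

```python
import numpy as np

mask = np.array([True, True, False, True])  # hypothetical boolean mask

# np.diff on a boolean array returns booleans (True where neighbours differ),
# so a start cannot be told apart from an end
bool_steps = np.diff(mask)
print(bool_steps)  # [False  True  True]

# Concatenating with integer zeros promotes the mask to integers and
# adds the boundary transitions
int_steps = np.diff(np.concatenate(([0], mask, [0])))
print(int_steps)  # [ 1  0 -1  1 -1]
```

In the integer version, the run starting at index 0 and the run ending at the last index both show up, which would not be the case without the padding.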

*You have finished this problem once you complete the code above and test it with the example usage. (EASY)*

---
---
---

## Problem 2: Medical Imaging Pixel Intensity Analysis

**Scenario:**
A radiologist is analyzing a grayscale MRI image represented as a 2D array, where each element in the array represents the intensity of a pixel (ranging from 0 to 255). The radiologist wants to identify regions of interest (ROI) where the pixel intensity exceeds a certain threshold, indicating possible abnormalities.

**Problem Statement:**
Write a program that takes a 2D array representing an MRI scan and a threshold value as input. The program should:

1. Identify all contiguous regions (clusters) in the array where the pixel intensity exceeds the threshold.
2. For each cluster, calculate the size of the region (number of pixels) and the average intensity.
3. Return a list of clusters with their respective sizes and average intensities.

**Array Usage:**

- The MRI scan data is stored in a 2D array of integers.
- The program uses the array to identify contiguous regions (clusters) of high intensity, calculate their sizes, and determine the average intensity within each cluster.
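
One possible approach is a flood fill over the thresholded image, sketched below. The problem statement does not say which pixels count as neighbours, so this sketch assumes 4-connectivity (up, down, left, right). The function name `find_clusters` and the 4x4 test image are made up for illustration.

```python
from collections import deque

import numpy as np

def find_clusters(image, threshold):
    """Return (size, mean intensity) for each 4-connected region above threshold."""
    above = image > threshold
    visited = np.zeros(image.shape, dtype=bool)
    n_rows, n_cols = image.shape
    clusters = []

    for r in range(n_rows):
        for c in range(n_cols):
            if not above[r, c] or visited[r, c]:
                continue
            # Breadth-first search collects every pixel connected to (r, c)
            queue = deque([(r, c)])
            visited[r, c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append(int(image[y, x]))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and above[ny, nx] and not visited[ny, nx]):
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            clusters.append((len(pixels), sum(pixels) / len(pixels)))
    return clusters

# Hypothetical 4x4 "scan" with two bright regions
scan = np.array([
    [200, 210,  10,  10],
    [190,  10,  10,  10],
    [ 10,  10,  10, 250],
    [ 10,  10, 240, 230],
])
result = find_clusters(scan, threshold=128)
print(result)  # [(3, 200.0), (3, 240.0)]
```

The threshold mask is built with a single array comparison, while the region growing itself uses explicit loops. Libraries such as SciPy offer ready-made labelling routines (for example `scipy.ndimage.label`) that could replace the search.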

Both problems can be solved efficiently with arrays and basic array operations.
File renamed without changes.
File renamed without changes.
51 changes: 51 additions & 0 deletions Coding/Data/high_risk_patients.csv
@@ -0,0 +1,51 @@
Patient ID,Name,Age,Gender,BMI,Blood Pressure,Chronic Condition,Smoking Status,Physical Activity Level
P001,Alice Smith,72,Female,32.1,145,Yes,Non-Smoker,Low
P002,Bob Jones,65,Male,29.4,135,No,Smoker,Moderate
P003,Charlie Brown,50,Male,31.5,130,Yes,Non-Smoker,High
P004,Diana Prince,80,Female,34.0,150,Yes,Non-Smoker,Low
P005,Eva Green,55,Female,28.2,120,No,Smoker,Moderate
P006,Frank Castle,60,Male,33.2,160,Yes,Smoker,Low
P007,Grace Lee,40,Female,24.5,110,No,Non-Smoker,High
P008,Hank Hill,67,Male,31.0,145,Yes,Smoker,Low
P009,Irene Adler,50,Female,27.0,135,No,Non-Smoker,Moderate
P010,Jack Ryan,30,Male,26.7,120,No,Smoker,High
P011,Karen White,85,Female,29.8,155,Yes,Non-Smoker,Low
P012,Louis Black,55,Male,32.5,140,Yes,Smoker,Moderate
P013,Maria Garcia,45,Female,23.5,115,No,Non-Smoker,High
P014,Nathan Drake,60,Male,27.9,130,Yes,Smoker,Low
P015,Olivia Wilde,70,Female,28.8,140,Yes,Non-Smoker,Moderate
P016,Paul Walker,55,Male,29.3,150,Yes,Smoker,Low
P017,Quinn Harper,50,Female,25.0,120,Yes,Non-Smoker,High
P018,Rebecca Black,65,Female,30.2,145,Yes,Smoker,Low
P019,Samuel Green,60,Male,31.4,130,Yes,Non-Smoker,Moderate
P020,Tracy Adams,45,Female,27.1,115,No,Non-Smoker,High
P021,Uma Patel,40,Female,26.7,120,No,Non-Smoker,High
P022,Vincent Martinez,55,Male,32.0,160,Yes,Smoker,Low
P023,Willow Anderson,50,Female,24.5,110,No,Non-Smoker,High
P024,Xander Brooks,60,Male,33.1,155,Yes,Smoker,Low
P025,Yara Lopez,45,Female,23.9,115,No,Non-Smoker,High
P026,Zachary Green,65,Male,30.5,160,Yes,Smoker,Low
P027,Alice Thompson,50,Female,28.0,130,No,Non-Smoker,Moderate
P028,Bob Allen,40,Male,25.5,120,No,Non-Smoker,High
P029,Clara Evans,55,Female,27.3,135,Yes,Smoker,Moderate
P030,David Carter,65,Male,31.2,150,Yes,Non-Smoker,Low
P031,Emma Roberts,48,Female,29.6,145,Yes,Smoker,Moderate
P032,John Smith,53,Male,30.8,140,Yes,Non-Smoker,Low
P033,Kate Johnson,42,Female,26.4,120,No,Non-Smoker,High
P034,Lucas Martinez,57,Male,28.9,135,Yes,Smoker,Moderate
P035,Nina Carter,63,Female,31.2,150,Yes,Non-Smoker,Low
P036,Oscar Davis,49,Male,27.8,130,No,Smoker,Moderate
P037,Pam Brown,52,Female,28.3,140,Yes,Non-Smoker,High
P038,Quincy Lewis,44,Male,25.9,120,No,Non-Smoker,High
P039,Rachel Green,58,Female,29.1,135,Yes,Smoker,Low
P040,Steve Wilson,61,Male,30.4,140,Yes,Non-Smoker,Moderate
P041,Tina White,39,Female,24.2,110,No,Non-Smoker,High
P042,Victor Adams,54,Male,27.0,120,Yes,Smoker,Moderate
P043,Wendy Scott,60,Female,28.7,130,No,Non-Smoker,Low
P044,Xena Lee,49,Female,29.4,140,Yes,Smoker,Moderate
P045,Yvonne Martinez,50,Female,27.1,125,No,Non-Smoker,High
P046,Zane Brown,62,Male,30.3,145,Yes,Smoker,Low
P047,Abigail Green,46,Female,28.2,120,No,Non-Smoker,High
P048,Blake Allen,59,Male,29.6,140,Yes,Smoker,Moderate
P049,Chloe Evans,44,Female,25.0,115,No,Non-Smoker,High
P050,Daniel Carter,65,Male,31.0,150,Yes,Smoker,Low
31 changes: 31 additions & 0 deletions Coding/Data/patients.csv
@@ -0,0 +1,31 @@
Patient ID,Name,Age,Gender,BMI,Diagnosis,Admission Date,Discharge Date,Treatment Cost,Insurance Status,Medications
P001,Alice Smith,45,Female,28.7,Pneumonia,2023-06-01,2023-06-10,15000,Yes,Amoxicillin;Albuterol
P002,Bob Jones,52,Male,31.2,Diabetes,2023-06-05,2023-06-15,20000,No,Metformin;Insulin
P003,Charlie Brown,35,Male,27.5,Hypertension,2023-07-01,2023-07-10,12000,Yes,Lisinopril
P004,Diana Prince,60,Female,29.8,COPD,2023-07-10,2023-07-20,25000,Yes,Tiotropium;Prednisone
P005,Eva Green,70,Female,32.4,Heart Failure,2023-07-15,2023-07-25,30000,No,Furosemide;Digoxin;Carvedilol
P006,Frank Castle,55,Male,26.5,Arthritis,2023-07-20,2023-07-30,18000,Yes,Ibuprofen;Methotrexate
P007,Grace Lee,40,Female,22.9,Migraine,2023-07-25,2023-07-28,9000,Yes,Sumatriptan;Naproxen
P008,Hank Hill,65,Male,29.1,Lung Cancer,2023-08-01,2023-08-20,45000,No,Cisplatin;Etoposide
P009,Irene Adler,50,Female,27.4,Asthma,2023-08-05,2023-08-15,14000,Yes,Salbutamol;Montelukast
P010,Jack Ryan,30,Male,24.3,Infection,2023-08-10,2023-08-18,11000,No,Ciprofloxacin;Ibuprofen
P011,Karen White,85,Female,29.8,Heart Failure,2023-08-15,2023-08-30,32000,Yes,Lisinopril;Digoxin
P012,Louis Black,55,Male,32.5,Diabetes,2023-08-20,2023-08-30,21000,No,Metformin;Insulin
P013,Maria Garcia,45,Female,23.5,Hypertension,2023-08-25,2023-09-05,12000,Yes,Lisinopril
P014,Nathan Drake,60,Male,27.9,Asthma,2023-09-01,2023-09-10,15000,Yes,Salbutamol;Montelukast
P015,Olivia Wilde,70,Female,28.8,COPD,2023-09-05,2023-09-20,27000,No,Tiotropium;Prednisone
P016,Paul Walker,55,Male,29.3,Heart Failure,2023-09-10,2023-09-25,31000,Yes,Furosemide;Carvedilol
P017,Quinn Harper,50,Female,25.0,Arthritis,2023-09-15,2023-09-25,17000,Yes,Ibuprofen;Methotrexate
P018,Rebecca Black,65,Female,30.2,Diabetes,2023-09-20,2023-10-01,22000,No,Metformin;Insulin
P019,Samuel Green,60,Male,31.4,Hypertension,2023-09-25,2023-10-05,16000,Yes,Lisinopril
P020,Tracy Adams,45,Female,27.1,Infection,2023-10-01,2023-10-10,13000,No,Ciprofloxacin;Ibuprofen
P021,Uma Patel,40,Female,26.7,Asthma,2023-10-05,2023-10-15,15000,Yes,Salbutamol;Montelukast
P022,Vincent Martinez,55,Male,32.0,Heart Failure,2023-10-10,2023-10-25,29000,No,Furosemide;Digoxin
P023,Willow Anderson,50,Female,24.5,Hypertension,2023-10-15,2023-10-25,14000,Yes,Lisinopril
P024,Xander Brooks,60,Male,33.1,Diabetes,2023-10-20,2023-10-30,20000,Yes,Metformin;Insulin
P025,Yara Lopez,45,Female,23.9,Infection,2023-10-25,2023-11-05,11000,No,Ciprofloxacin;Ibuprofen
P026,Zachary Green,65,Male,30.5,Heart Failure,2023-11-01,2023-11-15,32000,Yes,Furosemide;Carvedilol
P027,Alice Thompson,50,Female,28.0,Hypertension,2023-11-05,2023-11-15,16000,Yes,Lisinopril
P028,Bob Allen,40,Male,25.5,Asthma,2023-11-10,2023-11-20,14000,No,Salbutamol;Montelukast
P029,Clara Evans,55,Female,27.3,Diabetes,2023-11-15,2023-11-25,18000,Yes,Metformin;Insulin
P030,David Carter,65,Male,31.2,COPD,2023-11-20,2023-12-01,26000,No,Tiotropium;Prednisone
51 changes: 51 additions & 0 deletions Coding/Data/readmissions.csv
@@ -0,0 +1,51 @@
Patient ID,Name,Previous Discharge Date,Admission Date,Diagnosis,BMI,Days Since Last Visit
P001,Alice Smith,2023-05-20,2023-06-10,Pneumonia,28.7,21
P002,Bob Jones,2023-05-25,2023-06-15,Diabetes,31.2,21
P003,Charlie Brown,2023-06-25,2023-07-10,Hypertension,27.5,15
P004,Diana Prince,2023-06-30,2023-07-20,COPD,29.8,20
P005,Eva Green,2023-07-05,2023-07-25,Heart Failure,32.4,20
P006,Frank Castle,2023-06-30,2023-07-30,Arthritis,26.5,30
P007,Grace Lee,2023-07-10,2023-07-28,Migraine,22.9,18
P008,Hank Hill,2023-07-10,2023-08-20,Lung Cancer,29.1,41
P009,Irene Adler,2023-07-15,2023-08-15,Asthma,27.4,31
P010,Jack Ryan,2023-07-30,2023-08-18,Infection,24.3,19
P011,Karen White,2023-06-15,2023-08-15,Heart Failure,29.8,61
P012,Louis Black,2023-06-20,2023-08-20,Diabetes,32.5,61
P013,Maria Garcia,2023-07-10,2023-08-25,Hypertension,23.5,46
P014,Nathan Drake,2023-07-15,2023-09-01,Asthma,27.9,47
P015,Olivia Wilde,2023-08-05,2023-09-05,COPD,28.8,31
P016,Paul Walker,2023-07-20,2023-09-10,Heart Failure,29.3,52
P017,Quinn Harper,2023-08-01,2023-09-15,Arthritis,25.0,45
P018,Rebecca Black,2023-08-10,2023-09-20,Diabetes,30.2,41
P019,Samuel Green,2023-08-15,2023-09-25,Hypertension,31.4,41
P020,Tracy Adams,2023-09-01,2023-10-01,Infection,27.1,30
P021,Uma Patel,2023-09-05,2023-10-05,Asthma,26.7,30
P022,Vincent Martinez,2023-09-10,2023-10-10,Heart Failure,32.0,30
P023,Willow Anderson,2023-09-15,2023-10-15,Hypertension,24.5,30
P024,Xander Brooks,2023-09-20,2023-10-20,Diabetes,33.1,30
P025,Yara Lopez,2023-09-25,2023-10-25,Infection,23.9,30
P026,Zachary Green,2023-10-01,2023-11-01,Heart Failure,30.5,31
P027,Alice Thompson,2023-10-05,2023-11-05,Hypertension,28.0,31
P028,Bob Allen,2023-10-10,2023-11-10,Asthma,25.5,31
P029,Clara Evans,2023-10-15,2023-11-15,Diabetes,27.3,31
P030,David Carter,2023-10-20,2023-11-20,COPD,31.2,31
P031,Emma Roberts,2023-09-05,2023-10-15,Diabetes,29.1,40
P032,John Smith,2023-08-25,2023-09-30,Hypertension,25.4,36
P033,Kate Johnson,2023-09-15,2023-10-25,Asthma,26.8,40
P034,Lucas Martinez,2023-08-10,2023-09-20,Heart Failure,30.2,41
P035,Nina Carter,2023-09-20,2023-10-30,COPD,32.4,40
P036,Oscar Davis,2023-07-15,2023-08-25,Infection,24.5,41
P037,Pam Brown,2023-09-01,2023-10-10,Arthritis,28.7,39
P038,Quincy Lewis,2023-08-25,2023-09-30,Diabetes,27.6,36
P039,Rachel Green,2023-09-15,2023-10-25,Hypertension,26.2,40
P040,Steve Wilson,2023-09-20,2023-10-30,Heart Failure,31.1,40
P041,Tina White,2023-08-10,2023-09-15,Asthma,22.4,36
P042,Victor Adams,2023-09-01,2023-10-10,Infection,23.5,39
P043,Wendy Scott,2023-09-15,2023-10-25,COPD,32.2,40
P044,Xena Lee,2023-08-20,2023-09-30,Diabetes,30.3,41
P045,Yvonne Martinez,2023-08-25,2023-10-05,Hypertension,27.4,41
P046,Zane Brown,2023-09-05,2023-10-15,Heart Failure,29.8,40
P047,Abigail Green,2023-09-10,2023-10-20,Asthma,26.6,40
P048,Blake Allen,2023-08-15,2023-09-25,Diabetes,31.1,41
P049,Chloe Evans,2023-09-01,2023-10-15,Hypertension,27.7,44
P050,Daniel Carter,2023-09-20,2023-10-30,COPD,32.5,40