Commit

Merge pull request #65 from haesleinhuepf/git-bob-mod-l4Qf0x3lXw
Add notebook demonstrating `zip` function for batch processing paired folders.
haesleinhuepf authored Aug 17, 2024
2 parents ecc6b69 + c8b9f76 commit d39837c
Showing 3 changed files with 304 additions and 0 deletions.
301 changes: 301 additions & 0 deletions docs/33_batch_processing/16_zip_folders.ipynb
@@ -0,0 +1,301 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "50401389",
"metadata": {},
"source": [
"## `zip` for Processing Paired Folders\n",
"In this notebook, we will use the Python built-in function `zip` to iterate over paired folders of images and label masks. Specifically, we will process images and their corresponding masks from the following directories:\n",
"* `data/BBBC007/images`\n",
"* `data/BBBC007/masks`\n",
"\n",
"We'll calculate the average intensity of labeled objects and the number of objects in each pair of image and mask files, and store the results in a pandas DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4077ca2f-f34a-4efe-bb4f-4c07fa782b60",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"import pandas as pd\n",
"from skimage import io, measure\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "59d71bed-929e-4c7c-ae21-4352c41d1f28",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Define paths\n",
"image_folder = '../../data/BBBC007/images'\n",
"mask_folder = '../../data/BBBC007/masks'"
]
},
{
"cell_type": "markdown",
"id": "09904b87-503b-470f-be5e-1db462d31951",
"metadata": {},
"source": [
"Before starting, we list the contents of both folders to check that the files are indeed paired. Sorting both listings ensures that matching file names end up at the same position."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "99b7a708-80bc-4e60-bf2f-c7c6fcb22ae2",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"['A9 p5d (cropped 1).tif',\n",
" 'A9 p5d (cropped 2).tif',\n",
" 'A9 p5d (cropped 3).tif',\n",
" 'A9 p5d (cropped 4).tif']"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"image_files = sorted(os.listdir(image_folder))\n",
"image_files"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ca9a595b-4864-461d-b76c-ed2d529facb4",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"['A9 p5d (cropped 1).tif',\n",
" 'A9 p5d (cropped 2).tif',\n",
" 'A9 p5d (cropped 3).tif',\n",
" 'A9 p5d (cropped 4).tif']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mask_files = sorted(os.listdir(mask_folder))\n",
"mask_files"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "3fb862a7-29f4-420c-9a78-bcc9648ae744",
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(columns=['Image', 'Average Intensity', 'Number of Objects'])"
]
},
{
"cell_type": "markdown",
"id": "861bf937-9c8b-45c8-9ebc-9d7b991b3b5f",
"metadata": {},
"source": [
"To demonstrate how `zip()` allows us to iterate over image and mask files in parallel, we print the paired file paths in a short for-loop:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ca5635ff-51a8-4d67-ab52-d64b60f89608",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"../../data/BBBC007/images\\A9 p5d (cropped 1).tif ../../data/BBBC007/masks\\A9 p5d (cropped 1).tif \n",
"\n",
"\n",
"../../data/BBBC007/images\\A9 p5d (cropped 2).tif ../../data/BBBC007/masks\\A9 p5d (cropped 2).tif \n",
"\n",
"\n",
"../../data/BBBC007/images\\A9 p5d (cropped 3).tif ../../data/BBBC007/masks\\A9 p5d (cropped 3).tif \n",
"\n",
"\n",
"../../data/BBBC007/images\\A9 p5d (cropped 4).tif ../../data/BBBC007/masks\\A9 p5d (cropped 4).tif \n",
"\n",
"\n"
]
}
],
"source": [
"for image_file, mask_file in zip(image_files, mask_files):\n",
"    image_path = os.path.join(image_folder, image_file)\n",
"    mask_path = os.path.join(mask_folder, mask_file)\n",
"    \n",
"    print(image_path, mask_path, \"\\n\\n\")"
]
},
{
"cell_type": "markdown",
"id": "443218ba-f193-4ad1-8e41-737bec3974eb",
"metadata": {},
"source": [
"The same code can be used to go through both folders in parallel and analyse intensity images paired with given label images."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3a07e9a4-f89f-4afa-bd9f-04e38b4a1576",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Image</th>\n",
" <th>Average Intensity</th>\n",
" <th>Number of Objects</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>A9 p5d (cropped 1).tif</td>\n",
" <td>26.269523</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>A9 p5d (cropped 2).tif</td>\n",
" <td>16.698528</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>A9 p5d (cropped 3).tif</td>\n",
" <td>34.847166</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>A9 p5d (cropped 4).tif</td>\n",
" <td>28.707185</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Image Average Intensity Number of Objects\n",
"0 A9 p5d (cropped 1).tif 26.269523 2\n",
"1 A9 p5d (cropped 2).tif 16.698528 2\n",
"2 A9 p5d (cropped 3).tif 34.847166 2\n",
"3 A9 p5d (cropped 4).tif 28.707185 2"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for image_file, mask_file in zip(image_files, mask_files):\n",
"    image_path = os.path.join(image_folder, image_file)\n",
"    mask_path = os.path.join(mask_folder, mask_file)\n",
"    \n",
"    # Read the image and its mask\n",
"    image = io.imread(image_path)\n",
"    mask = io.imread(mask_path).astype(np.uint32)\n",
"\n",
"    # Measure labeled regions\n",
"    labeled_regions = measure.regionprops(mask, intensity_image=image)\n",
"\n",
"    # Calculate number of objects and average intensity (NaN if the mask contains no objects)\n",
"    num_objects = len(labeled_regions)\n",
"    avg_intensity = sum(region.mean_intensity for region in labeled_regions) / num_objects if num_objects else np.nan\n",
"\n",
"    # Append results for the current pair\n",
"    df.loc[len(df)] = {\n",
"        'Image': image_file,\n",
"        'Average Intensity': avg_intensity,\n",
"        'Number of Objects': num_objects\n",
"    }\n",
"\n",
"# Display the result\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88625c82-99b1-4190-b3ab-cff9539938d6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
2 changes: 2 additions & 0 deletions docs/33_batch_processing/readme.md
@@ -2,3 +2,5 @@

[Batch processing](https://www.investopedia.com/terms/b/batch-processing.asp) comes into play when we want to process multiple images in the same way.
One example of batch processing would be to loop over a folder of images using a for-loop to apply a segmentation-workflow.
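
A minimal sketch of such a folder loop, using only the standard library. The `process_image` function is a hypothetical placeholder for a real segmentation workflow, and the temporary folder with dummy files exists only so that the sketch runs anywhere:

```python
import os
import tempfile


def process_image(path):
    # Hypothetical placeholder for a real segmentation workflow;
    # here it only returns the file size in bytes.
    return os.path.getsize(path)


# Build a throwaway folder with three dummy files (assumption for illustration).
with tempfile.TemporaryDirectory() as folder:
    for name in ["b.tif", "a.tif", "c.tif"]:
        with open(os.path.join(folder, name), "wb") as f:
            f.write(b"\x00" * 10)

    results = {}
    # Sort the listing, because os.listdir returns names in arbitrary order.
    for filename in sorted(os.listdir(folder)):
        if not filename.endswith(".tif"):
            continue  # skip anything that is not an image file
        results[filename] = process_image(os.path.join(folder, filename))

print(results)
```

In practice, `process_image` would load the image and apply the same analysis steps to every file.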

For an example using the `zip` function to process paired images and masks, see [16_zip_folders](16_zip_folders.ipynb).
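
Pairing files with `zip` relies on both folders holding matching files in the same order. The following sketch, with made-up file names, shows that `zip` stops at the shorter list without any warning, and how a simple set comparison can reveal unmatched files beforehand:

```python
# Hypothetical file listings; in the notebook these come from sorted(os.listdir(...)).
image_files = sorted(["cell_2.tif", "cell_1.tif", "cell_3.tif"])
mask_files = sorted(["cell_1.tif", "cell_2.tif"])  # the mask for cell_3 is missing

# zip stops at the shorter list, so "cell_3.tif" is skipped silently.
pairs = list(zip(image_files, mask_files))
print(pairs)

# Files that appear in only one of the two listings.
unmatched = sorted(set(image_files) ^ set(mask_files))
print(unmatched)
```

On Python 3.10 and later, `zip(image_files, mask_files, strict=True)` is an alternative that raises a `ValueError` when the lists differ in length. It does not check that the names match, so the set comparison remains useful.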
1 change: 1 addition & 0 deletions docs/_toc.yml
@@ -308,6 +308,7 @@ parts:
sections:
- file: 33_batch_processing/12_process_folders
- file: 33_batch_processing/14_process_timelapse
- file: 33_batch_processing/16_zip_folders.ipynb
# - file: 18_image_filtering/run_on_all_hyperslices

# - file: 27_cell_classification/cell_classification
0 comments on commit d39837c