Merge pull request #338 from Living-with-machines/dev
Mapreader 1.1.1
rwood-97 authored Jan 8, 2024
2 parents 7e90e2b + 8997b4d commit 566e602
Showing 13 changed files with 987 additions and 375 deletions.
9 changes: 5 additions & 4 deletions docs/source/User-guide/Annotate.rst
Expand Up @@ -65,6 +65,7 @@ Other arguments that you may want to be aware of when initializing the ``Annotat
- ``show_context``: Whether to show a context image in the annotation interface (default: ``False``).
- ``surrounding``: How many surrounding patches to show in the context image (default: ``1``).
- ``sortby``: The name of the column to use to sort the patch Dataframe (e.g. "mean_pixel_R" to sort by red pixel intensities).
- ``ascending``: A boolean indicating whether to sort in ascending or descending order (default: ``True``).
- ``delimiter``: The delimiter to use when reading your data files (default: ``","`` for csv).
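As a rough illustration of how ``sortby`` and ``ascending`` interact, the sketch below mimics the sort in plain Python rather than pandas. The ``sort_patches`` helper and the sample rows are hypothetical; only the ``mean_pixel_R`` column name and the ``ValueError`` for an unknown column are taken from this commit.

```python
def sort_patches(rows, sortby=None, ascending=True):
    """Hypothetical stand-in for the DataFrame sort performed by ``Annotator``."""
    if sortby is None:
        # No sort column requested: keep the original order.
        return list(rows)
    if any(sortby not in row for row in rows):
        # Mirrors the error message raised in annotator.py for an unknown column.
        raise ValueError(f"[ERROR] {sortby} is not a column in the DataFrame.")
    return sorted(rows, key=lambda row: row[sortby], reverse=not ascending)


# Made-up patch records, for illustration only.
patches = [
    {"image_id": "patch-a", "mean_pixel_R": 0.42},
    {"image_id": "patch-b", "mean_pixel_R": 0.91},
    {"image_id": "patch-c", "mean_pixel_R": 0.13},
]

reddest_first = sort_patches(patches, sortby="mean_pixel_R", ascending=False)
print([row["image_id"] for row in reddest_first])
```

Passing ``ascending=False`` puts the patches with the highest red intensity first, which may be useful if those are the ones you expect to label most often.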

After setting up the ``Annotator`` instance, you can interactively annotate a sample of your images using:
Expand All @@ -78,7 +79,7 @@ Patch size

By default, your patches will be shown to you as their original size in pixels.
This can make annotating difficult if your patches are very small.
To resize your patches when viewing them in the annotation interface, you can pass the ``resize_to`` keyword argument when initializing the ``Annotator`` instance or when calling the ``annotate()`` method.
To resize your patches when viewing them in the annotation interface, you can pass the ``resize_to`` argument when initializing the ``Annotator`` or when calling the ``annotate()`` method.

e.g. to resize your patches so that their largest edge is 300 pixels:

Expand All @@ -101,14 +102,14 @@ Or, equivalently, :
annotator.annotate(resize_to=300)
.. note:: Passing the ``resize_to`` argument when calling the ``annotate()`` method overrides the ``resize_to`` argument passed when initializing the ``Annotator`` instance.
.. note:: Passing the ``resize_to`` argument when calling the ``annotate()`` method overrides the ``resize_to`` argument passed when initializing the ``Annotator``.
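To make "largest edge is 300 pixels" concrete, here is a small sketch of the arithmetic. It assumes the aspect ratio is preserved and that sizes are rounded to whole pixels; ``resized_dimensions`` is a hypothetical helper, not part of MapReader.

```python
def resized_dimensions(width, height, resize_to):
    """Scale (width, height) so that the longest edge equals ``resize_to``.

    Assumption: the aspect ratio is kept and results are rounded to whole pixels.
    """
    scale = resize_to / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 50x25 pixel patch displayed with resize_to=300.
print(resized_dimensions(50, 25, 300))
```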

Context
~~~~~~~

As well as resizing your patches, you can also set the annotation interface to show a context image using ``show_context=True``.
This creates a panel of patches in the annotation interface, highlighting your patch in the middle of its surrounding immediate images.
As above, you can either pass the ``show_context`` argument when initializing the ``Annotator`` instance or when calling the ``annotate`` method.
As above, you can either pass the ``show_context`` argument when initializing the ``Annotator`` or when calling the ``annotate`` method.

e.g. :

Expand Down Expand Up @@ -192,7 +193,7 @@ e.g. To sort your patches by the mean red pixel intensity in each patch but only
Save your annotations
----------------------

Your annotations are automatically saved as you're making progress through the annotation task as a ``csv`` file (unless you've set the ``auto_save`` keyword argument to ``False`` when you set up the ``Annotator`` instance).
Your annotations are automatically saved as you're making progress through the annotation task as a ``csv`` file (unless you've set ``auto_save=False`` when you set up the ``Annotator`` instance).
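
The name of that ``csv`` file follows from the ``annotator.py`` changes in this commit: spaces in the task name become underscores, a missing username is replaced by 30 random alphanumeric characters, and an MD5 hash identifies the image list. The sketch below restates that logic as a standalone function. ``build_annotations_filename`` is a hypothetical name, and ``image_list`` is assumed to be a string, as the ``.encode("utf-8")`` call in the source implies.

```python
import hashlib
import random
import string


def build_annotations_filename(image_list, username=None, task_name=None):
    """Hypothetical standalone version of the file-name logic in annotator.py."""
    if not username:
        # No username given: fall back to 30 random letters and digits.
        alphabet = string.ascii_letters + string.digits
        username = "".join(random.choice(alphabet) for _ in range(30))
    if not task_name:
        task_name = "task"
    # The hash ties the file to the exact list of images being annotated.
    list_id = hashlib.md5(image_list.encode("utf-8")).hexdigest()
    return task_name.replace(" ", "_") + f"_#{username}#-{list_id}.csv"


# The image list, username and task name here are made up for illustration.
print(build_annotations_filename("patch-a.png,patch-b.png", username="rw", task_name="rail space"))
```

Because the check is ``if not username``, an empty string is treated the same as ``None``, so fixing the username is the simplest way to get a stable file name across sessions.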

If you need to know the name of the annotations file, you may refer to a property on your ``Annotator`` instance:

Expand Down
36 changes: 20 additions & 16 deletions docs/source/User-guide/Load.rst
Expand Up @@ -80,11 +80,13 @@ For example, if you have downloaded your maps using the default settings of our
Other arguments you may want to specify when adding metadata to your images include:

- ``index_col`` - By default, this is set to ``0`` so the first column of your csv/excel spreadsheet will be used as the index column when creating a pandas dataframe. If you would like to use a different column you can specify ``index_col``.
- ``columns`` - By default, the ``.add_metadata()`` method will add all the columns in your metadata to your ``MapImages`` object. If you would like to add only specific columns, you can pass a list of these as the ``columns`` argument (e.g. ``columns=["name", "coordinates", "region"]``).
- ``columns`` - By default, the ``add_metadata()`` method will add all the columns in your metadata to your ``MapImages`` object. If you would like to add only specific columns, you can pass a list of these as the ``columns`` argument (e.g. ``columns=["name", "coordinates", "region"]``).
- ``ignore_mismatch`` - By default, this is set to ``False`` so that an error is given if the images in your ``MapImages`` object are mismatched to your metadata. Setting ``ignore_mismatch`` to ``True`` (by specifying ``ignore_mismatch=True``) will allow you to bypass this error and add mismatched metadata. Only metadata corresponding to images in your ``MapImages`` object will be added.
- ``delimiter`` - By default, this is set to ``|``. If your csv file is delimited using a different delimiter you should specify the delimiter argument.


.. note:: In MapReader versions < 1.0.7, coordinates were miscalculated. To correct this, use the ``add_coords_from_grid_bb()`` method to calculate new, correct coordinates.

Patchify
----------

Expand Down Expand Up @@ -184,7 +186,7 @@ As above, you can use the ``path_save`` argument to change where these patches a
Other arguments you may want to specify when patchifying your images include:

- ``square_cuts`` - By default, this is set to ``False``. Thus, if your ``patch_size`` is not a factor of your image size (e.g. if you are trying to slice a 100x100 pixel image into 8x8 pixel patches), you will end up with some rectangular patches at the edges of your image. If you set ``square_cuts=True``, then all your patches will be square, however there will be some overlap between edge patches. Using ``square_cuts=True`` is useful if you need square images for model training, and don't want to warp your rectangular images by resizing them at a later stage.
- ``add_to_parent`` - By default, this is set to ``True`` so that each time you run ``.patchify_all()`` your patches are added to your ``MapImages`` object. Setting it to ``False`` (by specifying ``add_to_parent=False``) will mean your patches are created, but not added to your ``MapImages`` object. This can be useful for testing out different patch sizes.
- ``add_to_parent`` - By default, this is set to ``True`` so that each time you run ``patchify_all()`` your patches are added to your ``MapImages`` object. Setting it to ``False`` (by specifying ``add_to_parent=False``) will mean your patches are created, but not added to your ``MapImages`` object. This can be useful for testing out different patch sizes.
- ``rewrite`` - By default, this is set to ``False`` so that if your patches already exist they are not overwritten. Setting it to ``True`` (by specifying ``rewrite=True``) will mean already existing patches are recreated and overwritten.
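
The effect of ``square_cuts`` on the 100x100 pixel, 8x8 patch example can be sketched as follows. This is an assumption about how such a grid could be computed, not MapReader's implementation: ``patch_bounds`` is a hypothetical helper that, when ``square_cuts=True``, shifts each edge patch back inside the image so that it keeps the full patch size.

```python
def patch_bounds(width, height, patch_size, square_cuts=False):
    """Return (min_x, min_y, max_x, max_y) pixel bounds for every patch."""
    bounds = []
    for y in range(0, height, patch_size):
        for x in range(0, width, patch_size):
            max_x = min(x + patch_size, width)
            max_y = min(y + patch_size, height)
            if square_cuts:
                # Shift edge patches back so they stay square (they overlap neighbours).
                min_x = max(max_x - patch_size, 0)
                min_y = max(max_y - patch_size, 0)
            else:
                # Edge patches are simply clipped, so they may be rectangular.
                min_x, min_y = x, y
            bounds.append((min_x, min_y, max_x, max_y))
    return bounds


clipped = patch_bounds(100, 100, 8)
squared = patch_bounds(100, 100, 8, square_cuts=True)
print(clipped[12])  # last patch in the first row without square_cuts
print(squared[12])  # the same patch with square_cuts=True
```

Both calls produce the same number of patches; only the bounds of the edge patches differ.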

If you would like to save your patches as geo-referenced tiffs (i.e. geotiffs), use:
Expand All @@ -193,10 +195,12 @@ If you would like to save your patches as geo-referenced tiffs (i.e. geotiffs),
my_files.save_patches_as_geotiffs()
This will save each patch in your ``MapImages`` object as a ``.geotiff`` file in your patches directory.
This will save each patch in your ``MapImages`` object as a georeferenced ``.tif`` file in your patches directory.

.. note:: MapReader also has a ``save_parents_as_geotiff()`` method for saving parent images as geotiffs.

After running the ``.patchify_all()`` method, you'll see that ``print(my_files)`` shows you have both 'parents' and 'patches'.
To view an iterable list of these, you can use the ``.list_parents()`` and ``.list_patches()`` methods:
After running the ``patchify_all()`` method, you'll see that ``print(my_files)`` shows you have both 'parents' and 'patches'.
To view an iterable list of these, you can use the ``list_parents()`` and ``list_patches()`` methods:

.. code-block:: python
Expand Down Expand Up @@ -229,7 +233,7 @@ or
.. note:: These parent and patch dataframes **will not** automatically update so you will want to run this command again if you add new information into your ``MapImages`` object.

At any point, you can also save these dataframes by passing the ``save`` argument to the ``.convert_images()`` method:
At any point, you can also save these dataframes by passing the ``save`` argument to the ``convert_images()`` method:

.. code-block:: python
Expand Down Expand Up @@ -280,7 +284,7 @@ If, however, you want to see a random sample of your patches use the ``tree_leve


It can also be helpful to see your patches in the context of their parent image.
To do this use the ``.show()`` method.
To do this use the ``show()`` method.

e.g. :

Expand Down Expand Up @@ -312,7 +316,7 @@ This will show you your chosen patches, by default highlighted with red borders,
.. admonition:: Advanced usage
:class: dropdown

Further usage of the ``.show()`` method is detailed in :ref:`Further_analysis`.
Further usage of the ``show()`` method is detailed in :ref:`Further_analysis`.
Please head there for guidance on advanced usage.

You may also want to see all the patches created from one of your parent images.
Expand All @@ -330,7 +334,7 @@ This can be done using:
.. admonition:: Advanced usage
:class: dropdown

Further usage of the ``.show_parent()`` method is detailed in :ref:`Further_analysis`.
Further usage of the ``show_parent()`` method is detailed in :ref:`Further_analysis`.
Please head there for guidance on advanced usage.

.. todo:: Move 'Further analysis/visualization' to a different page (e.g. as an appendix)
Expand All @@ -341,13 +345,13 @@ Further analysis/visualization (optional)
-------------------------------------------

If you have loaded geographic coordinates into your ``MapImages`` object, you may want to calculate the central coordinates of your patches.
The ``.add_center_coord()`` method can be used to do this:
The ``add_center_coord()`` method can be used to do this:

.. code-block:: python
my_files.add_center_coord()
You can then rerun the ``.convert_images()`` method to see your results.
You can then rerun the ``convert_images()`` method to see your results.

i.e.:

Expand All @@ -358,15 +362,15 @@ i.e.:
You will see that center coordinates of each patch have been added to your patch dataframe.

The ``.calc_pixel_stats()`` method can be used to calculate means and standard deviations of pixel intensities of each of your patches:
The ``calc_pixel_stats()`` method can be used to calculate means and standard deviations of pixel intensities of each of your patches:

.. code-block:: python
my_files.calc_pixel_stats()
After rerunning the ``.convert_images()`` method (as above), you will see that the mean and standard deviation of pixel intensities have been added to your patch dataframe.
After rerunning the ``convert_images()`` method (as above), you will see that the mean and standard deviation of pixel intensities have been added to your patch dataframe.
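
For a sense of what these statistics represent, the sketch below computes a per-band mean and standard deviation for a tiny, made-up patch. It is not MapReader's implementation: ``pixel_stats`` is a hypothetical helper, the ``std_pixel_*`` column names and the 0 to 1 intensity scale are assumptions, and the population standard deviation is used for simplicity.

```python
from statistics import fmean, pstdev


def pixel_stats(pixels):
    """Compute per-band statistics for a list of (R, G, B) tuples."""
    stats = {}
    for index, band in enumerate("RGB"):
        values = [pixel[index] for pixel in pixels]
        stats[f"mean_pixel_{band}"] = fmean(values)
        # Assumption: population standard deviation; the column name is illustrative.
        stats[f"std_pixel_{band}"] = pstdev(values)
    return stats


# A two-pixel "patch" with intensities scaled to the 0-1 range (made-up data).
example = pixel_stats([(0.0, 0.5, 1.0), (1.0, 0.5, 1.0)])
print(example["mean_pixel_R"], example["std_pixel_R"])
```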

The ``.show()`` and ``.show_parent()`` methods can be used to plot these values on top of your patches.
The ``show()`` and ``show_parent()`` methods can be used to plot these values on top of your patches.
This is done by specifying the ``column_to_plot`` argument.

e.g. to view "mean_pixel_R" on your patches:
Expand Down Expand Up @@ -394,12 +398,12 @@ e.g. to view "mean_pixel_R" on your patches:
.. image:: ../figures/show_par_RGB_0.5.png
:width: 400px

.. note:: The ``column_to_plot`` argument can also be used with the ``.show()`` method.
.. note:: The ``column_to_plot`` argument can also be used with the ``show()`` method.

.. admonition:: Advanced usage
:class: dropdown

Other arguments you may want to specify when showing your images (for both the ``.show()`` and ``.show_parent()`` methods):
Other arguments you may want to specify when showing your images (for both the ``show()`` and ``show_parent()`` methods):

- ``plot_parent`` - By default, this is set to ``True`` so that the parent image is shown. If you would like to remove the parent image, e.g. if you are plotting column values, you can set ``plot_parent=False``. This should speed up the code for plotting.
- ``patch_border`` - By default, this is set to ``True`` so that borders are plotted around each patch. Setting ``patch_border`` to ``False`` (by specifying ``patch_border=False``) will stop patch borders being shown.
Expand Down
94 changes: 45 additions & 49 deletions mapreader/annotate/annotator.py
Expand Up @@ -22,8 +22,6 @@

warnings.filterwarnings("ignore", category=UserWarning)

MAX_SIZE = 1000

_CENTER_LAYOUT = widgets.Layout(
display="flex", flex_flow="column", align_items="center"
)
Expand Down Expand Up @@ -64,8 +62,23 @@ class Annotator(pd.DataFrame):
sortby : str or None, optional
Name of the column to use to sort the patch DataFrame, by default None.
Default sort order is ``ascending=True``. Pass ``ascending=False`` keyword argument to sort in descending order.
**kwargs
Additional keyword arguments
ascending : bool, optional
Whether to sort the DataFrame in ascending order when using the ``sortby`` argument, by default True.
username : str or None, optional
Username to use when saving annotations file, by default None.
If not provided, a random string is generated.
task_name : str or None, optional
Name of the annotation task, by default None.
min_values : dict, optional
A dictionary consisting of column names (keys) and minimum values as floating point values (values), by default None.
max_values : dict, optional
A dictionary consisting of column names (keys) and maximum values as floating point values (values), by default None.
surrounding : int, optional
The number of surrounding images to show for context, by default 1.
max_size : int, optional
The size in pixels to which the longest side of each patch image is constrained, by default 1000.
resize_to : int or None, optional
The size in pixels to which the longest side of each patch image is resized, by default None.
Raises
------
Expand All @@ -79,21 +92,6 @@ class Annotator(pd.DataFrame):
If labels provided are not in the form of a list
SyntaxError
If labels provided are not in the form of a list
Notes
-----
Additional kwargs:
- ``username``: Username to use when saving annotations file. Default: Randomly generated string.
- ``task_name``: Name of the annotation task. Default: "task".
- ``min_values``: A dictionary consisting of column names (keys) and minimum values as floating point values (values). Default: {}.
- ``max_values``: A dictionary consisting of column names (keys) and maximum values as floating point values (values). Default: {}.
- ``buttons_per_row``: Number of buttons to display per row. Default: None.
- ``ascending``: Whether to sort the DataFrame in ascending order. Default: True.
- ``surrounding``: The number of surrounding images to show for context. Default: 1.
- ``max_size``: The size in pixels for the longest side to which constrain each patch image. Default: 1000.
- ``resize_to``: The size in pixels for the longest side to which resize each patch image. Default: None.
"""

def __init__(
Expand All @@ -111,7 +109,14 @@ def __init__(
auto_save: bool = True,
delimiter: str = ",",
sortby: str | None = None,
**kwargs,
ascending: bool = True,
username: str | None = None,
task_name: str | None = None,
min_values: dict | None = None,
max_values: dict | None = None,
surrounding: int = 1,
max_size: int = 1000,
resize_to: int | None = None,
):
if labels is None:
labels = []
Expand Down Expand Up @@ -174,10 +179,6 @@ def __init__(
# Check for url column and add to patch dataframe
if "url" in parent_df.columns:
patch_df = patch_df.join(parent_df["url"], on="parent_id")
else:
raise ValueError(
"[ERROR] Metadata (parent data) should contain a 'url' column."
)

# Add label column if not present
if label_col not in patch_df.columns:
Expand All @@ -195,13 +196,12 @@ def __init__(
)

# Set up annotations file
username = kwargs.get(
"username",
"".join(
if not username:
username = "".join(
[random.choice(string.ascii_letters + string.digits) for n in range(30)]
),
)
task_name = kwargs.get("task_name", "task")
)
if not task_name:
task_name = "task"
id = hashlib.md5(image_list.encode("utf-8")).hexdigest()

annotations_file = task_name.replace(" ", "_") + f"_#{username}#-{id}.csv"
Expand Down Expand Up @@ -269,9 +269,7 @@ def __init__(
# Sort by sortby column if provided
if isinstance(sortby, str):
if sortby in self.columns:
self.sort_values(
sortby, ascending=kwargs.get("ascending", True), inplace=True
)
self.sort_values(sortby, ascending=ascending, inplace=True)
else:
raise ValueError(f"[ERROR] {sortby} is not a column in the DataFrame.")
elif sortby is not None:
Expand All @@ -287,35 +285,33 @@ def __init__(
self.task_name = task_name

# set up for the annotator
self.buttons_per_row = kwargs.get("buttons_per_row", None)
self._min_values = kwargs.get("min_values", {})
self._max_values = kwargs.get("max_values", {}) # pixel_bounds = x0, y0, x1, y1
self._min_values = min_values or {}
self._max_values = max_values or {}

self.patch_width, self.patch_height = self.get_patch_size()

# Create annotations_dir
Path(annotations_dir).mkdir(parents=True, exist_ok=True)

# Set up standards for context display
self.surrounding = kwargs.get("surrounding", 1)
self.max_size = kwargs.get("max_size", MAX_SIZE)
self.resize_to = kwargs.get("resize_to", None)
self.surrounding = surrounding
self.max_size = max_size
self.resize_to = resize_to

# set up buttons
self._buttons = []

# Set max buttons
if not self.buttons_per_row:
if (len(self._labels) % 2) == 0:
if len(self._labels) > 4:
self.buttons_per_row = 4
else:
self.buttons_per_row = 2
if (len(self._labels) % 2) == 0:
if len(self._labels) > 4:
self.buttons_per_row = 4
else:
if len(self._labels) == 3:
self.buttons_per_row = 3
else:
self.buttons_per_row = 5
self.buttons_per_row = 2
else:
if len(self._labels) == 3:
self.buttons_per_row = 3
else:
self.buttons_per_row = 5

# Set indices
self.current_index = -1
Expand Down
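
With the ``buttons_per_row`` keyword argument removed in the hunk above, the value is always derived from the number of labels. Restated as a standalone function (the name ``default_buttons_per_row`` is hypothetical; the branching is copied from the new code), the rule reads:

```python
def default_buttons_per_row(n_labels):
    """Standalone restatement of the button-layout rule in ``Annotator.__init__``."""
    if n_labels % 2 == 0:
        # Even label counts: rows of 4 for more than four labels, otherwise rows of 2.
        return 4 if n_labels > 4 else 2
    # Odd label counts: exactly three labels fit on one row, anything else uses rows of 5.
    return 3 if n_labels == 3 else 5


for n in range(1, 9):
    print(n, default_buttons_per_row(n))
```

One consequence worth noting is that callers who previously passed ``buttons_per_row`` to the constructor can no longer do so, since ``**kwargs`` is no longer accepted.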
2 changes: 2 additions & 0 deletions mapreader/classify/load_annotations.py
Expand Up @@ -180,6 +180,8 @@ def _load_annotations_csv(
if os.path.isfile(annotations):
print(f'[INFO] Reading "{annotations}"')
annotations = pd.read_csv(annotations, sep=delimiter, index_col=0)
if annotations.index.name in ["name", "image_id"]:
annotations.reset_index(inplace=True, drop=False)
else:
raise ValueError(f'[ERROR] "{annotations}" cannot be found.')

Expand Down