Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Annotator class #173

Merged
merged 74 commits into from
Dec 14, 2023
Merged

New Annotator class #173

merged 74 commits into from
Dec 14, 2023

Conversation

kallewesterling
Copy link
Contributor

@kallewesterling kallewesterling commented May 4, 2023

Attempting to create a new Annotator class for MapReader, which will provide a less buggy experience for the user who is trying to annotate their patches.

Feature requests:

  • ensure joining on existing annotations works
  • ensure not re-annotating
  • showing context
  • Add in sortby keyword, for sorting by a different column (existed in prior version)
  • Add a min_mean_pixel and max_mean_pixel for filtering data shown to you (existed in prior version)
  • display URL to NLS map
  • keep patch filepaths in annotations csv output
  • keep label names (either instead of or in addition to label indices) in annotations csv output.
  • margin
  • "next random patch"
  • keyboard shortcuts
  • batch sizing
  • restrict to bounding box
  • consider load and dump method/s (for, for instance LabelStudio input/output etc.)
  • Make it obvious that 'next' and 'previous' are next random patch not like moving left/right on context image

@rwood-97
Copy link
Collaborator

rwood-97 commented Dec 5, 2023

@thobson88 this is more-or-less ready now, I need to add tests but otherwise should be good. Do you want to check the notebook works on your machine (in notebook/lab/vscode)?

I think things we make follow up tickets for are:

  • Adding keyboard shortcuts
  • Restricting annotations to patches within a bounding box (relies on patches having coords)
  • load and dump method/s (for, for instance LabelStudio input/output etc.)

@thobson88
Copy link
Contributor

Good news: all tests are passing for me.

Bad news: when I run the annotation notebook, in the Patchify! step, I get an IndexError: list index out of range at this call:

my_files.add_metadata(metadata="./download/maps/metadata.csv")

The full trace is:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[5], line 3
      1 # First load all the files
      2 my_files = loader("./download/maps/*.png")
----> 3 my_files.add_metadata(metadata="./download/maps/metadata.csv")

File ~/Documents/development/github/Living-with-machines/MapReader/mapreader/load/images.py:357, in MapImages.add_metadata(self, metadata, index_col, delimiter, columns, tree_level, ignore_mismatch)
    353             metadata_df = pd.read_csv(
    354                 metadata, usecols=columns, delimiter=delimiter
    355             )
    356         else:
--> 357             metadata_df = pd.read_csv(
    358                 metadata, index_col=index_col, delimiter=delimiter
    359             )
    360             columns = list(metadata_df.columns)
    362 else:

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/util/_decorators.py:211, in deprecate_kwarg.<locals>._deprecate_kwarg.<locals>.wrapper(*args, **kwargs)
    209     else:
    210         kwargs[new_arg_name] = new_arg_value
--> 211 return func(*args, **kwargs)

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/util/_decorators.py:331, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
    325 if len(args) > num_allow_args:
    326     warnings.warn(
    327         msg.format(arguments=_format_argument_list(allow_args)),
    328         FutureWarning,
    329         stacklevel=find_stack_level(),
    330     )
--> 331 return func(*args, **kwargs)

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/io/parsers/readers.py:950, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    935 kwds_defaults = _refine_defaults_read(
    936     dialect,
    937     delimiter,
   (...)
    946     defaults={"delimiter": ","},
    947 )
    948 kwds.update(kwds_defaults)
--> 950 return _read(filepath_or_buffer, kwds)

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/io/parsers/readers.py:611, in _read(filepath_or_buffer, kwds)
    608     return parser
    610 with parser:
--> 611     return parser.read(nrows)

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1778, in TextFileReader.read(self, nrows)
   1771 nrows = validate_integer("nrows", nrows)
   1772 try:
   1773     # error: "ParserBase" has no attribute "read"
   1774     (
   1775         index,
   1776         columns,
   1777         col_dict,
-> 1778     ) = self._engine.read(  # type: ignore[attr-defined]
   1779         nrows
   1780     )
   1781 except Exception:
   1782     self.close()

File ~/.virtualenvs/mapreader-annotation/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py:276, in CParserWrapper.read(self, nrows)
    274     values = data.pop(i)
    275 else:
--> 276     values = data.pop(self.index_col[i])
    278 values = self._maybe_parse_dates(values, i, try_parse_dates=True)
    279 arrays.append(values)

IndexError: list index out of range

@rwood-97
Copy link
Collaborator

rwood-97 commented Dec 7, 2023

Hmm, can you check your delimiter?
You might have tab or pipe vs comma and so it isn't reading the file properly. Fix would be to pass delimiter="\t" as a kwarg.
I'll add a test for this and try add some better error handling.

EDIT: I don't think the delimiter is the problem but I still added tests/docs

@rwood-97 rwood-97 requested a review from kmcdono2 December 7, 2023 11:44
@thobson88
Copy link
Contributor

Thanks @rwood-97. It was the delimiter. I must have had an old version of the metadata CSV with tabs instead of commas.

All working now. I'd say this is ready to merge, and we just need to capture those additional features in a follow-up ticket.

@rwood-97 rwood-97 marked this pull request as ready for review December 7, 2023 14:31
@rwood-97
Copy link
Collaborator

rwood-97 commented Dec 7, 2023

@kmcdono2 do you want to have a go at running this before we merge or should we just go for it?

@kmcdono2
Copy link
Collaborator

kmcdono2 commented Dec 7, 2023

Testing now

Copy link
Collaborator

@kmcdono2 kmcdono2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few small comments. It works though ;)

Copy link
Contributor

@thobson88 thobson88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Just a couple of suggestions in docs/source/User-guide/Annotate.rst.

@kallewesterling
Copy link
Contributor Author

Just wanted to drop a note here and say thank you to @rwood-97 for cleaning up my messy code and making this something that actually works 🚀

Happy holidays to all!!! ⭐ 🌟

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prevent re-loading of pixel data for patches Build a new annotation interface (it hangs)
5 participants