v3.0.0

vmenger released this 20 Dec 10:37
8e0c7fa

3.0.0 (2023-12-20)

Added

  • speed optimizations, ~250%
  • pseudo-annotating eponymous diseases (e.g. Creutzfeldt-Jakob)
  • PatientNameAnnotator, which replaces deduce.pattern
  • a structured way for loading and building lookup structures (lists and tries), including caching
  • pre_match_words for some regexp annotators, which speeds up annotation
  • option to present a user config as dict (using config keyword)
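
The idea behind pre_match_words can be illustrated with a minimal sketch. This is not deduce's implementation: the helper name find_matches, the URL pattern, and the word list are all assumptions made for illustration. The sketch runs a cheap substring check first and skips the regular expression entirely when none of the given words occurs in the text.

```python
import re


def find_matches(text, pattern, pre_match_words=None):
    """Return all regex matches, skipping the regex when no pre-match word occurs.

    Hypothetical helper for illustration only; not part of deduce's API.
    """
    if pre_match_words is not None:
        lowered = text.lower()
        # Cheap substring check: if none of the trigger words is present,
        # the more expensive regular expression is never evaluated.
        if not any(word in lowered for word in pre_match_words):
            return []
    return [m.group(0) for m in pattern.finditer(text)]


# Example pattern (an assumption): a simple "www." style address.
URL = re.compile(r"www\.[a-z0-9-]+(?:\.[a-z0-9-]+)+", re.IGNORECASE)

print(find_matches("zie www.example.nl voor info", URL, ["www"]))
print(find_matches("geen link hier", URL, ["www"]))
```

The pre-match words must be chosen so that every string the pattern can match contains at least one of them (in lowercase); otherwise the shortcut would drop valid matches.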

Changed

  • speedup for TokenPatternAnnotator
  • some internals of ContextPatternAnnotator
  • initials now detected by lookup list, rather than pattern
  • redactor open and close chars from < > to [ ], as the previous chars caused issues in HTML (so deidentified text now shows [PATIENT], [LOCATIE], etc.)
  • names of lookup structures to singular (prefix, rather than prefixes)
  • INSTELLING tag to ZIEKENHUIS and ZORGINSTELLING
  • refactored and simplified annotator loading; specifically, the annotator_type config keyword now accepts references to classes (e.g. deduce.annotator.TokenPatternAnnotator)
  • renamed interfix_with_capital annotator to interfix_with_name
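
Accepting a class reference as annotator_type generally means resolving a dotted path to a class at load time. The sketch below shows one common way to do that with importlib. It is not deduce's actual loader: the helper names resolve_class and build_annotator are hypothetical, the "args" key is an assumed config layout, and a standard-library class stands in for an annotator so the example runs without deduce installed.

```python
import importlib


def resolve_class(reference):
    """Import and return the class named by a dotted path like 'pkg.module.ClassName'."""
    module_name, _, class_name = reference.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {reference!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def build_annotator(annotator_config):
    """Instantiate the class named in 'annotator_type' with the 'args' dict.

    Hypothetical helper; the 'args' key is an assumption for illustration.
    """
    cls = resolve_class(annotator_config["annotator_type"])
    return cls(**annotator_config.get("args", {}))


# Stand-in example: fractions.Fraction plays the role of an annotator class.
example_config = {
    "annotator_type": "fractions.Fraction",
    "args": {"numerator": 1, "denominator": 2},
}
print(build_annotator(example_config))
```

With this approach, a custom annotator only needs to be importable under the dotted path given in the config; no registry of type names is required.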

Deprecated

  • the config_file keyword, now replaced by config which accepts both filenames and dicts
  • old lookup list names, e.g. prefixes now replaced by prefix
  • annotator types 'custom', 'regexp', 'token_pattern', 'dd_token_pattern' and 'annotation_context', all replaced by setting the class directly as annotator_type
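
A keyword that accepts both filenames and dicts can be handled by normalizing the input once, up front. The following is a minimal sketch of that pattern, assuming a JSON config file; the helper name load_user_config is hypothetical and this is not taken from deduce's source.

```python
import json
from pathlib import Path


def load_user_config(config):
    """Return a config dict from either a dict or a path to a JSON file.

    Hypothetical helper for illustration; JSON as the file format is an assumption.
    """
    if isinstance(config, dict):
        # Shallow copy, so later changes do not affect the caller's dict.
        return dict(config)
    if isinstance(config, (str, Path)):
        with open(config, encoding="utf-8") as f:
            return json.load(f)
    raise TypeError(
        f"config must be a dict or a filename, got {type(config).__name__}"
    )


print(load_user_config({"redactor_open_char": "["}))
```

The key "redactor_open_char" in the usage line is only a placeholder to show the dict form, not a documented setting.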

Removed

  • automated coverage reporting on coveralls.io
  • options lowercase_lookup, lowercase_neg_lookup for token patterns
  • everything in deduce.pattern, patient patterns now replaced by PatientNameAnnotator
  • utils.any_in_text

Fixed

  • some small additions/removals for specific lookup lists
  • smaller bugs related to overlapping matches