Enhancing the decision process text when working with images #1361
Comments
While at it, in analyzer_engine.py (line 222) I modified the line so that the code prints every case on a new line, which is even more readable |
@NuiMrme are you asking specifically for images, or for any text? |
Does this help? #925 (comment) |
Sorry, that wasn't well explained. I'm not reporting a bug but rather a feature I implemented in my version of Presidio that might help others too. See when you work with images or a lot of text while having your Every new case will begin on a new line, and observe that there is now an 'entity_text' which shows the text that was detected (I covered it in red for obvious reasons). Now you don't have to guess which line in the image it was, what position, etc. This is more readable and helps the analysis of the anonymization results. |
One of the reasons we intentionally left out the actual identified text is that it is essentially PII you might not want to log or return. If you have a suggestion on how to allow this, perhaps not as a default setting, we'd be happy to hear it. I totally agree that there are cases, especially with the images module, where returning or logging the actual text is needed. |
Well they are already printed out in the beginning anyway |
Good catch. I guess that for |
Absolutely |
Is your feature request related to a problem? Please describe.
The decision process output prints the entity_type, start_position, end_position and the score. When working with longer sequences of text or with images, printing start = 204 end = 217 doesn't really mean anything, and it is hard to see where that is.
Describe the solution you'd like
Add an entity_text field so that the text in question is also printed, e.g. start = 204 end = 217 entity_text = "Saint Antonio"
I solved this on my version by adding
entity_text: str,
in the recognizer_result.py __init__ function, which then also affected image_analyzer_engine.py, image_recognizer_results.py, spacy_recognizer.py and pattern_recognizer.py,
but the output is much more readable.
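To make the idea concrete, here is a rough, self-contained sketch of what I mean. It does not import Presidio: ResultSketch is a hypothetical stand-in for the recognizer result class, and with_entity_text, format_results and the include_text flag are names I made up for illustration. The flag is off by default so the detected text (which is PII) is only attached when explicitly requested.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in for a recognizer result (not Presidio's class)."""

    entity_type: str
    start: int
    end: int
    score: float
    entity_text: Optional[str] = None  # assumed new, optional field

    def __str__(self) -> str:
        line = (
            f"type: {self.entity_type}, start: {self.start}, "
            f"end: {self.end}, score: {self.score}"
        )
        # Only show the detected text when it was explicitly attached.
        if self.entity_text is not None:
            line += f', entity_text: "{self.entity_text}"'
        return line


def with_entity_text(
    results: List[ResultSketch], text: str, include_text: bool = False
) -> List[ResultSketch]:
    """Return copies of the results, optionally carrying the matched text.

    include_text is a hypothetical opt-in switch; when False the copies
    keep entity_text as None so no PII is logged or returned.
    """
    if not include_text:
        return [replace(r, entity_text=None) for r in results]
    return [replace(r, entity_text=text[r.start:r.end]) for r in results]


def format_results(results: List[ResultSketch]) -> str:
    """Render every result on its own line."""
    return "\n".join(str(r) for r in results)


if __name__ == "__main__":
    # Build a text where the entity sits at offsets 204..217.
    sample = "x" * 204 + "Saint Antonio"
    found = [ResultSketch("LOCATION", 204, 217, 0.85)]
    print(format_results(with_entity_text(found, sample, include_text=True)))
```

In the real code base the same slice, text[start:end], would have to be filled in wherever the results are created, which is why the change touched several recognizers.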