Add a short definition of mentions to the guide #16
base: master
guide/the-basics.md
Outdated
@@ -26,6 +26,10 @@ Each entity type contains a fixed set of possible properties to describe relevan

This structured vocabulary allows entities to be more easily searched, filtered, and cross-referenced with other data sources to find relevant co-occurrences and further enrich your investigation.

## Mentions

When you upload unstructured documents (for example, PDF documents) to Aleph, Aleph tries to extract names, locations, IBAN account numbers, and more from the document contents. While entities contain structured data, mentions are simple text parts Aleph recognized in a document. You can search Aleph for other datasets and documents with matching mentions. For example, if an uploaded document mentions an IBAN, Aleph allows you to search for other datasets and documents that mention the same IBAN.
Would probably say:
When you upload unstructured data (for example, PDF documents), Aleph will try to extract names, locations, IBANs, and more from the document contents and store these as mentions. Mentions are different from entities in that they're stored as text and not FtM entities.
You can search Aleph for other datasets and documents with matching mentions. For example, if an uploaded document mentions an IBAN, Aleph allows you to search for other datasets and documents that mention the same IBAN.
Do we want to mention the limitation of mentions in XREF?
> Would probably say:
> When you upload unstructured data (for example, PDF documents), Aleph will try to extract names, locations, IBANs, and more from the document contents and store these as mentions. Mentions are different from entities in that they're stored as text and not FtM entities.
I have updated the wording according to your suggestion, but have removed the reference to FtM: we do not explain what FtM is in the user-facing documentation, so users would probably be confused by it. Does that look good to you?
> Do we want to mention the limitation of mentions in XREF?
Yes, I think that would be sensible. There are two reasons why I didn’t add that to the documentation:
- The only documentation we currently have is this page which basically only contains a link to slides for a presentation Kirk gave some time ago. We might want to consider turning that into a docs article at some point -- once we have that, I think it would be sensible to add details on XREF, including the limitations regarding mentions.
- To be honest, it’s still a little unclear to me in what cases mentions are considered during XREF. Based on what Eric explained, my understanding was the following:
  - Mentions in the current dataset are matched against entities from other datasets.
  - Mentions in the current dataset are not matched against mentions in other datasets.
  - Entities in the current dataset are not matched against mentions in other datasets.

  However, Jan suggested in a Wiki comment that this might not be 100% true. I wanted to confirm this with Eric.
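To make the three cases easier to check with Eric, here is a rough sketch of my current understanding as a lookup table. It is not taken from Aleph's code: `Kind`, `XREF_RULES`, and `is_compared` are hypothetical names used only for illustration, and the table encodes just the three unconfirmed cases listed in this comment.

```python
from enum import Enum
from typing import Optional


class Kind(Enum):
    """Hypothetical labels for the two kinds of records discussed here."""
    ENTITY = "entity"
    MENTION = "mention"


# Assumption: (kind in the current dataset, kind in the other dataset) -> compared?
# Only the three cases from the comment are listed; none of them is confirmed.
XREF_RULES = {
    (Kind.MENTION, Kind.ENTITY): True,
    (Kind.MENTION, Kind.MENTION): False,
    (Kind.ENTITY, Kind.MENTION): False,
}


def is_compared(current: Kind, other: Kind) -> Optional[bool]:
    """Return True or False for a listed case, or None if the case is not covered."""
    return XREF_RULES.get((current, other))


if __name__ == "__main__":
    for (current, other), compared in XREF_RULES.items():
        print(f"{current.value} -> {other.value}: {compared}")
```

Any pair that is not in the table returns `None` rather than a guess, so cases the comment does not cover stay visibly open.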
Mentions, and especially the difference between mentions and entities, seem to be hard to grasp for many Aleph users. While this doesn’t go into the details (e.g. regarding XREF), I thought it might be a good idea to briefly explain this concept in the user guide.
Closes #15