Example 3 - include forms not listed? #5

Open
pdurusau opened this issue Oct 3, 2019 · 2 comments

pdurusau commented Oct 3, 2019

Now that I understand "&c." as "etc.", marking an incomplete listing, do you want to encode other forms of #Bryght(e)# that don't appear in the vocabulary? My reasoning is that we can find occurrences and auto-generate pointers more easily than Tolkien could, and so make the listing complete. We would then, of course, need to distinguish Tolkien's list of forms from the more complete one.

jtauber (Member) commented Oct 3, 2019

I think the right way to do that would be to set up a separate abstract lexicon, lemmatise Sisam's texts with links to that lexicon, and link Tolkien's entries to the same lexicon.

My lemma lattice (originally developed for NT Greek lexicons) is highly suitable for this use case too, where you want to be able to reference both a particular spelling and the lexeme as a whole.

The citations in Tolkien can help bootstrap the lemmatisation, but ultimately they would not be the primary assertion of the lemmatisation.

Of course, there is a lot that can be done with both Sisam's texts and Tolkien's glossary even before any of this is done.
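
As a rough illustration of the separate-lexicon idea (not the lemma lattice itself, whose design is not described in this thread), here is a minimal Python sketch. The class names, the lexeme id `BRIGHT`, the text identifiers, and the spelling `briht` are all hypothetical placeholders, not data taken from Sisam or Tolkien. It shows how occurrences lemmatised against a lexicon could reveal spellings that a glossary entry does not list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Occurrence:
    text: str      # hypothetical text identifier
    line: int      # line number within that text
    spelling: str  # the form as it appears in the text
    lexeme: str    # id of the entry in the abstract lexicon


@dataclass
class Lexicon:
    # lexeme id -> spelling -> list of occurrences
    entries: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(list))
    )

    def add(self, occ: Occurrence) -> None:
        """Record one lemmatised occurrence under its lexeme and spelling."""
        self.entries[occ.lexeme][occ.spelling].append(occ)

    def spellings(self, lexeme: str) -> set:
        """All spellings attested for a lexeme."""
        return set(self.entries[lexeme])

    def unlisted(self, lexeme: str, glossary_forms) -> set:
        """Attested spellings that the glossary entry does not list."""
        return self.spellings(lexeme) - set(glossary_forms)


# Invented sample data, for illustration only.
lexicon = Lexicon()
for occ in [
    Occurrence("text-a", 10, "bryght", "BRIGHT"),
    Occurrence("text-a", 42, "bryghte", "BRIGHT"),
    Occurrence("text-b", 7, "briht", "BRIGHT"),
]:
    lexicon.add(occ)

glossary_forms = {"bryght", "bryghte"}  # forms the glossary entry lists
extra = lexicon.unlisted("BRIGHT", glossary_forms)
print(sorted(extra))
```

Keeping the glossary's listed forms separate from the lexicon's attested forms means the distinction between Tolkien's list and a more complete one falls out of a set difference rather than needing extra markup.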

pdurusau (Author) commented Oct 4, 2019

A separate abstract lexicon works, but I assume you still want to account for the references Tolkien makes to Sisam's texts. Yes? Within <orth>, that would be <oRef>, using the @target attribute to point to an occurrence mentioned by Tolkien. This assumes you want to accept each reference (text number + line number, or just text number) as a single string, without encoding the text number separately from the line number.
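
A minimal sketch of what that could look like when generated programmatically, using Python's standard `xml.etree.ElementTree`. The helper name `make_orth`, the sample citations `"II 10"` and `"V"`, and the pointer scheme (a `#` followed by the citation with whitespace replaced by underscores) are assumptions made for illustration; the thread does not settle how targets should be formed.

```python
import xml.etree.ElementTree as ET


def make_orth(form: str, citations: list) -> ET.Element:
    """Build an <orth> element with one <oRef> child per citation.

    Each citation is kept as a single string (text number plus optional
    line number); the two parts are not encoded separately.
    """
    orth = ET.Element("orth")
    orth.text = form
    for citation in citations:
        oref = ET.SubElement(orth, "oRef")
        # Hypothetical pointer scheme: "#" + citation with spaces -> "_".
        oref.set("target", "#" + "_".join(citation.split()))
    return orth


# Invented citations: one with text + line number, one with text number only.
orth = make_orth("bryght", ["II 10", "V"])
targets = [o.get("target") for o in orth.findall("oRef")]
print(ET.tostring(orth, encoding="unicode"))
```

Because each reference is treated as an opaque string, the same function handles both the text-plus-line and text-only cases without special handling.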
