[klang] file export of dialogues #185

Open
sylvainkahane opened this issue Jan 25, 2023 · 0 comments

In some files, a speaker (for instance the interviewer) has not been transcribed, or only partially. We decided to add such transcriptions between square brackets.

  1. When a file is imported, two contiguous segments may be separated by a "pause" that can contain an untranscribed utterance. In such a case an empty segment must be added, which the corrector can decide to fill or not.
  2. Speaker IDs could be made visible before each segment. Utterances in a segment that do not correspond to the speaker ID must be added between square brackets.
  3. The export function cuts the text into sentences, each attributed to a speaker. The export is currently buggy: speaker attribution is not done, and sometimes a sentence contains a major punctuation mark.
  4. This is probably due to utterances between square brackets that do not end with a major punctuation mark. An added sentence may indeed continue into a second segment and so be spread over two square-bracketed segments. But it is very unlikely that it continues into a third segment, and a warning must be raised in such a case.
  5. It must also be verified that each word receives AlignBegin and AlignEnd features.
  6. We also use "+" as a special character to indicate that a sentence has a syntactic structure that must be attached to a previous sentence. This character must be translated into a feature AtachedSentence=Yes.
  7. Reported speech is between double quotes "…". The second double quote must have a major punctuation mark before and after it. This must be verified (with an error report if it is not the case), and in this special case the sentence must not be cut before the double quote. Example:
he said "wonderful. I understand.".
=> he said "wonderful.
I understand.".
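
To make points 6 and 7 concrete, here is a minimal Python sketch of how the cutting and checking could behave. It is not the actual export code: the function names, the set of major punctuation marks (`.`, `?`, `!`), the idea that "+" appears at the start of a sentence, and the rule "cut only after a major punctuation mark followed by whitespace" are all assumptions made for illustration. The feature name is spelled as written above.

```python
import re

# Assumption: the "major punctuation" marks are ".", "?" and "!".
MAJOR = ".?!"


def is_major(ch):
    """True if ch is exactly one major punctuation character."""
    return len(ch) == 1 and ch in MAJOR


def check_closing_quotes(text):
    """Return error messages for closing double quotes that are not
    surrounded by major punctuation on both sides (point 7)."""
    errors = []
    positions = [i for i, ch in enumerate(text) if ch == '"']
    if len(positions) % 2:
        errors.append("unbalanced double quotes")
    # Assumption: quotes alternate, so every second one is a closing quote.
    for pos in positions[1::2]:
        before = text[pos - 1:pos]
        after = text[pos + 1:pos + 2]
        if not (is_major(before) and is_major(after)):
            errors.append(f"closing quote at offset {pos} lacks major punctuation")
    return errors


def split_sentences(text):
    """Cut after a major punctuation mark followed by whitespace.
    A sequence such as '.".' contains no whitespace, so it is never cut
    before the closing quote."""
    return [p for p in re.split(r"(?<=[.?!])\s+", text.strip()) if p]


def export_sentences(text):
    """Return (sentences, errors); each sentence is a dict with text and feats."""
    errors = check_closing_quotes(text)
    sentences = []
    for raw in split_sentences(text):
        feats = {}
        # Assumption: "+" is the first character of an attached sentence.
        if raw.startswith("+"):
            feats["AtachedSentence"] = "Yes"
            raw = raw[1:].lstrip()
        sentences.append({"text": raw, "feats": feats})
    return sentences, errors
```

On the example above, `export_sentences('he said "wonderful. I understand.".')` gives the two sentences `he said "wonderful.` and `I understand.".` with no error, while `he said "ok" then left.` produces one error report for the closing quote.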