Found your wonderful software, but had a minor issue when loading an Amazon Transcribe transcript that used the variant format for independent audio channels as opposed to the typical speakers format.
Impressively, your software still loaded the rows of the transcript correctly. However, it gave every speaker label a unique number suffix, so it was impossible to relabel the speakers all at once, and tracking and correcting them by hand in a very long transcript is an almost insurmountable task.
It has "channel_labels" (object) -> "channels" (array/list) -> "channel" (object), with each channel containing its own "items" for words, as opposed to "items" being declared once in the other format, and it uses "channel_label" instead of "speaker_label" for speakers.
Could you please accommodate the Amazon Transcribe channel format variant and at least have speaker ID labels be consistent per channel, if not matching the "channel_label"?
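To make the request concrete, here is a minimal sketch of the behaviour being asked for. The sample object is hypothetical and heavily abbreviated (field names follow the description above, values are made up), and `wordsWithChannelSpeaker` is an illustrative helper, not part of the existing adapter. It only shows how every word in a channel could share one speaker label taken from "channel_label".

```javascript
// Hypothetical, abbreviated channel-format transcript (values are made up).
const channelTranscript = {
  results: {
    channel_labels: {
      channels: [
        {
          channel_label: 'ch_0',
          items: [
            { start_time: '0.10', end_time: '0.45', type: 'pronunciation', alternatives: [{ content: 'Hello' }] },
            { type: 'punctuation', alternatives: [{ content: '.' }] },
          ],
        },
        {
          channel_label: 'ch_1',
          items: [
            { start_time: '0.60', end_time: '0.90', type: 'pronunciation', alternatives: [{ content: 'Hi' }] },
          ],
        },
      ],
    },
  },
};

// Illustrative helper: flatten each channel's word items into a single list,
// tagging every word with its channel's label so labels stay consistent per channel.
function wordsWithChannelSpeaker(transcript) {
  return transcript.results.channel_labels.channels
    .flatMap((channel) =>
      channel.items
        // Punctuation items carry no timings, so only timed words are kept here.
        .filter((item) => item.type === 'pronunciation')
        .map((item) => ({
          text: item.alternatives[0].content,
          start: parseFloat(item.start_time),
          end: parseFloat(item.end_time),
          speaker: channel.channel_label,
        }))
    )
    // Interleave the channels back into chronological order.
    .sort((a, b) => a.start - b.start);
}

const words = wordsWithChannelSpeaker(channelTranscript);
```

With labels assigned this way, renaming "ch_0" once would rename that speaker across the whole transcript.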
It doesn't have to be U_UKN, but for STT services that return speaker diarization info, it might sometimes look something like M_1 or F_2 etc. (e.g. when using Speechmatics).
To accommodate that, it would be a matter of modifying the AWS STT Adapter in a way that:
keeps compatibility with the other AWS STT format
is able to distinguish between the two and uses the correct one
uses speaker diarization info if it is available, and otherwise falls back to a default
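The three points above could be sketched roughly as follows. This is only an illustration under assumptions: `detectAwsFormat` and `speakerLabelsFor` are hypothetical names, not functions from the current adapter, and 'U_UKN' is used as the fallback simply because it is the default mentioned in this thread. The "speaker_labels.segments" shape is the one used by the regular speaker identification output.

```javascript
// Fallback label when no diarization info is present (assumed default).
const DEFAULT_SPEAKER = 'U_UKN';

// Hypothetical helper: decide which AWS Transcribe variant a `results` object uses.
function detectAwsFormat(results) {
  if (results.channel_labels && Array.isArray(results.channel_labels.channels)) {
    return 'channels';
  }
  if (results.speaker_labels && Array.isArray(results.speaker_labels.segments)) {
    return 'speakers';
  }
  return 'plain';
}

// Hypothetical helper: list the distinct speaker labels for either variant,
// falling back to the default when neither kind of label is available.
function speakerLabelsFor(results) {
  switch (detectAwsFormat(results)) {
    case 'channels':
      return results.channel_labels.channels.map((channel) => channel.channel_label);
    case 'speakers':
      return [...new Set(results.speaker_labels.segments.map((segment) => segment.speaker_label))];
    default:
      return [DEFAULT_SPEAKER];
  }
}

// Made-up inputs covering the three cases.
const channelResults = {
  channel_labels: { channels: [{ channel_label: 'ch_0', items: [] }, { channel_label: 'ch_1', items: [] }] },
};
const speakerResults = {
  speaker_labels: {
    segments: [{ speaker_label: 'spk_0' }, { speaker_label: 'spk_1' }, { speaker_label: 'spk_0' }],
  },
  items: [],
};
const plainResults = { items: [] };
```

Keeping the detection in one small function means the existing speakers code path can stay untouched, with the channel variant handled as an extra branch.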
I don't want to speak for @jamesdools and @emettely, but I am guessing a PR would be welcome, if you have the time/capacity?
As a side note, at the moment I am mostly working on this alternative version, pietrop/slate-transcript-editor. It doesn't provide any adapters as part of the core components, but I've extracted some of the adapters from this module, e.g. pietrop/aws-to-dpe and pietrop/gcp-to-dpe, for when that type of conversion is needed, e.g. when working with AWS STT or Google STT.
@gittes - hi! Thanks for sending us a request to improve the adapters - we would be ecstatic if you could help us out to add that compatibility, based on the information that @pietrop mentioned above. We would be happy to review it / merge.
The channel identification format is used when each speaker is on a dedicated channel/track in the source audio file:
https://docs.aws.amazon.com/transcribe/latest/dg/how-channel-id.html
Excerpt from referred AWS doc showing the JSON format:
Just for reference, here's the doc for the speaker identification format:
https://docs.aws.amazon.com/transcribe/latest/dg/how-diarization.html