Strange IndexError caused by [SMACK] #183

Closed
roedoejet opened this issue Jul 25, 2023 · 2 comments

@roedoejet
Collaborator

I can't provide steps to re-create this because the data is private, but while aligning a specific audio file, an IndexError is thrown when trying to get the word element:

  File "/Users/pinea/.pyenv/versions/3.9.14/lib/python3.9/site-packages/readalongs/align.py", line 1006, in get_word_texts_and_sentences
    word_el = get_word_element(tokenized_xml, word["id"])
  File "/Users/pinea/.pyenv/versions/3.9.14/lib/python3.9/site-packages/readalongs/align.py", line 972, in get_word_element
    return xml.xpath(f'//w[@id="{el_id}"]')[0]
IndexError: list index out of range

A breakpoint showed that the word in question was {'id': '[SMACK]', 'start': 74.19, 'end': 75.66} - removing it fixes the issue, but I presume this is something coming from SoundSwallower. @dhdaines - how should we be handling this?
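
To illustrate the failure mode described above, here is a minimal sketch that reproduces the mismatch using only the standard library. It assumes the tokenized XML contains `<w>` elements for real words only, so an aligner-emitted id like `[SMACK]` has no match and indexing `[0]` on the empty result raises the `IndexError`. The document contents, the ids `w0` and `w1`, the first timestamp pair, and the helper `find_word_element` are hypothetical; `xml.etree.ElementTree` stands in for whatever XML library `align.py` actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical tokenized document: only real words get a <w> element.
TOKENIZED = ET.fromstring('<s><w id="w0">hello</w><w id="w1">world</w></s>')


def find_word_element(xml, el_id):
    """Return the <w> element with the given id, or None if there is none.

    Unlike an unconditional `[0]`, this does not raise IndexError when the
    aligner reports a token that has no counterpart in the XML.
    """
    matches = xml.findall(f'.//w[@id="{el_id}"]')
    return matches[0] if matches else None


# Aligner output: the second entry is the noise token from the report above;
# the first entry is made up for illustration.
aligned = [
    {"id": "w0", "start": 73.5, "end": 74.19},
    {"id": "[SMACK]", "start": 74.19, "end": 75.66},
]

# Collect the ids that have no matching <w> element.
missing = [w["id"] for w in aligned if find_word_element(TOKENIZED, w["id"]) is None]
print(missing)
```

Returning `None` only makes the mismatch visible; the caller would still need to decide whether to skip such entries or report them.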

@dhdaines
Collaborator

Ah, we should be filtering those noise words out, but for whatever reason we currently aren't. No need to see the private data to fix this; I can make a PR tonight.
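
A rough sketch of what filtering noise words out of the alignment could look like, not the actual change made in the fix. It assumes noise tokens can be recognized by an id wrapped in square or angle brackets (such as `[SMACK]`); that convention, the pattern `NOISE_ID`, and the helper `drop_noise_words` are assumptions for illustration only.

```python
import re

# Assumption: noise tokens have ids wrapped in brackets, e.g. "[SMACK]" or "<sil>".
NOISE_ID = re.compile(r"^(\[.*\]|<.*>)$")


def drop_noise_words(words):
    """Return only the aligned entries whose id does not look like a noise token."""
    return [w for w in words if not NOISE_ID.match(w["id"])]


aligned = [
    {"id": "w0", "start": 73.5, "end": 74.19},        # hypothetical real word
    {"id": "[SMACK]", "start": 74.19, "end": 75.66},  # noise token from the report
]
kept = drop_noise_words(aligned)
print([w["id"] for w in kept])
```

Filtering before the word-element lookup means downstream code never sees ids that have no `<w>` element in the XML.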

@roedoejet
Collaborator Author

Fixed by #184
