Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Test Email with non UTF-8 Characters which breaks execution. #4

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

zaghadon
Copy link

In response to #3 (comment) on #3

@zaghadon
Copy link
Author

Hi @dunnkers I find something interesting here, when I run grep -axv '.*' ./* It outputs rightly the eml file with the non UTF-8 character found, and I if I try to run the globally installed package on the folder where it is it'll fail with the error message, but then when I copy it to the test_emails/ folder and run the test, it'll work successfully after which running grep -axv '.*' ./* would also outputs nothing anymore.

Is the eml file getting transformed to utf-8 during test?

Meanwhile, when I paste all the eml files in the test_emails/ folder I get 17 failed tests, and I can't put all of that into the repository.

@dunnkers
Copy link
Owner

Hi @zaghadon ! I just executed the CI pipeline for this branch and it all seems to pass:

https://github.com/dunnkers/eml-to-html/actions/runs/8754777933?pr=4

.. same locally. So that executes the CI on ubuntu, macOS and Windows. Any idea how we can still reproduce the issue? And great you're so actively contributing !

@zaghadon
Copy link
Author

Hello @dunnkers I just created a test_email_3.eml which I stuffed with characters of different encodings and it indeed failed the test on local. However, I checked the CI Pipeline logs and don't think it reached the pytest workflow. Could you try this to see how this is replicated?

@zaghadon
Copy link
Author

zaghadon commented May 2, 2024

Hello @dunnkers I finally checkedout a branch to try to fix the breaking of the workflow at Python Installation, after several tries, I discovered the error was coming from the latest MacOS, I couldn't troubleshoot it, so I removed MacOS in the GitHub Action and The tests passed with the existing tests.

Then I merged it into the latest add-test-email-for-#2 branch. You can check the recent workflow runs to see that execution breaks at Test with message:

FAILED test_eml_to_html.py::test_eml_to_html[test_emails/test_email_3.eml] - UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 159: invalid continuation byte

Review this updates and merge to try it with the initial Pull Request that contains fixes at #3

@zaghadon
Copy link
Author

@dunnkers Hi Jereon, I just wanted to bring this again to your notice, if you have a free time, kindly review.

@dunnkers
Copy link
Owner

@dunnkers Hi Jereon, I just wanted to bring this again to your notice, if you have a free time, kindly review.

Hi @zaghadon your continuous effort is much appreciated. I hadn't had much time to look at this. My main thought here is that when we change the reading encoding to latin-1, that the reading process keeps supporting other encodings as well. Perhaps we can make the code such that we support both? By either detecting which encoding we should use or by just trying utf-8 first and then falling back to latin-1 in case of an error.

How's that sound? Curious to hear your thoughts. In either case- have a great day still ☀️.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants