[Extractor] Provide an escape hatch in case of extraction failure #30

Open
maxpatiiuk opened this issue Sep 8, 2024 · 0 comments
Labels
enhancement New feature or request

While Readability.js generally does a very good job, it sometimes fails to extract content.

Google's DOM Extractor is generally worse, but it is a useful fallback in some of the cases where Readability.js fails.

Options:

  • Add DOM Extractor to text-hoarder as a fallback mechanism for when Readability.js fails to extract content?
  • Keep reporting bugs to Readability.js whenever it fails to extract a particular website?
  • If text extraction fails, but an alternative browser extension works well, make it easier to pull the content from that extension?
    • Some reader extensions spawn an iframe. Is there any way to get inside it?
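
A rough sketch of what the first option could look like, in case it helps the discussion. Everything here is hypothetical: `extractWithFallback`, the `NamedExtractor` shape, and the 200-character minimum are illustrative assumptions, not existing text-hoarder code. The extractors are passed in as plain functions so that the chain does not depend on any particular library's API.

```typescript
// Hypothetical shape of an extraction result; null from an extractor means failure.
interface ExtractedArticle {
  title: string;
  textContent: string;
}

// Any function that tries to pull an article out of raw HTML.
type Extractor = (html: string) => ExtractedArticle | null;

interface NamedExtractor {
  name: string;
  extract: Extractor;
}

interface ExtractionOutcome {
  article: ExtractedArticle | null;
  usedExtractor: string | null; // which extractor succeeded, if any
  failed: string[]; // extractors that were tried and rejected, in order
}

// Assumed threshold: shorter results are treated as a failed extraction.
const MIN_TEXT_LENGTH = 200;

// Try each extractor in order and return the first usable result.
function extractWithFallback(
  html: string,
  extractors: NamedExtractor[],
  minTextLength: number = MIN_TEXT_LENGTH,
): ExtractionOutcome {
  const failed: string[] = [];
  for (const { name, extract } of extractors) {
    let article: ExtractedArticle | null = null;
    try {
      article = extract(html);
    } catch {
      // A throwing extractor counts as a failure; move on to the next one.
      article = null;
    }
    if (article !== null && article.textContent.trim().length >= minTextLength) {
      return { article, usedExtractor: name, failed };
    }
    failed.push(name);
  }
  return { article: null, usedExtractor: null, failed };
}
```

In the extension, the first entry would presumably wrap Readability.js (whose `parse()` returns `null` when it finds no article) and the second would wrap the DOM Extractor. The `failed` list could also feed the second option, since it records which sites are worth reporting upstream.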