Issues: adbar/trafilatura

Issues list

Question about error handling design in request functions
#789 opened Feb 26, 2025 by L-cloud updated Feb 26, 2025
markdown conversion removes elements with anchors in a list
#788 opened Feb 20, 2025 by ziodave updated Feb 20, 2025
Fast and full mode yield the same results [bug]
#787 opened Feb 12, 2025 by adbar updated Feb 12, 2025
Trafilatura cannot read gzipped pages? [bug]
#781 opened Feb 2, 2025 by LaundroMat updated Feb 3, 2025
Issues with xpath processing along the "FullText" path template recognition. [bug]
#780 opened Jan 29, 2025 by krstp updated Jan 31, 2025
Deduplication is non-deterministic (and destructive) [question]
#778 opened Jan 24, 2025 by BramVanroy updated Jan 27, 2025
Table tags incorrect in HTML formatted output [bug]
#777 opened Jan 14, 2025 by GICodeWarrior updated Jan 27, 2025
Trafilatura fails to extract structured heading tags (h2, h3)
#774 opened Jan 7, 2025 by LeMoussel updated Jan 7, 2025
Turning on "--keep-dirs" gives no output [bug]
#771 opened Dec 20, 2024 by DesBw updated Dec 27, 2024
Duplicated lines when nested in <article> and <main>, with <br> in front [bug]
#768 opened Dec 14, 2024 by ibestvina updated Dec 23, 2024
Question regarding title extraction [question]
#770 opened Dec 16, 2024 by unsleepy22 updated Dec 18, 2024
Documentation: on precision [documentation]
#766 opened Dec 10, 2024 by DesBw updated Dec 10, 2024
CLI: better control of output file names [enhancement]
#754 opened Nov 30, 2024 by DesBw updated Dec 5, 2024
Backticks produce extra line breaks [bug]
#755 opened Nov 30, 2024 by klvbdmh updated Dec 2, 2024
Support for sidemap parsing from text instead of urls [feedback]
#751 opened Nov 27, 2024 by NiClassic updated Nov 28, 2024
Performance bottleneck in prune_unwanted_nodes causing 200ms per call [question]
#750 opened Nov 23, 2024 by thsunkid updated Nov 25, 2024
Review input type for is_probably_readerable() function [enhancement]
#749 opened Nov 22, 2024 by adbar updated Nov 22, 2024
Documentation about settings could use examples [documentation]
#746 opened Nov 15, 2024 by georgedorn updated Nov 18, 2024
Add document language to metadata [enhancement]
#224 opened Jul 19, 2022 by adbar updated Nov 12, 2024
feat(cli/lib): Add tqdm based progress bar as an option [enhancement]
#663 opened Jul 30, 2024 by chitralverma updated Oct 22, 2024
Review HTML element list and conversion [enhancement]
#720 opened Oct 15, 2024 by adbar updated Oct 15, 2024
2 tasks
Empty Results When Using Spider Function with Category URL [question]
#696 opened Sep 9, 2024 by felipehertzer updated Oct 1, 2024
List of smaller extraction bugs (text & metadata) [good first issue] [up for grabs]
#4 opened Jan 9, 2020 by adbar updated Sep 22, 2024
Docs: add page explaining how to run tests [documentation]
#698 opened Sep 9, 2024 by adbar updated Sep 9, 2024
Downloads: add support to switch between proxies [enhancement]
#697 opened Sep 9, 2024 by adbar updated Sep 9, 2024