Optional pre-parsing of html #32

Open
dghelm opened this issue Oct 11, 2020 · 0 comments
dghelm commented Oct 11, 2020

Feature suggestion: Currently, if you want multiple details about a case, you pass the HTML to each parse method (attorney, counts, etc.), and each one builds its own BS4 soup. The same HTML ends up being parsed once per method.

The soup could instead be computed lazily on the Case instance (so it is never built if nothing needs it) and then passed to each parser.

This would also make it easy to swap in 'lxml' or another parser in place of "html.parser".
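A rough sketch of what I mean, not the library's actual code. The `Case` class here is a stand-in, and the `parser` / `soup_factory` arguments, the `parse_attorneys` / `parse_counts` helpers, and the `td.attorney` / `td.count` selectors are all made up for illustration. The only real API used is `BeautifulSoup(markup, parser_name)` and `find_all`.

```python
from functools import cached_property
from typing import Any, Callable, List


def make_soup(html: str, parser: str = "html.parser") -> Any:
    """Default soup factory. bs4 is imported here so it is only needed when used."""
    from bs4 import BeautifulSoup
    return BeautifulSoup(html, parser)


def parse_attorneys(soup: Any) -> List[str]:
    # Hypothetical selector; the real page structure will differ.
    return [td.get_text(strip=True) for td in soup.find_all("td", class_="attorney")]


def parse_counts(soup: Any) -> List[str]:
    # Hypothetical selector; the real page structure will differ.
    return [td.get_text(strip=True) for td in soup.find_all("td", class_="count")]


class Case:
    """Stand-in for the real Case class (assumed shape, for illustration only)."""

    def __init__(
        self,
        html: str,
        parser: str = "html.parser",
        soup_factory: Callable[[str, str], Any] = make_soup,
    ) -> None:
        self.html = html
        self.parser = parser
        self._soup_factory = soup_factory

    @cached_property
    def soup(self) -> Any:
        # Built on first access only, then cached on the instance.
        return self._soup_factory(self.html, self.parser)

    @property
    def attorneys(self) -> List[str]:
        return parse_attorneys(self.soup)

    @property
    def counts(self) -> List[str]:
        return parse_counts(self.soup)
```

With this shape, `Case(html, parser="lxml")` would parse once with lxml (assuming lxml is installed) and reuse that soup for `attorneys`, `counts`, and anything else. A case whose details are never read never gets parsed at all.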

I used a messy implementation of this on a project with lots of cases. Between reusing the soup and switching to the other parser, I got good speedups when fully parsing all details from a case.
