Optional pre-parsing of html #32

Open
@dghelm

Feature suggestion: Currently, if you want multiple details about a case, you pass the HTML to each parse method (attorney, counts, etc.), and each one builds its own BS4 soup from scratch.

The soup could instead be computed lazily on the Case instance (so it is never built if it isn't needed) and then passed to each parser.

This would also make it easy to swap in 'lxml' or another parser in place of "html.parser".
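
A rough sketch of the idea, not the library's actual code: the `Case` class below is a hypothetical stand-in, and `bs4_factory`, `make_soup`, `parse_attorneys`, `parse_counts`, and the CSS selectors are all names made up for illustration. It assumes each parser can be changed to accept an already-built soup instead of raw HTML.

```python
from functools import cached_property
from typing import Any, Callable, List, Optional


def bs4_factory(parser: str = "html.parser") -> Callable[[str], Any]:
    """Return a function that builds a BeautifulSoup tree with the given parser."""
    def build(html: str) -> Any:
        # Imported here so bs4 is only loaded if a soup is actually built.
        from bs4 import BeautifulSoup
        return BeautifulSoup(html, parser)
    return build


# Hypothetical parsers: they take a soup rather than raw HTML.
# The selectors are placeholders, not the real page structure.
def parse_attorneys(soup: Any) -> List[str]:
    return [el.get_text(strip=True) for el in soup.select(".attorney")]


def parse_counts(soup: Any) -> List[str]:
    return [el.get_text(strip=True) for el in soup.select(".count")]


class Case:
    """Hypothetical stand-in for the library's Case class."""

    def __init__(self, html: str,
                 make_soup: Optional[Callable[[str], Any]] = None):
        self.html = html
        self._make_soup = make_soup or bs4_factory()

    @cached_property
    def soup(self) -> Any:
        # Built on first access only, then cached on the instance.
        return self._make_soup(self.html)

    @property
    def attorneys(self) -> List[str]:
        return parse_attorneys(self.soup)

    @property
    def counts(self) -> List[str]:
        return parse_counts(self.soup)
```

With something like this, `Case(html).attorneys` followed by `.counts` would parse the HTML once, and `Case(html, make_soup=bs4_factory("lxml"))` would switch parsers (lxml has to be installed separately).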

I used a messy implementation of this on a project with lots of cases. Between reusing the soup and switching to the other parser, I got good speedups when I wanted to fully parse all details from a case.
