Replies: 10 comments
-
+1 to overhauling that, would be nice to simplify. A few thoughts:
-
@schneidy can you post the sheet you started here so we can collaborate on these issues? As for 4, 100% agree: making that customizable is something that's long overdue. I have a few ideas I'll try to write up on a pupa issue at some point.
-
Here is the sheet that has the initial notes on the levels of difficulty to categorize the actions for each state.
-
In pupa's do_scrape in scrape/base.py we added a handler for custom outputters with their own save_object() function.
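For anyone following along, here is a rough sketch of the general idea. This is not pupa's actual code: `JsonLinesOutputter`, `save_with`, and the shape of the scraped objects are all hypothetical, and the only name taken from the comment above is `save_object()`.

```python
import json


class JsonLinesOutputter:
    """Hypothetical custom outputter: writes each scraped object as one JSON line."""

    def __init__(self, stream):
        self.stream = stream

    def save_object(self, obj):
        # Called once per scraped object instead of the default save path.
        self.stream.write(json.dumps(obj, sort_keys=True) + "\n")


def save_with(outputter, objects, default_save):
    """Route objects to outputter.save_object() when it exists, else fall back.

    `default_save` stands in for whatever the scraper would normally do.
    Returns the number of objects handled.
    """
    handler = getattr(outputter, "save_object", None)
    saved = 0
    for obj in objects:
        if callable(handler):
            handler(obj)
        else:
            default_save(obj)
        saved += 1
    return saved
```

The point of the `getattr` check is that outputters without their own `save_object()` keep the default behavior untouched.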
-
gotcha, so you're bypassing all of pupa.import, right?
-
Yup, skipping that step entirely I believe, and running them all with --scrape
-
Just noting the fact that entity recognition (e.g. 'ADDED H.SMITH AS SPONSOR') is currently part of the categorizers, and we'd want to revisit that too.
-
Yeah, one vote here for splitting that out into separate logic. Spit-balling here, does it make sense to do a multi-pass approach:
Then we could re-use the people and entity resolution code elsewhere for looking up voters and sponsors. I'm imagining some built-ins that are easy to override on a per-jurisdiction level, maybe in the __init__.py or something. Not sure if this makes sense as one function that can take an action or a voter or a sponsor (maybe with an arg to differentiate?) or if the logic is going to be different enough that they should be separate functions.
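To make the "one function with an arg to differentiate" option concrete, here is a minimal sketch. Everything in it is hypothetical (`EntityResolver`, `normalize_name`, the `kind` values, the override hook); it only illustrates shared built-ins plus per-jurisdiction overrides, not any existing pupa code.

```python
import re


def normalize_name(raw):
    # Hypothetical built-in normalizer: lowercase, turn punctuation into
    # spaces, and trim, so "H.SMITH" becomes "h smith".
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()


class EntityResolver:
    """Hypothetical resolver with a single entry point.

    `kind` says what is being resolved ("action", "voter", or "sponsor"),
    and a jurisdiction can register an override function per kind.
    """

    def __init__(self, known_people, overrides=None):
        # known_people: {normalized name: person id}
        self.known_people = known_people
        # overrides: {kind: function(raw) -> person id or None}
        self.overrides = overrides or {}

    def resolve(self, raw, kind="sponsor"):
        # Try the jurisdiction's override first; fall back to the built-in
        # lookup when there is no override or it returns None.
        override = self.overrides.get(kind)
        if override is not None:
            result = override(raw)
            if result is not None:
                return result
        return self.known_people.get(normalize_name(raw))
```

If the voter and sponsor logic turns out to diverge a lot, the same override table would still work with separate functions per kind; the single `resolve()` is just the simplest version to try first.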
-
Side note on my idea of the entity resolvers living in the scrapers: that might make running in scrape mode a real mess, so maybe that should be left in core where it lives now? It would be cool to make that easier to update though, so maybe there's another way.
-
thoughts on moving forward on this a bit (and other related things) in #20
-
As part of the work we're doing, we want to improve action categorization. Part of that will involve reworking how it works so that we can re-categorize actions without a new scrape (important for old sessions). I'd also like to make it easier for people to submit simple fixes to the action categorizers without editing a bunch of code in the scraper itself.
With those goals in mind, here's my proposal:
pupa categorize-actions nc
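As a rough illustration of what a command like that might do under the hood (the command itself is only proposed here, and this is a sketch under assumptions, not an implementation): categorization rules kept as plain data and re-applied to actions that are already stored. `RULES`, `categorize`, `recategorize`, the action dict shape, and the classification strings are all made up for the example.

```python
import re

# Hypothetical rules for one jurisdiction: (regex, classifications) pairs.
# Keeping them as plain data is what would let people submit simple fixes
# without editing scraper code.
RULES = [
    (r"^introduced", ["introduction"]),
    (r"referred to", ["referral-committee"]),
    (r"passed (house|senate)", ["passage"]),
]


def categorize(description, rules=RULES):
    """Return the sorted classifications whose pattern matches the action text."""
    found = set()
    for pattern, classifications in rules:
        if re.search(pattern, description, flags=re.IGNORECASE):
            found.update(classifications)
    return sorted(found)


def recategorize(actions, rules=RULES):
    """Re-apply the rules to already-stored actions; no new scrape required.

    Returns new dicts and leaves the input actions unmodified.
    """
    return [dict(a, classification=categorize(a["description"], rules)) for a in actions]
```

Because `recategorize()` only needs the stored action text, it could be run against old sessions whenever the rules change.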
Other things that'll need to be done:
Open questions: