New classifier: generalized item description #516

Open
cuducos opened this issue Feb 7, 2020 · 0 comments

cuducos commented Feb 7, 2020

What is the problem?

According to CEAP rules (Ato de Mesa 43/2009, Art. 4º, § 3º), congresspeople are not allowed to generalize the description of items in the official receipt. We have seen tons of such cases since the beginning of the project and we even wrote about it.

However, at that time all receipts were scanned images, which made it difficult to parse structured information from them. Thus, in spite of our attempts using OCR and deep learning, we could not make progress in teaching Rosie how to assess whether a given receipt had generalizations or not.

How can this be addressed?

Nowadays most of the receipts come in digital form, and since #501 we can easily select only these new electronic receipts. This is an opportunity to teach Rosie a new trick:

  • If the reimbursement has an electronic receipt
  • If we can parse the receipt into structured data about the description of each item in the receipt
  • If there's only one item and this item matches a generalization dictionary (refeição, despesas com refeição etc.)
  • Then the reimbursement is suspected of disrespecting Ato de Mesa 43/2009, Art. 4º, § 3º
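
As a starting point for the notebook, the rule above could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the input shape (a list of item descriptions already parsed from an electronic receipt) and the accent-insensitive matching are hypothetical and not part of Rosie, and the dictionary holds only the two examples mentioned in this issue.

```python
import unicodedata

# Hypothetical generalization dictionary. Both entries come from the examples
# in this issue; a real list would need to be built and validated against data.
GENERIC_DESCRIPTIONS = {"refeição", "despesas com refeição"}


def normalize(text):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    without_accents = "".join(
        char for char in decomposed if not unicodedata.combining(char)
    )
    return " ".join(without_accents.split())


NORMALIZED_GENERIC = {normalize(term) for term in GENERIC_DESCRIPTIONS}


def is_generalized(item_descriptions):
    """Return True when the receipt has exactly one item with a generic description.

    `item_descriptions` is assumed to be the list of item descriptions parsed
    from an electronic receipt, or None/empty when parsing was not possible.
    """
    if not item_descriptions or len(item_descriptions) != 1:
        return False
    return normalize(item_descriptions[0]) in NORMALIZED_GENERIC
```

Calling `is_generalized(["Refeição"])` flags the receipt, while a receipt with several items, or one that could not be parsed, is left alone. Exact matching after normalization is a deliberately conservative choice; whether fuzzy or substring matching works better is part of what the notebook would need to validate.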

Who could help with this issue?

Anyone willing to validate that hypothesis using a notebook and, later, to port it to Rosie's pipeline.


UPDATE: Some tweets here and there show actual suspicious cases that overlap with this hypothesis.
