join the committees table on Ballot_Measure_Election to election_name #351

ChenglimEar · 2024-04-13T05:42:33Z

since committees change names to reflect what they are supporting or opposing for an election

This problem was found under the following circumstance:

There are some differences in the build output that are due to DISTINCT in SQL picking out different rows when there are multiple committees for the same Filer_ID. As an example, take a look at the difference for build/_data/candidates/oakland/2014-11-04/bryan-parker.json. The value for Filer_NamL is different and it comes from this query in the independent_candidate_expenditures view:

SELECT DISTINCT ON ("Filer_ID") "Filer_ID", "Filer_NamL"
FROM committees WHERE "Filer_ID" = '983545';
The following query shows that there are two rows for the same Filer_ID and so the DISTINCT portion of the query picks a different one before and after the csvkit upgrade:

disclosure-backend=# SELECT "Filer_ID", "Filer_NamL"
FROM committees WHERE "Filer_ID" = '983545';
Filer_ID | Filer_NamL
----------+---------------------------------------------------
983545 | OAKPAC, Oakland Metropolitan Chamber of Commerce
983545 | OAK PAC, Oakland Metropolitan Chamber of Commerce
(2 rows)
If we force it to always pick the same one (like always pick the most recent one), it will help reduce the noise so that we can verify that there aren't any other differences that are more serious.

We want to try to do better than pick the most recent one this time around.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

join the committees table on Ballot_Measure_Election to election_name #351

join the committees table on Ballot_Measure_Election to election_name #351

ChenglimEar commented Apr 13, 2024

join the committees table on Ballot_Measure_Election to election_name #351

join the committees table on Ballot_Measure_Election to election_name #351

Comments

ChenglimEar commented Apr 13, 2024