Initial build of PrimeGov scraper #24

krammy19 · 2021-04-30T02:26:51Z

We need to design a web scraper for the PrimeGov agenda hosting platform. Examples of cities that use this software include San Mateo and Pleasanton.

Input: a URL endpoint, or a batch .csv file of endpoints, to scrape. Each URL should point to the specific page on a city website where agendas are listed for public review.
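As a rough illustration of the input handling described above, here is a minimal Python sketch. The function name `load_endpoints` is hypothetical, and the batch file layout (one URL per row in the first column, no header row) is an assumption, since the issue does not specify it.

```python
import csv


def load_endpoints(source):
    """Return a list of agenda-page URLs to scrape.

    `source` is either a single URL or the path to a batch .csv file.
    Assumption: the batch file has one URL per row in its first column
    and no header row.
    """
    if not source.lower().endswith(".csv"):
        # Anything that is not a .csv path is treated as a single endpoint.
        return [source]
    with open(source, newline="", encoding="utf-8") as handle:
        # Skip empty rows and rows whose first cell is blank.
        return [row[0].strip() for row in csv.reader(handle) if row and row[0].strip()]
```

Calling `load_endpoints("https://example.com/agendas")` returns a one-item list, so the rest of the scraper can loop over endpoints the same way in both cases.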

Output: a .csv table with the following column data for each agenda listed on this page:

| Column | Description |
| --- | --- |
| Index | Autoincrement index |
| City | City or agency name |
| Meeting Name | Title of government body |
| Date / Time | Date / time of meeting |
| Agenda | URL of agenda pdf |
| Meeting video | URL of meeting video (if available) |
| Published minutes | URL of minutes pdf (if available) |
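To make the output format concrete, the sketch below writes rows with these columns using Python's standard `csv` module. The function name `write_agenda_csv`, starting the index at 1, and leaving unavailable video or minutes URLs as empty cells are all assumptions, not requirements stated in this issue.

```python
import csv
import io

# Column names taken from the table above.
COLUMNS = [
    "Index",
    "City",
    "Meeting Name",
    "Date / Time",
    "Agenda",
    "Meeting video",
    "Published minutes",
]


def write_agenda_csv(rows, fileobj):
    """Write scraped agenda rows to an open text file as CSV.

    `rows` is an iterable of dicts keyed by column name (without "Index").
    Missing optional fields are written as empty cells.
    """
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    # Assumption: the autoincrement index starts at 1.
    for index, row in enumerate(rows, start=1):
        writer.writerow({"Index": index, **row})


# Example usage with placeholder data written to an in-memory buffer.
buffer = io.StringIO()
write_agenda_csv(
    [
        {
            "City": "San Mateo",
            "Meeting Name": "City Council",
            "Date / Time": "2021-05-03T19:00",
            "Agenda": "https://example.com/agenda.pdf",
        }
    ],
    buffer,
)
```

When writing to a real file, open it with `newline=""` so the `csv` module controls line endings.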

For reference and examples, please see this scraping walkthrough for Legistar.

As you build this scraper, keep in mind that we will need to eventually add additional features, including:

Accessing past agendas not included on the city’s main page, e.g. past years.

Filtering agenda scraping by Date range

Filtering agenda scraping by Meeting Name

Downloading all agendas from scraped urls into a specified directory

Scraping staff report urls from scraped agendas.
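The two filtering features listed above could look roughly like the following Python sketch. The function name `filter_agendas` is hypothetical, and it assumes the "Date / Time" value is stored as an ISO 8601 string, which may not match what PrimeGov pages actually provide.

```python
from datetime import datetime


def filter_agendas(rows, start=None, end=None, meeting_name=None):
    """Return the rows that fall inside a date range and match a meeting name.

    Assumptions: each row is a dict with "Date / Time" as an ISO 8601 string
    and "Meeting Name" as plain text. `start` and `end` are datetime objects
    (inclusive); any filter left as None is skipped.
    """
    matches = []
    for row in rows:
        when = datetime.fromisoformat(row["Date / Time"])
        if start is not None and when < start:
            continue
        if end is not None and when > end:
            continue
        # Case-insensitive substring match on the meeting name.
        if meeting_name is not None and meeting_name.lower() not in row["Meeting Name"].lower():
            continue
        matches.append(row)
    return matches
```

Keeping the filters as optional keyword arguments means the initial build can call the scraper without them and add them later without changing the row format.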

skyheat47295 · 2021-04-30T02:43:25Z

@krammy19 I'd like to volunteer to tackle this one.
Thanks, Hamilton Dutcher

krammy19 · 2021-05-05T01:23:12Z

Sure, that would be a huge help! Thanks @skyheat47295

skyheat47295 · 2021-05-11T00:57:01Z

@krammy19 , @shengxio Hello, my first draft is ready for review. I can create a PR, or you can review the code here.
https://github.com/skyheat47295/PrimeGov_scraper

The code works and brings down a .csv file with the required data. However, there is an issue regarding the JavaScript interpretation of the href link.
Shall we set up a time to discuss?

shengxio · 2021-05-14T16:07:01Z

@skyheat47295 , @krammy19 Hello and yes definitely.
I am normally free on Saturday, Monday all day and Thursday before 3pm.
Let me know which day works the best for you guys, and we can shuffle from there.

To be honest, I am still learning what you did in the repo, so we may need 30 minutes to an hour for the meeting, if you don't mind.

Thanks!

Roland Ding

skyheat47295 · 2021-05-15T04:54:12Z

@shengxio @krammy19 I'm available during those times. If you want to just touch base, you can DM me on Slack, and we can schedule something for tomorrow or next week. I can walk you through what I did, and the thoughts I have.

Thanks!

H.

shengxio · 2021-12-01T00:06:15Z

@krammy19 Hey Mark, I wonder if this task has been completed?

Thanks!
R.

krammy19 assigned skyheat47295 and shengxio May 5, 2021
