Scrape Meetup

🏴‍☠️ Web scrape Meetup.com now that they have locked down their API. 😤

In August 2019, Meetup.com shut down open access to its API. To gain access to the API, you now need to apply and get approved. And in order to apply, you need to pay for a 💸PRO account💸. This hurt individual groups, like the Coffee Club of Seattle, whose organizers used the API to schedule new events based on historical data.

Without access to the API, one now needs to scrape the website to get event details. Let's get started! 💪🏻

Last 10 Events

Every public Meetup group has a page with their last 10 events.

Example: [https://www.meetup.com/seattle-coffee-club/events/past/]

getEventHistory.js will scrape that page to pull out the eventID from the a.eventCard--link element. My sample code adds the eventID to a MySQL table for processing in the next step.
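To make the extraction step concrete, here is a minimal sketch of pulling eventIDs out of the past-events HTML. It is not the code in getEventHistory.js: the function name `extractEventIds` is made up for illustration, and it assumes each event link is an `<a>` tag with the `eventCard--link` class and an href containing `/events/<digits>`. It also skips the MySQL step entirely.

```javascript
// Hypothetical helper, not the repo's actual code.
// Assumes event links look like:
//   <a class="eventCard--link" href="/seattle-coffee-club/events/265684295/">
function extractEventIds(html) {
  const ids = [];
  // Walk every opening <a ...> tag in the page.
  for (const [tag] of html.matchAll(/<a\b[^>]*>/gi)) {
    // Keep only anchors that carry the eventCard--link class.
    if (!/class="[^"]*\beventCard--link\b[^"]*"/.test(tag)) continue;
    // The eventID is the run of digits right after /events/ in the href.
    const match = tag.match(/href="[^"]*\/events\/(\d+)/);
    if (match && !ids.includes(match[1])) ids.push(match[1]);
  }
  return ids;
}
```

A regex is enough for a sketch like this, but a real HTML parser will cope better if the markup changes.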

Event Detail

With the eventID, you can build a direct link to an event page.

Example: [https://www.meetup.com/seattle-coffee-club/events/265684295/]
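Building that link is simple string assembly. A small sketch, assuming the URL pattern shown in the example link holds for other groups too (the helper name `buildEventUrl` is hypothetical):

```javascript
// Hypothetical helper: the URL pattern is inferred from the example link,
// not taken from any official Meetup documentation.
function buildEventUrl(groupSlug, eventId) {
  const slug = encodeURIComponent(groupSlug);
  const id = encodeURIComponent(String(eventId));
  return `https://www.meetup.com/${slug}/events/${id}/`;
}
```

For example, `buildEventUrl('seattle-coffee-club', 265684295)` gives back the example link.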

On this page, we scrape all the details related to the event and the venue. The only piece of data that was available in the API but is missing from the scrape is the venueID. If you already have a table of prior events saved, you can write a query that matches on the venue's latitude and longitude to recover it. 😎
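The latitude/longitude match can also be done in code once the saved venues are loaded. This is only a sketch: the `{ venueId, lat, lon }` row shape, the `findVenueId` name, and the default tolerance are all assumptions, so adjust them to your own table.

```javascript
// Hypothetical shapes: savedVenues is an array of { venueId, lat, lon }
// rows loaded from your own table of prior events.
// The default tolerance of 0.0001 degrees is an assumed value
// (about 11 metres of latitude); tune it for your data.
function findVenueId(savedVenues, lat, lon, tolerance = 0.0001) {
  const hit = savedVenues.find(
    (v) =>
      Math.abs(v.lat - lat) <= tolerance &&
      Math.abs(v.lon - lon) <= tolerance
  );
  // null means no saved venue sits close enough to the scraped coordinates.
  return hit ? hit.venueId : null;
}
```

Matching with a small tolerance instead of exact equality guards against tiny rounding differences between the saved coordinates and the scraped ones.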

Once the data is pulled, this code will save a JSON file locally. You can process your JSON in whatever way is best for your situation.
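Saving the scraped details takes only a few lines with Node's built-in `fs` module. A sketch under assumptions: the `saveEventJson` name, the `eventID` field on the event object, and naming the file after the eventID are illustrative choices, not necessarily what the repo does.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: naming the file after event.eventID is an assumption.
function saveEventJson(event, outDir = 'json') {
  // Create the output folder if it does not exist yet.
  fs.mkdirSync(outDir, { recursive: true });
  const filePath = path.join(outDir, `${event.eventID}.json`);
  // Pretty-print with two-space indentation so the file is easy to inspect.
  fs.writeFileSync(filePath, JSON.stringify(event, null, 2));
  return filePath;
}
```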

Combine Both Steps

getEventHistoryAndJSON.js combines both steps.

Most Scrapes Work

In the first 20 scrapes, the code pulled full event and venue details for 19. One came back as a partial, with missing data. If I can improve the scraper, I'll update this repo.

The risk with any scraper is that it only works until the website gets redesigned, at which point the code will need to be modified to work with the new layout.

FTP the JSON files to a Web Server

processEventJSON.js takes the event JSON files in the /json/ folder and FTPs them up to a web server. It then moves each uploaded file into a sent folder.
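The upload-then-move flow can be sketched without committing to a particular FTP library. This is not the code in processEventJSON.js: `processEventJson` and `uploadFile` are hypothetical names, and `uploadFile` stands in for whatever async upload function you wrap around your FTP client.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Sketch only: uploadFile is a caller-supplied async function (for example a
// thin wrapper around your FTP client). It is an assumption, not repo code.
async function processEventJson(jsonDir, sentDir, uploadFile) {
  await fs.mkdir(sentDir, { recursive: true });
  // Only pick up .json files; anything else in the folder is ignored.
  const names = (await fs.readdir(jsonDir)).filter((n) => n.endsWith('.json'));
  const sent = [];
  for (const name of names) {
    const source = path.join(jsonDir, name);
    // If the upload throws, the file stays in jsonDir for a later retry.
    await uploadFile(source);
    await fs.rename(source, path.join(sentDir, name));
    sent.push(name);
  }
  return sent;
}
```

Moving a file only after its upload succeeds means a failed run can simply be repeated without re-sending files that already made it.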

More Ideas

I used the Last 10 page for my group, because we have all the legacy data saved for over 1,300 events going back to 2006. If I didn't have that data, I'd look into scraping the monthly pages.

Example: [https://www.meetup.com/seattle-coffee-club/events/calendar/2019-09/]
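If you go the monthly route, the first job is generating the list of calendar pages to visit. A sketch, assuming the `YYYY-MM` URL pattern from the example applies to every month (the `buildCalendarUrls` helper is hypothetical, and I have not tested scraping these pages):

```javascript
// Hypothetical helper: the YYYY-MM URL pattern is inferred from the example.
// Returns one calendar URL per month, from the start month through the
// end month, inclusive.
function buildCalendarUrls(groupSlug, startYear, startMonth, endYear, endMonth) {
  const urls = [];
  let year = startYear;
  let month = startMonth;
  while (year < endYear || (year === endYear && month <= endMonth)) {
    const mm = String(month).padStart(2, '0'); // 9 -> "09"
    urls.push(`https://www.meetup.com/${groupSlug}/events/calendar/${year}-${mm}/`);
    month += 1;
    if (month > 12) {
      month = 1;
      year += 1;
    }
  }
  return urls;
}
```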