realstorypro/data-scraper

Crunchbase Scraper

🧐 About

This project saves Crunchbase data without requiring access to the Crunchbase API. All you need is a Crunchbase free trial.

It gathers data about companies, such as the company's website, its Twitter account, and its founder's Twitter account. It can easily be modified to gather other types of data.

🏁 Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Installing

pip install -r requirements.txt

🎈 Usage

The project consists of two scripts: clipboard_fetcher.py and crunchbase_scraper.py.

To build the list of companies, saved in list_of_company_names_raw, run python clipboard_fetcher.py. Then log in to Crunchbase, open an advanced search, and press cmd+a followed by cmd+c to select and copy the page. The program automatically detects the copied content and writes the company names to the list CSV.
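The clipboard detection described above can be sketched roughly as follows. This is a minimal illustration, not the code in clipboard_fetcher.py: the function names watch_clipboard and append_names are hypothetical, the clipboard reader is passed in as a plain callable, and how company names are extracted from the copied page is left to the caller because the README does not describe it.

```python
import csv
import time


def watch_clipboard(read_clipboard, on_change, poll_seconds=0.5, max_polls=None):
    """Poll the clipboard and call on_change(text) whenever its text changes.

    read_clipboard is any zero-argument callable returning the current
    clipboard text (hypothetical hook; a clipboard library could supply it).
    max_polls=None means poll forever.
    """
    last_seen = read_clipboard()  # ignore whatever was copied before startup
    polls = 0
    while max_polls is None or polls < max_polls:
        current = read_clipboard()
        if current and current != last_seen:
            last_seen = current
            on_change(current)
        polls += 1
        time.sleep(poll_seconds)


def append_names(csv_path, names):
    """Append one company name per row to the list CSV."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for name in names:
            writer.writerow([name])
```

Assuming the pyperclip package is installed, pyperclip.paste could serve as read_clipboard, and on_change would parse the copied search page before calling append_names. Passing the reader in as a parameter also keeps the loop easy to test without a real clipboard.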

To scrape data for the collected company names, run python crunchbase_scraper.py. It writes the results to three files:

  • found.csv - the companies that were found. Format: Company Name, Company Website, Company Twitter, CEO Twitter, CTO Twitter
  • not_found.csv - the companies that could not be found by company name. Format: Company Name
  • error.csv - the companies that raised an error while being scraped. Format: Company Name
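The three-file split above could be implemented along these lines. This is a sketch under stated assumptions, not the logic of crunchbase_scraper.py: route_results and the lookup callable are hypothetical, lookup is assumed to return a dict of fields for a found company, None for a company that is not found, and to raise an exception on a scraping error. Whether the real files carry a header row is not stated, so none is written here.

```python
import csv
from pathlib import Path

# Column order taken from the found.csv format described above.
FOUND_FIELDS = [
    "Company Website",
    "Company Twitter",
    "CEO Twitter",
    "CTO Twitter",
]


def route_results(names, lookup, out_dir="."):
    """Look up each company and append it to found, not_found, or error CSV.

    lookup(name) is a hypothetical scraper hook: dict on success,
    None when the company is not found, exception on failure.
    Returns a count of rows written per file.
    """
    out = Path(out_dir)
    counts = {"found": 0, "not_found": 0, "error": 0}
    with open(out / "found.csv", "a", newline="", encoding="utf-8") as f_found, \
         open(out / "not_found.csv", "a", newline="", encoding="utf-8") as f_missing, \
         open(out / "error.csv", "a", newline="", encoding="utf-8") as f_error:
        found = csv.writer(f_found)
        missing = csv.writer(f_missing)
        errors = csv.writer(f_error)
        for name in names:
            try:
                record = lookup(name)
            except Exception:
                # Any scraping failure is recorded by name only.
                errors.writerow([name])
                counts["error"] += 1
                continue
            if record is None:
                missing.writerow([name])
                counts["not_found"] += 1
            else:
                # Missing fields become empty cells.
                row = [name] + [record.get(field, "") for field in FOUND_FIELDS]
                found.writerow(row)
                counts["found"] += 1
    return counts
```

Keeping failed names in error.csv makes it possible to feed that file back in for a retry without re-scraping the companies that already succeeded.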

✍️ Authors

See also the list of contributors who participated in this project.

🎉 Acknowledgements

  • Hat tip to anyone whose code was used
  • Inspiration
  • References

About

We scrape. We transform. We Sell.
