Scrape all the articles and store them in MongoDB

iamsudip/bloomberg

bloomberg hack

Project requirement

Build a crawler that stores the plain text of every article on www.bloomberg.com in MongoDB.
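As a rough illustration of the "plain article text" part of the requirement, here is a minimal sketch that strips markup from a page using only the standard library. The class `PlainTextExtractor`, the function `article_document`, and the `{"url", "text"}` document shape are assumptions made for this example, not names taken from bloomhack.py. The resulting dict is the kind of record that could be handed to a MongoDB driver.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        stripped = data.strip()
        if self._skip_depth == 0 and stripped:
            self._chunks.append(stripped)

    def text(self):
        return " ".join(self._chunks)


def article_document(url, html):
    """Build the record for one article (hypothetical schema)."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return {"url": url, "text": parser.text()}
```

A real crawler would also need to fetch pages and pick out the article body from the surrounding page chrome; both are left out here.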

How to run

To start the daemon ::

$ python bloomhack.py start

This starts inserting articles into the database (assuming the MongoDB server is running). It logs the data it inserts to a file named bloomhack.log.
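The logging behaviour described above could be set up roughly as follows with Python's standard `logging` module. This is a sketch under assumptions: the helper names `make_insert_logger` and `log_insert`, the logger name, and the log line format are all invented for illustration and may differ from what bloomhack.py actually writes.

```python
import logging


def make_insert_logger(path="bloomhack.log"):
    """Return a logger that appends one line per inserted article to `path`."""
    logger = logging.getLogger("bloomhack")  # logger name is an assumption
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_insert(logger, url):
    """Record that the article at `url` was stored (hypothetical message format)."""
    logger.info("inserted %s", url)
```

Calling `log_insert(make_insert_logger(), url)` after each successful database insert would produce a timestamped line per article.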

To stop the daemon ::

$ python bloomhack.py stop

Press [Ctrl]+C to stop the process.

Example run

$ python bloomhack.py start
PID: 3418 Daemon started successfully
$

$ python bloomhack.py stop
Daemon killed succesfully
$
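The run above shows `start` reporting a PID and `stop` killing the daemon. One common way to connect the two commands is a PID file. The sketch below shows that pattern, assuming a PID file is used, which this README does not confirm. The function names and the PID file path passed in are hypothetical.

```python
import os
import signal


def write_pid_file(path, pid=None):
    """Record the daemon's PID so a later `stop` can find it."""
    pid = os.getpid() if pid is None else pid
    with open(path, "w") as handle:
        handle.write(str(pid))
    return pid


def read_pid_file(path):
    """Return the recorded PID, or None if no usable PID file exists."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def stop_daemon(path):
    """Send SIGTERM to the recorded PID; return True if a signal was sent."""
    pid = read_pid_file(path)
    if pid is None:
        return False
    try:
        os.kill(pid, signal.SIGTERM)
        sent = True
    except ProcessLookupError:
        sent = False  # stale PID file: the process is already gone
    os.remove(path)
    return sent
```

With this layout, `start` would call `write_pid_file` once the process has detached, and `stop` would call `stop_daemon` with the same path.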
