GowthamParamasivam/SkalTrial

SKAL TRIAL SCRAPING

Create a new project:

scrapy startproject projectname

Generate a spider for a target URL:

scrapy genspider spidername URL

Scrapy version used: 2.1.0

Splash with Scrapy, for accessing JavaScript-rendered content.

Run the Splash browser in Docker:

sudo docker run -it -p 8050:8050 --name splash scrapinghub/splash

Scripting Splash requires some knowledge of Lua.
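Simple rendering does not need a Lua script: Splash exposes a render.html HTTP endpoint that returns the page HTML after JavaScript has run. The sketch below assumes the container started above is reachable on localhost:8050. The helper names and the default wait of 0.5 seconds are illustrative choices.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumes the Docker port mapping -p 8050:8050 on the local machine.
SPLASH_URL = "http://localhost:8050"


def splash_render_url(target_url, wait=0.5):
    """Build a render.html request URL for the local Splash instance."""
    query = urlencode({"url": target_url, "wait": wait})
    return f"{SPLASH_URL}/render.html?{query}"


def fetch_rendered_html(target_url, wait=0.5):
    """Return the page HTML after Splash has executed its JavaScript."""
    with urlopen(splash_render_url(target_url, wait), timeout=30) as resp:
        # Assumes a UTF-8 response body.
        return resp.read().decode("utf-8")
```

Calling fetch_rendered_html("https://example.com/") requires the Splash container to be running. Lua only becomes necessary for the scripted endpoints, such as execute.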

A more beginner-friendly option is Selenium with a headless browser: https://github.com/clemfromspace/scrapy-selenium

Chrome Selenium driver download: https://chromedriver.storage.googleapis.com/index.html?path=81.0.4044.69/
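As a rough guide, the scrapy-selenium README configures the middleware through a handful of project settings. The fragment below is a sketch of that configuration for headless Chrome, assuming chromedriver has been downloaded and placed on the PATH.

```python
# settings.py (fragment)
from shutil import which

SELENIUM_DRIVER_NAME = "chrome"
# which() returns None if chromedriver is not on the PATH;
# an absolute path to the downloaded driver also works here.
SELENIUM_DRIVER_EXECUTABLE_PATH = which("chromedriver")
SELENIUM_DRIVER_ARGUMENTS = ["--headless"]

DOWNLOADER_MIDDLEWARES = {
    "scrapy_selenium.SeleniumMiddleware": 800,
}
```

With these settings in place, a spider yields scrapy_selenium.SeleniumRequest objects instead of plain scrapy.Request objects for the pages that need a browser.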

Install pymongo:

pip3 install pymongo

Run MongoDB in Docker (MongoDB listens on port 27017 inside the container):

docker run -d --network some-network --name some-mongo -e MONGO_INITDB_ROOT_USERNAME=mongoadmin -e MONGO_INITDB_ROOT_PASSWORD=secret -p 27017:27017 mongo

Or start it with docker-compose, using the mongodb.yaml file:

docker-compose -f mongodb.yaml up -d
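Once MongoDB is running, scraped items are typically stored through a Scrapy item pipeline. The following is a minimal sketch, not this project's actual pipeline. It assumes MongoDB is reachable on localhost at its default port 27017 with the credentials from the docker run command. The MONGO_URI setting name, the database name, and the collection name are all hypothetical.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host="localhost", port=27017):
    """Build a MongoDB connection URI, escaping the credentials."""
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/"


class MongoPipeline:
    def __init__(self, uri, db_name="skaltrial", collection_name="items"):
        # db_name and collection_name are illustrative defaults.
        self.uri = uri
        self.db_name = db_name
        self.collection_name = collection_name

    @classmethod
    def from_crawler(cls, crawler):
        # MONGO_URI is a hypothetical setting; fall back to the Docker credentials.
        default_uri = build_mongo_uri("mongoadmin", "secret")
        return cls(crawler.settings.get("MONGO_URI", default_uri))

    def open_spider(self, spider):
        # Imported here so the module can be loaded before pymongo is needed.
        import pymongo

        self.client = pymongo.MongoClient(self.uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # Store a plain-dict copy and pass the item on to later pipelines.
        self.collection.insert_one(dict(item))
        return item
```

A pipeline like this is enabled by adding its import path to ITEM_PIPELINES in settings.py.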

Downloading images requires Pillow:

pip3 install pillow
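Pillow is needed because Scrapy's built-in ImagesPipeline uses it to process downloaded images. A sketch of the settings that enable that pipeline follows; the storage directory name is an assumption.

```python
# settings.py (fragment)
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}

# Hypothetical output directory; downloaded images are written beneath it.
IMAGES_STORE = "images"
```

Items passed through this pipeline need an image_urls field listing the images to fetch. The pipeline records the download results in an images field.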
