A small page scraper, NO DYNAMIC SCRAPING tho 😫

ZTF666/web-scraper

💩Scrapy💩

A small page scraper, still a work in progress. It does not scrape dynamically loaded content. This script uses:

  • Cheerio (JavaScript)
  • Axios

How to use

  • Install and run
npm install
npm run scrapy
  • Replace the URL with the website you want to scrape
axios.get("https://chouftv.ma/press");
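  • Replace the selector and elements with the ones you want
$(".description").each((index, element) => {
  // Text of the first child element (used as the title)
  const title = $(element).children().first().text();
  // href of the first direct <a> child
  const link = $(element).children("a").attr("href");
});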
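
The steps above can be put together roughly as follows. This is a minimal sketch, not the repository's actual script: the function names `scrape` and `toAbsoluteUrl` and the returned `{ title, link }` shape are hypothetical, and resolving relative links against the page URL is an added assumption about what you might want. It assumes `axios` and `cheerio` are installed via `npm install`.

```javascript
// Sketch only: `scrape`, `toAbsoluteUrl`, and the result shape are hypothetical names.

// Resolve a possibly relative href against the page URL.
// Returns null when the href is missing or cannot be parsed.
function toAbsoluteUrl(href, baseUrl) {
  if (!href) return null;
  try {
    return new URL(href, baseUrl).href;
  } catch {
    return null;
  }
}

async function scrape(pageUrl, selector = ".description") {
  // Loaded lazily so the helper above works without the dependencies.
  const axios = require("axios");
  const cheerio = require("cheerio");

  // Fetch the static HTML; content added later by client-side JavaScript is not included.
  const response = await axios.get(pageUrl);
  const $ = cheerio.load(response.data);

  const articles = [];
  $(selector).each((index, element) => {
    const title = $(element).children().first().text().trim();
    const link = toAbsoluteUrl($(element).children("a").attr("href"), pageUrl);
    articles.push({ title, link });
  });
  return articles;
}

// Usage:
// scrape("https://chouftv.ma/press").then(console.log).catch(console.error);
```

Returning an array instead of logging inside the loop makes it easier to reuse the results, for example to write them to a file.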

Screenshot

It looks weird because I used it on a local news website.
  • Limitations

    This is a shitty scraper, I'm still learning.

    It doesn't scrape links that haven't been loaded yet.

    Screenshot

In the screenshot above, the button literally translates to: LOAD MORE

Since I suck at this, I can't make it load more so I can grab the links,

so it only grabs the latest news articles.

That's a blessing and a curse, because if clicked, it will load EVERY ARTICLE WRITTEN

since the deployment of the website...
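
One possible way around the LOAD MORE button, sketched under heavy assumptions: buttons like this often request further results from a paginated URL, and if this site does the same, the pages could be fetched directly with a hard cap so it never pulls every article ever written. Whether such a URL exists here is not confirmed by anything above. `collectPages`, `fetchPage`, `maxPages`, and the `page` query parameter are all hypothetical.

```javascript
// Sketch only: every name here is hypothetical, and the site may not expose paginated URLs at all.

// Calls fetchPage(1), fetchPage(2), ... and concatenates the results.
// Stops at maxPages or as soon as a page comes back empty.
async function collectPages(fetchPage, maxPages = 3) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const items = await fetchPage(page);
    if (!items || items.length === 0) break; // nothing more to load
    all.push(...items);
  }
  return all;
}

// Example wiring (assumption: the site accepts a `page` query parameter):
// const axios = require("axios");
// const cheerio = require("cheerio");
// const fetchPage = async (page) => {
//   const { data } = await axios.get("https://chouftv.ma/press", { params: { page } });
//   const $ = cheerio.load(data);
//   return $(".description").map((i, el) => $(el).children("a").attr("href")).get();
// };
// collectPages(fetchPage, 2).then(console.log).catch(console.error);
```

Passing `fetchPage` in as a parameter keeps the paging loop independent of how a page is actually fetched. If the button turns out to need real clicks, only that one function has to change.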

Contact

You can contact me at [email protected]

License

💩Scrapy💩 is released under the MIT License.

Made with 💘 by a 👨‍💻 on a 💻 | 2020 | ZTF666 - N.EA