NAVER Cafe Crawler using pandas, tqdm, Selenium, and BeautifulSoup4
Caution
This crawler was created for educational and training purposes related to data analysis.
Users bear full legal responsibility for any use of this crawler.
- pandas
- tqdm
- Selenium
- BeautifulSoup4
Install them with pip:
pip install pandas tqdm selenium beautifulsoup4
Set your NAVER ID and password.
user_id = ''
user_pw = ''
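For reference, here is a minimal sketch of how these credentials could be used to log in with Selenium. The login URL and the element IDs (id, pw, log.login) are assumptions about NAVER's login form, not values taken from this crawler.

```python
# Hypothetical login sketch: the URL, element IDs, and the JavaScript
# form-filling workaround are assumptions, not taken from this repo.
from selenium import webdriver
from selenium.webdriver.common.by import By

user_id = ''
user_pw = ''

driver = webdriver.Chrome()
driver.get('https://nid.naver.com/nidlogin.login')

# Fill the form via JavaScript, since NAVER may block simulated keystrokes.
driver.execute_script(
    "document.getElementById('id').value = arguments[0];"
    "document.getElementById('pw').value = arguments[1];",
    user_id, user_pw,
)
driver.find_element(By.ID, 'log.login').click()
```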
Specify the name and ID of the NAVER Cafe you want to crawl.
cafe_name = ''
cafe_id = 0
Specify the menu IDs of the NAVER Cafe and the number of pages to collect for each, as (menu ID, number of pages) pairs.
Note
If you enter 15 as the number of pages, pages 1 through 15 will be crawled.
menu_id_page = [
(100, 15),
]
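Each pair expands to one article-list URL per page. The sketch below assumes NAVER Cafe's ArticleList.nhn query parameters (search.clubid, search.menuid, search.page); the actual URL scheme used by Prepare.py may differ.

```python
# Hypothetical sketch of turning (menu ID, page count) pairs into
# per-page article-list URLs. The query parameters are assumptions
# about NAVER Cafe's URL scheme, not taken from this repo.
cafe_id = 0
menu_id_page = [(100, 15)]

BASE = 'https://cafe.naver.com/ArticleList.nhn'

for menu_id, last_page in menu_id_page:
    for page in range(1, last_page + 1):
        url = (f'{BASE}?search.clubid={cafe_id}'
               f'&search.menuid={menu_id}&search.page={page}')
        print(url)
```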
Run Prepare.py to generate a file named [Menu ID]_link.csv, which contains the links to be crawled.
python3 Prepare.py
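Roughly, the Prepare step parses each listing page and writes the collected article links to a CSV. The sketch below assumes a hypothetical a.article link selector and a "link" column name; the selectors in Prepare.py may differ.

```python
# Hypothetical sketch of the Prepare step: extract article links from one
# listing page and save them to "<menu_id>_link.csv". The "a.article"
# selector and the "link" column name are assumptions, not from this repo.
import pandas as pd
from bs4 import BeautifulSoup

def extract_links(page_source: str) -> list[str]:
    soup = BeautifulSoup(page_source, 'html.parser')
    return ['https://cafe.naver.com' + a['href']
            for a in soup.select('a.article')]

def save_links(menu_id: int, links: list[str]) -> None:
    pd.DataFrame({'link': links}).to_csv(f'{menu_id}_link.csv', index=False)
```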
Provide the list of [Menu ID]_link.csv files to crawl.
file_name_list = [
('100_link.csv'),
]
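The crawler then reads the links back from these files. A minimal sketch, assuming each CSV has a "link" column as in the Prepare sketch above:

```python
# Hypothetical sketch: load every prepared link file into one list of URLs.
# The "link" column name is an assumption, not taken from this repo.
import pandas as pd

file_name_list = [('100_link.csv')]

links = []
for file_name in file_name_list:
    links.extend(pd.read_csv(file_name)['link'].tolist())
```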
Run Crawling.py to perform the crawling.
python3 Crawling.py
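At a high level, the crawling step opens each link, switches into the cafe's content iframe, and parses the post body and comments. The sketch below assumes the "cafe_main" iframe name and hypothetical CSS selectors; the actual selectors used by Crawling.py may differ.

```python
# Hypothetical sketch of the crawling step: open each article with Selenium,
# switch into the content iframe, and parse the post and its comments with
# BeautifulSoup. The iframe name and CSS selectors are assumptions.
from bs4 import BeautifulSoup
from tqdm import tqdm

def crawl_articles(driver, links):
    contents, comments = [], []
    for link in tqdm(links):
        driver.get(link)
        driver.switch_to.frame('cafe_main')  # posts are rendered in an iframe
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        title = soup.select_one('h3.title_text')
        body = soup.select_one('div.se-main-container')
        contents.append({
            'link': link,
            'title': title.get_text(strip=True) if title else '',
            'content': body.get_text(strip=True) if body else '',
        })
        for c in soup.select('span.text_comment'):
            comments.append({'link': link, 'comment': c.get_text(strip=True)})
        driver.switch_to.default_content()
    return contents, comments
```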
It will save:
- Post contents in [Menu ID]_content.csv
- Comments in [Menu ID]_comment.csv
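Once the run finishes, the results can be inspected with pandas. The column access below is only illustrative; the actual schema is whatever Crawling.py writes.

```python
# Example of loading the output files for a quick look.
import pandas as pd

content_df = pd.read_csv('100_content.csv')
comment_df = pd.read_csv('100_comment.csv')
print(content_df.head())
print(comment_df.head())
```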