NAVER Cafe Crawler

A NAVER Cafe crawler built with pandas, tqdm, Selenium, and BeautifulSoup4.

Caution

This crawler was created for educational and training purposes related to data analysis.
Users bear full legal responsibility for any use of this crawler.

Requirements

pandas
tqdm
Selenium
BeautifulSoup4

Install them using pip.

pip install pandas tqdm selenium beautifulsoup4

Configuring Config.py

Set your NAVER ID and password.

user_id = ''
user_pw = ''

Specify the name and ID of the NAVER Cafe you want to crawl.

cafe_name = ''
cafe_id = 0
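
Before running the scripts, it can help to confirm that none of the four settings above still holds its empty default. The following is a minimal sketch, not part of this repository: the function name `missing_settings` is hypothetical, and it assumes that `0` is only a placeholder for `cafe_id`.

```python
def missing_settings(user_id, user_pw, cafe_name, cafe_id):
    """Return the names of settings that still hold their empty defaults.

    Hypothetical helper for illustration; it is not part of Config.py.
    """
    missing = []
    if not user_id:
        missing.append("user_id")
    if not user_pw:
        missing.append("user_pw")
    if not cafe_name:
        missing.append("cafe_name")
    # Assumption: cafe_id = 0 is a placeholder, not a real cafe ID.
    if cafe_id == 0:
        missing.append("cafe_id")
    return missing


# Placeholder values for illustration only; real values belong in Config.py.
problems = missing_settings(user_id="", user_pw="", cafe_name="", cafe_id=0)
if problems:
    print("Fill in these settings first:", ", ".join(problems))
```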

Running Prepare.py

Specify the menu ID of the NAVER Cafe and the number of pages to collect.
Each entry is a tuple of the form (Menu ID, Number of pages).

Note

For example, if you enter 15 as the number of pages, it will crawl from page 1 to page 15.

menu_id_page = [
    (100, 15),
]

Run Prepare.py to generate a file named [Menu ID]_link.csv, which contains the links to be crawled.

python3 Prepare.py
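
The page-count convention and the link file naming described above can be sketched as follows. This is an illustration only and may differ from how Prepare.py is actually organized; the helper name `expand_pages` is hypothetical.

```python
def expand_pages(menu_id_page):
    """Expand (menu ID, number of pages) tuples into per-menu page lists.

    Hypothetical helper for illustration; Prepare.py may work differently.
    """
    plan = {}
    for menu_id, num_pages in menu_id_page:
        # Pages are 1-based and inclusive: 15 means pages 1 through 15.
        plan[menu_id] = list(range(1, num_pages + 1))
    return plan


menu_id_page = [
    (100, 15),
]

for menu_id, pages in expand_pages(menu_id_page).items():
    # Each menu gets its own link file, named [Menu ID]_link.csv.
    print(f"menu {menu_id}: pages {pages[0]} to {pages[-1]}, links in {menu_id}_link.csv")
```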

Running Crawling.py

Provide the list of [Menu ID]_link.csv files to crawl.

file_name_list = [
    '100_link.csv',
]

Run Crawling.py to perform crawling.

python3 Crawling.py

It will save:

Post contents in [Menu ID]_content.csv
Comments in [Menu ID]_comment.csv
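
The relationship between a link file and its two output files can be sketched as follows. This only mirrors the naming pattern described above; the helper name `output_names` is hypothetical and is not taken from Crawling.py.

```python
from pathlib import Path


def output_names(link_file):
    """Map '[Menu ID]_link.csv' to its content and comment file names.

    Hypothetical helper; it only mirrors the naming pattern in this README.
    """
    name = Path(link_file).name
    suffix = "_link.csv"
    if not name.endswith(suffix):
        raise ValueError(f"expected a name ending in {suffix!r}: {link_file}")
    menu_id = name[: -len(suffix)]
    return f"{menu_id}_content.csv", f"{menu_id}_comment.csv"


file_name_list = [
    '100_link.csv',
]

for link_file in file_name_list:
    content_file, comment_file = output_names(link_file)
    print(f"{link_file}: posts in {content_file}, comments in {comment_file}")
```

Raising an error for names that do not end in `_link.csv` is a design choice of this sketch, so that a mistyped entry in the list is caught before any crawling starts.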
