This project was created for the Unstructured Text Analysis course at Central European University. Special thanks to Eduardo Ariño de la Rubia, the professor of the course, from whom I learned a lot. The course is based on the wonderful book Text Mining with R: A Tidy Approach.

Technical introduction

The whole analysis is implemented in R and is available in this GitHub repository. The code has two main parts:

Web scraping part
Unstructured text analysis part

Throughout the whole process, I tried to follow the main principles of clean code, so anyone with R knowledge should be able to follow it easily. There is plenty of room for further development in this project, so if you are interested, feel free to contact me (David Utassy) about contributing. I will not show the whole code in this document, but I will highlight the most important snippets.
