GitHub - aaronmams/rHD-Vignette-Text-Mining: This Vignette illustrates some of R's text mining functionalities within the context of a text classification example.

Code

This repository contains three examples that use various R features to analyze text data. The examples are contained in the following .Rmd files:

Twitter-Scraping-Example.Rmd. This is a very simple example of using the rtweet package to harvest some data from Twitter's public API.
Sentiment-Analysis-with-Tweets.Rmd. This is another relatively simple illustration of how to use the tm package to parse text data and do some simple analysis (word counts).
Text-Mining-with-Supreme-Court-Opinions.Rmd. This is a slightly more involved example that has a few interesting features:

A. It uses methods from the pdftools package to read data into R from .pdf files.
B. It uses some elements of functional programming to parse the text from these .pdf files and make it usable.
C. It uses methods from the tidytext package to tidy the text data and do some analysis (again, word counts) with text from Supreme Court opinions.
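As a rough illustration of step C, here is a minimal sketch of a tidytext word count. It is not taken from the .Rmd files: the `pages` table and its two sentences are made-up stand-ins for text extracted from an opinion, and the choice to drop stop words is an assumption about what a typical word-count analysis would do.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for text pulled from a .pdf file (one row per page).
pages <- tibble(
  page = 1:2,
  text = c(
    "The Court holds that the statute applies.",
    "The judgment of the Court is affirmed."
  )
)

word_counts <- pages %>%
  unnest_tokens(word, text) %>%            # one lowercased word per row
  anti_join(stop_words, by = "word") %>%   # drop common words such as "the"
  count(word, sort = TRUE)                 # tally words, most frequent first

print(word_counts)
```

`unnest_tokens()` produces the one-token-per-row layout that tidytext is built around, so the counting itself is ordinary dplyr.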

Data

The data dependencies for these examples include one .csv file and ten .pdf files. These files are included in the "data" directory in this project. Here is a brief description of these data files:

Sentiment.csv is a .csv file that I obtained from Kaggle. It contains tweets collected during a GOP debate in Ohio for the 2016 Presidential nomination.
There are 10 .pdf files that have a common naming convention in that they all start with "slip-2019-19-...." These are Supreme Court "slip opinions". Each .pdf file corresponds to a unique case that was heard by the Supreme Court during the 2019 session. Per the supremecourt.gov website:

Slip opinions are the first version of the Court’s opinions posted on this website. A “slip” opinion consists of the majority or principal opinion, any concurring or dissenting opinions written by the Justices, and a prefatory syllabus prepared by the Reporter’s Office that summarizes the decision. The slip opinions collected here are those issued during October Term 2019 (October 07, 2019, through October 04, 2020).
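For readers who want to see how the slip-opinion .pdf files might be loaded, the following is a minimal sketch, not the code from the .Rmd file. It assumes the files sit in a "data" directory relative to the working directory and matches them with the loose, hypothetical pattern `^slip-.*\.pdf$`; the variable names are illustrative only.

```r
library(pdftools)
library(purrr)

# Assumed location and file-name pattern; adjust to match the actual files.
pdf_paths <- list.files("data", pattern = "^slip-.*\\.pdf$", full.names = TRUE)

# pdf_text() returns a character vector with one string per page,
# so map() yields a list with one such vector per opinion.
opinion_pages <- map(pdf_paths, pdf_text)

# Collapse the pages of each opinion into a single string.
opinion_text <- map_chr(opinion_pages, paste, collapse = "\n")

length(opinion_text)  # number of opinions read
```

Using `map()` and `map_chr()` instead of an explicit loop is one way to apply the same parsing step to every file, in keeping with the functional style mentioned above.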
