Text as Data

Emory University / QTM 340 / Fall 2024

What does it mean to turn text into data? Which data-scientific techniques are commonly used to analyze text? How are they applied in the humanities and social sciences? How are they applied in the world? This course explores these questions by focusing on how popular methods of text analysis, including those involving large language models, can be used to pursue humanistic and social-scientific research questions. Additional methods covered include text classification, clustering, and topic modeling, as well as methods for creating, cleaning, and parsing textual datasets. Along the way, we will also discuss the ethical issues raised by our increasing reliance on large language models, as well as the people whose intellectual, physical, and emotional labor these models depend upon.

Introductory courses in computer science and in probability and statistics are recommended as prerequisites for this course. You will complete all class exercises and homework assignments in Python. I expect you to participate in class discussion and present your final project at the end of the semester. I will also require some short writing assignments.

Syllabus
Course Calendar
Course on Canvas (Emory students only)
Email me!

laurenfklein/QTM340-Fall24