Emory University / QTM 340 / Fall 2024
What does it mean to turn text into data? What data-scientific techniques are commonly used to analyze text? How are they applied in the humanities and social sciences? How are they applied in the world? This course explores these questions by focusing on how popular methods of text analysis, including those involving large language models, can be used to pursue humanistic and social-scientific research questions. Additional methods covered include text classification, clustering, and topic modeling, as well as methods for creating, cleaning, and parsing textual datasets. Along the way, we will also discuss the ethical issues raised by our increasing reliance on large language models, as well as the people whose intellectual, physical, and emotional labor those models depend upon.
Introductory courses in computer science and in probability and statistics are recommended prerequisites for this course. You will complete all class exercises and homework assignments in Python. I expect you to participate in class discussion and to present your final project at the end of the semester. I will also require some short writing assignments.
- Syllabus
- Course Calendar
- Course on Canvas (Emory students only)
- Email me!