Microcredential ekomex Introduction to AI-assisted text analysis
AI-assisted text analysis has become an essential tool in the social sciences. This hands-on workshop equips PhD students with practical skills to apply large language models (LLMs) in R, combining coding with critical reflection on methodological and ethical standards.
What Is This Course About?
The classification, scaling, and analysis of large amounts of text have become foundational tasks in the social sciences, enabling scholars to systematically examine political discourse, policy developments, public opinion, and more. This compact workshop is an intensive, hands-on seminar designed specifically for PhD students interested in applying AI-assisted text analysis using R. It provides practical training in working with both open-source and proprietary large language models (LLMs) to support a range of text analysis tasks such as classification or scaling. The focus is on developing the technical skills necessary to integrate LLMs into research workflows in a rigorous and transparent manner. Throughout the workshop, participants will engage in applied coding exercises while also reflecting on the ethical and methodological implications of using LLMs in research.
Learning Goals
- You will be able to identify and apply suitable AI-assisted text analysis techniques to address your own research questions using R
- You will learn how to work with both open-source and proprietary large language models (LLMs) for tasks such as text classification and scaling, and integrate them effectively into your research workflow
- You will understand key principles of reproducibility, validation, and ethical use of AI tools in academic research, and apply them to your own projects
- You will be familiar with key tools and packages for AI-assisted text analysis in R, and confident in implementing them through hands-on coding exercises
Recommended Readings for the Course
- Le Mens, G., & Gallego, A. (2025). Positioning Political Texts with Large Language Models by Asking and Averaging. Political Analysis, 1–9.
- Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks. Proceedings of the National Academy of Sciences of the United States of America, 120(30), 1–3.
- Alizadeh, M., Kubli, M., Samei, Z., Dehghani, S., Zahedivafa, M., Bermeo, J. D., Korobeynikova, M., & Gilardi, F. (2025). Open-source LLMs for text annotation: a practical guide for model setting and fine-tuning. Journal of Computational Social Science, 8(1), 1–25.
Assignments for the Course
Two coding challenges, to be completed in class
Schedule
- Pre-Course Preparation (available 3 weeks before course start)
Asynchronous materials (approx. 90 minutes total):
Introduction to LLMs, text-as-data, and R-based workflows
Set-up instructions for R, RStudio, and required packages
Pre-readings on ethical and methodological considerations
Students are expected to complete these materials before Day 1
Materials and readings will be made available on Moodle.
- Day 1: Foundations of AI-Driven Text Analysis in R
90’ asynchronous materials
Introduction to LLMs: Types, capabilities, and limitations
Overview of workshop
90’ synchronous lab
Introduction to text analysis in R
Hands-on session: Getting started with text analysis in R
60’ independent small group learning
Reflection on ethical considerations and methodological implications of working with LLMs
60’ office hour
Live Q&A, troubleshooting set-up in R, and one-on-one consultations
- Day 2: Applied LLMs and Text Classification
90’ asynchronous materials
Practical walkthrough: LLM API and local set-up (e.g., ChatGPT, Ollama) in R
Overview of common tasks: classification, extraction, and scaling
90’ synchronous lab (Zoom)
Hands-on session: Running classification tasks with different LLMs
Validating and comparing model outputs
Coding challenge (Assignment 1)
60’ independent group learning
Working with proprietary and open-source LLMs in R
Compare outputs of different LLMs on a basic classification task
60’ office hour
Feedback on Assignment 1 and help with LLM API/local setup
- Day 3: Integration, Reproducibility, and Ethics
90’ asynchronous materials
Principles of reproducibility and transparent workflows in AI text analysis
Ethical considerations and biases in LLM-assisted research
90’ synchronous lab
Hands-on session: Building reproducible Quarto files for text analysis
Coding challenge: Assignment 2
60’ independent group learning
Peer review of coding challenge scripts
Final reflection activity on ethical use and transparency of LLMs in research
60’ office hour
Final consultations, feedback on Assignment 2
In addition, students should plan for sufficient time to do the assigned daily homework and read the assigned literature in advance.
Who Is Your Instructor?
Seraphine F. Maerz is a political scientist and Lecturer at the University of Melbourne, specializing in computational social science and quantitative methods. Her research focuses on democracy, authoritarianism, and political communication, with a methodological emphasis on text analysis and the application of large language models in political research. See her Google Scholar profile (https://scholar.google.com/citations?user=Bp3m4N0AAAAJ&hl=de) for recent publications. She is co-founder of QuantLab (https://quantilab.github.io/). More information and regular updates about her work can be found on her website (https://seraphinem.github.io/).
LinkedIn: https://www.linkedin.com/in/dr-seraphine-f-maerz-410286327/