Microcredential ekomex Fine-tuning Large Language Models (LLMs)
Registration: Online registration is not available.
AI-assisted text analysis has become an essential tool in the social sciences. This short course equips social science PhD students with the theoretical foundations and hands-on skills to fine-tune open-source large language models (LLMs) for domain-specific text analysis.
What Is This Course About?
This short course introduces PhD students in the social sciences to fine-tuning open-source large language models (LLMs) for custom text analysis. Fine-tuning improves the performance of LLMs on domain-specific tasks and requires relevant data to adapt the model. Combining theoretical input with hands-on training, the course covers models like BERT, parameter-efficient methods such as LoRA, and tools including R, Python, and Google Colab. Key topics include data preparation, hyperparameter tuning, evaluation of model performance, best practices, troubleshooting, and integration into research workflows.
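To give a flavour of the hands-on part, here is a minimal sketch of parameter-efficient (LoRA) fine-tuning of a BERT classifier using the Hugging Face transformers, peft, and datasets libraries, which can be run in Google Colab. The two-example dataset, model choice, and all hyperparameters are placeholders for illustration, not course materials.

```python
# Minimal LoRA fine-tuning sketch (assumes transformers, peft, and datasets
# are installed, e.g. via pip in Google Colab). Dataset and settings are placeholders.
from datasets import Dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Attach small trainable LoRA adapters; the pretrained BERT weights stay frozen,
# which keeps fine-tuning feasible on a single free Colab GPU.
lora_config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16,
                         lora_dropout=0.1, target_modules=["query", "value"])
model = get_peft_model(model, lora_config)

# Tiny placeholder dataset; in the course you would use your own labelled texts.
train_data = Dataset.from_dict({
    "text": ["The reform was widely supported.", "The policy failed badly."],
    "label": [1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-bert", num_train_epochs=3,
                           per_device_train_batch_size=8, learning_rate=2e-4),
    train_dataset=train_data,
)
trainer.train()
```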
Learning Goals
- You will understand the fundamentals of open-source large language models (LLMs).
- You will learn techniques to fine-tune open-source LLMs for specific text analysis tasks.
- You will learn techniques to evaluate the performance of fine-tuned LLMs.
- You will gain practical experience in applying fine-tuned LLMs to analyze text data.
- You will explore ethical considerations and best practices in LLM fine-tuning.
Recommended Readings for the Course
- Alizadeh, M., Kubli, M., Samei, Z., Dehghani, S., Zahedivafa, M., Bermeo, J. D., Korobeynikova, M., & Gilardi, F. (2025). Open-source LLMs for text annotation: a practical guide for model setting and fine-tuning. Journal of Computational Social Science, 8(1), 1–25.
- Bucher, M. J. J., & Martini, M. (2024). Fine-Tuned “Small” LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification. http://arxiv.org/abs/2406.08660
- Le Mens, G., & Gallego, A. (2025). Positioning Political Texts with Large Language Models by Asking and Averaging. Political Analysis, 1–9.
Assignments for the Course
One coding challenge, to be completed in class.
Schedule
- Day 1 – Open-source LLMs and overview of fine-tuning workflows
  - 90’ asynchronous materials
    - What are open-source LLMs and how do they work?
    - Use cases for fine-tuning in the social sciences
    - Overview of commonly used fine-tuning techniques
  - 90’ synchronous lab
    - Prompt engineering
    - Building a labelled dataset
  - 60’ independent small group learning
    - Building your own labelled dataset
  - 60’ office hour
    - Q&A, one-on-one consultations on domain-specific fine-tuning
- Day 2 – Applied fine-tuning and performance evaluation
  - 90’ asynchronous materials
    - Performance evaluation of LLMs
    - Model selection and preparation
    - Hyperparameter adjustment
  - 90’ synchronous lab
    - Students implement a complete fine-tuning pipeline
    - Model evaluation and application for text analysis
    - Compare fine-tuned model outputs with standard models (see the evaluation sketch after this schedule)
  - 60’ assignment and wrap-up
    - Coding challenge (in class)
    - Best practices and common pitfalls in fine-tuning
    - Transparency and reproducibility in AI-assisted workflows
  - 60’ office hour
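As a concrete illustration of the Day 2 evaluation step, the sketch below compares a fine-tuned model with an un-tuned baseline on a held-out labelled test set. It assumes scikit-learn is installed; the gold labels and both prediction lists are invented placeholders that you would replace with your own annotations and model outputs.

```python
# Hedged sketch of the Day 2 evaluation step: comparing a fine-tuned model
# against a standard (zero-shot) baseline on the same held-out labelled texts.
# `gold`, `finetuned_preds`, and `zeroshot_preds` are placeholder label lists.
from sklearn.metrics import accuracy_score, classification_report

gold = [1, 0, 1, 1, 0, 0]             # human annotations (held-out test set)
finetuned_preds = [1, 0, 1, 1, 0, 1]  # predictions from the fine-tuned model
zeroshot_preds = [1, 1, 0, 1, 0, 1]   # predictions from the un-tuned baseline

for name, preds in [("fine-tuned", finetuned_preds), ("zero-shot", zeroshot_preds)]:
    print(f"{name}: accuracy = {accuracy_score(gold, preds):.2f}")
    print(classification_report(gold, preds, zero_division=0))
```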
Who Is Your Instructor?
Seraphine F. Maerz is a political scientist and Lecturer at the University of Melbourne, specializing in computational social science and quantitative methods. Her research focuses on democracy, authoritarianism, and political communication, with a methodological emphasis on text analysis and the application of large language models in political research. See her Google Scholar profile for recent publications. She is a co-founder of QuantLab (https://quantilab.github.io/). More information and regular updates about her work can be found on her website: https://seraphinem.github.io/.
https://www.linkedin.com/in/dr-seraphine-f-maerz-410286327/