Recursive LLMs

Conducted an independent research project analyzing how recursively fine-tuning pretrained language models on their own outputs affects quantitative metrics (e.g., ROUGE scores) and qualitative text characteristics (toxicity, formality, and emotion intensity).
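As a minimal illustration of the quantitative side of that evaluation, ROUGE scores can be computed with the Hugging Face `evaluate` library; the prediction and reference strings below are placeholders, not data from the project:

```python
# Minimal ROUGE-scoring sketch using the Hugging Face `evaluate` library.
# The prediction/reference strings are illustrative placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the model produced this summary"],
    references=["a human wrote this reference summary"],
)
# `scores` is a dict of F-measures: rouge1, rouge2, rougeL, rougeLsum
print(scores)
```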
Final report: Generational Debt: The Degeneration of Self-Consuming Generative Language Models
Tech specs: Python, PyTorch, Hugging Face Inference API, sentiment and formality analysis, toxicity detection (Perspective API), Google Cloud Platform
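Toxicity detection with the Perspective API typically follows the pattern in its public quickstart; the sketch below mirrors that pattern rather than the project's exact client code, and the API key and example text are placeholders:

```python
# Sketch of toxicity scoring via the Perspective API (commentanalyzer),
# following the public quickstart. API_KEY is a placeholder.
from googleapiclient import discovery

API_KEY = "YOUR_PERSPECTIVE_API_KEY"  # placeholder

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
    static_discovery=False,
)

def toxicity_score(text: str) -> float:
    """Return Perspective's TOXICITY probability (0-1) for a piece of text."""
    response = client.comments().analyze(body={
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }).execute()
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("Have a nice day."))
```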
Abstract:
The data used to train large language models (LLMs) influences the decisions they make. Furthermore, the rate at which such models can produce content dramatically outpaces the rate at which a human being can. The present research examines the effects of training LLMs on synthetic data generated by a previous iteration of the same model, evaluating on the task of abstractive text summarization, although the synthetic-data pipeline is relevant to nearly all of the applications for which these models are used. This research is interested in both the quantitative and qualitative dimensions along which generated text evolves. Specifically, I consider the text characteristics of toxicity, formality, and emotion intensity, using three abstractive summarization datasets selected with the intention of representing different text characteristics to differing degrees. Results support existing evidence that recursive training can reduce output quality.
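To make the self-consuming pipeline concrete, here is a minimal sketch of the loop: each generation is fine-tuned on (article, summary) pairs, then produces the synthetic summaries that train the next generation. For self-containment it uses a local `transformers` model rather than the Hugging Face Inference API listed in the tech specs, and the checkpoint, toy corpus, and hyperparameters are all illustrative placeholders, not the project's configuration:

```python
# Minimal sketch of recursive ("self-consuming") fine-tuning for abstractive
# summarization. Checkpoint, toy data, and hyperparameters are placeholders.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

MODEL_NAME = "t5-small"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Toy corpus: generation 0 trains on human-written references; every later
# generation trains on the previous generation's own outputs.
articles = [
    "summarize: A long news article about topic one ...",
    "summarize: A long news article about topic two ...",
]
summaries = ["Reference summary one.", "Reference summary two."]

def fine_tune(model, articles, summaries, generation):
    """Fine-tune one generation on (article, summary) pairs."""
    def tokenize(batch):
        enc = tokenizer(batch["article"], truncation=True, max_length=512)
        enc["labels"] = tokenizer(batch["summary"], truncation=True,
                                  max_length=128)["input_ids"]
        return enc

    data = Dataset.from_dict({"article": articles, "summary": summaries})
    data = data.map(tokenize, batched=True)
    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir=f"gen_{generation}",
                                      num_train_epochs=1),
        train_dataset=data,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    return trainer.model

def generate_summaries(model, articles):
    """Have the current generation summarize every article."""
    outputs = []
    for article in articles:
        inputs = tokenizer(article, return_tensors="pt", truncation=True)
        inputs = {k: v.to(model.device) for k, v in inputs.items()}
        with torch.no_grad():
            ids = model.generate(**inputs, max_new_tokens=64)
        outputs.append(tokenizer.decode(ids[0], skip_special_tokens=True))
    return outputs

for generation in range(5):
    model = fine_tune(model, articles, summaries, generation)
    summaries = generate_summaries(model, articles)  # feeds generation N+1
```

The key design point is the last line of the loop: the training summaries are replaced by the model's own outputs each pass, which is the mechanism the abstract associates with the observed degradation.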
For more information, you can check out:
- The GitHub repo
- The final research report