News Article March 25, 2025

NAACL 2025 paper acceptance

Streamlining LLMs: Adaptive Knowledge Distillation for Tailored Language Models

Prajvi Saxena, Sabine Janzen, Wolfgang Maass:

Large language models (LLMs) like GPT-4 and LLaMA-3 offer transformative potential across industries, e.g., enhancing customer service, revolutionizing medical diagnostics, or identifying crises in news articles. However, deploying LLMs faces challenges such as limited training data, high computational costs, and issues with transparency and explainability. Our research focuses on distilling compact, parameter-efficient tailored language models (TLMs) from LLMs for domain-specific tasks with comparable performance. Current approaches like knowledge distillation, fine-tuning, and model parallelism address computational efficiency but lack hybrid strategies to balance efficiency, adaptability, and accuracy. We present ANON, an adaptive knowledge distillation framework that integrates knowledge distillation with adapters to generate computationally efficient TLMs without relying on labeled datasets. ANON uses cross-entropy loss to transfer knowledge from the teacher's outputs and internal representations, while employing adaptive prompt engineering and a progressive distillation strategy for phased knowledge transfer. We evaluated ANON's performance in the crisis domain, where accuracy is critical and labeled data is scarce. Experiments showed that ANON outperforms recent knowledge distillation approaches, both in the performance of the resulting TLM and in reducing the computational costs of training, while maintaining accuracy comparable to LLMs for domain-specific applications.
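To make the idea of transferring knowledge through a cross-entropy loss concrete, the following is a minimal sketch of the general soft-target distillation loss, in which a student's output distribution is compared against a teacher's. It is not ANON's implementation: the function names, the example logits, and the temperature parameter are assumptions made for illustration, and the sketch covers only output-level transfer, not internal representations, adapters, prompt engineering, or the progressive schedule.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft targets.

    The temperature is an assumed hyperparameter (common in distillation);
    the abstract does not say whether ANON uses one.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical logits over three classes, made up for illustration.
teacher = [2.0, 0.5, -1.0]
aligned_student = [1.9, 0.6, -1.1]      # roughly agrees with the teacher
misaligned_student = [-1.0, 0.5, 2.0]   # ranks the classes in reverse

loss_aligned = soft_cross_entropy(teacher, aligned_student)
loss_misaligned = soft_cross_entropy(teacher, misaligned_student)
print(loss_aligned < loss_misaligned)
```

The loss is smallest when the student reproduces the teacher's distribution, which is why minimizing it needs no human-provided labels: the teacher's outputs act as the training signal.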

Other News

Key competency courses "Project Management" and "Generative AI for Students" in semester break 2025

The ISS chair is offering two courses on the key competencies "Project Management" and "Generative AI for Students" in the late summer semester 2025.

Guest Lecture Data Science: Prof. Dr. Peter Orth on Quantum Computing, 11.07.25, 10:15 - 11:45

On Friday, July 11th, we welcome Prof. Dr. Peter Orth in Room 0.18 (Building B4 1) at 10:15 - 11:45. Prof. Orth will give a guest lecture on "Quantum data and how to learn from it".

Guest Lecture Data Science: Dr. Ulrich Neugebauer, Deka Investment, 13.06.25, 10:15 - 11:45

On Friday, June 13, we welcome Dr. Ulrich Neugebauer in Room 0.18 (Building B4, 1) at 10:15 - 11:45. Dr. Neugebauer will give a guest lecture on "Applications of AI in Asset Management". As part of the Data Science Course, this talk will explore the automated use of digital information systems and AI in asset management, examining their significance and practical benefits for fund management.
