Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/12742

| Title: | Multilingual Domain Adaptation for Speech Recognition Using LLMs |
| Authors: | Ulu, Elif Nehir; Derya, Ece; Tümer, Duygu; Demirel, Berkan; Karamanlıoğlu, Alper |
| Keywords: | Domain Adaptation Large Language Model Large Language Models Multilingual Speech Recognition Automatic Speech Recognition Whisper Computational Linguistics Data Acquisition Digital Storage Drive Parameters Healthcare Domains High Quality Labels Classifieds Speech Communication Speech Recognition Tuning Language Model |
| Publisher: | Springer Science and Business Media Deutschland GmbH |
| Abstract: | We present a practical pipeline for multilingual domain adaptation in automatic speech recognition (ASR) that combines the Whisper model with large language models (LLMs). Using Aya-23-8B, Common Voice transcripts in 22 languages are automatically classified into the Law and Healthcare domains, producing high-quality domain labels at a fraction of the manual cost. These labels drive parameter-efficient (LoRA) fine-tuning of Whisper and deliver consistent relative Word Error Rate (WER) reductions of up to 14.3% for languages that contribute at least 800 in-domain utterances. A data-volume analysis reveals a clear breakpoint: gains become reliably large once that 800-utterance threshold is crossed, while monolingual tuning still rescues performance in truly low-resource settings. The workflow therefore shifts the key success factor from expensive hand labelling to scalable data acquisition, and can be replicated in new domains with minimal human intervention. © 2025 Elsevier B.V., All rights reserved. |
| Description: | Siemens Healthineers AG |
| URI: | https://doi.org/10.1007/978-3-032-02548-7_32; https://hdl.handle.net/20.500.11851/12742 |
| ISBN: | 9789819698936; 9789819698042; 9789819698110; 9789819698905; 9789819512324; 9783032026019; 9783032008909; 9783031915802; 9789819698141; 9783031984136 |
| ISSN: | 1611-3349; 0302-9743 |
| Appears in Collections: | Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection |
	Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
