Journal The Transactions of the Korean Institute of Electrical Engineers (대한전기학회, The Korean Institute of Electrical Engineers)
Title Early Dementia Detection from Korean Speech Using wav2vec 2.0
Authors 김민지(Minji Kim) ; 정경용(Kyungyong Chung)
DOI https://doi.org/10.5370/KIEE.2026.75.7.1562
Page pp.1562-1570
Keywords Early dementia detection; Transformer; wav2vec 2.0; Self-Supervised Learning; Digital biomarker
Abstract We propose a wav2vec 2.0-based model for early dementia detection from everyday Korean speech and validate it on the AI Hub dataset of 5,769 recordings from 1,002 elderly speakers, each labeled as normal cognition, mild cognitive impairment (MCI), or Alzheimer's disease (AD). The model consists of a wav2vec 2.0 backbone (multilingual self-supervised XLSR-53 pretraining, adapted for Korean ASR), with its CNN feature encoder and 24-layer Transformer, followed by an attentive statistics pooling head. Fine-tuned with focal loss, it handles spontaneous speech, environmental noise, and class imbalance. Under five-fold cross-validation, speaker-level prediction (soft voting over ten 8-second segments per speaker) achieves 94.5% accuracy, 92.9% recall, and a ROC-AUC of 0.958, outperforming handcrafted-feature baselines, the Audio Spectrogram Transformer, and an English-pretrained HuBERT. Future work will explore knowledge distillation for on-device inference, multimodal fusion with linguistic features, and explainable AI for real-time biomarker assessment.
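As a rough illustration of two ideas named in the abstract, the sketch below shows the standard focal loss for a single sample, FL = -(1 - p_t)^gamma * log(p_t), and speaker-level soft voting, which averages per-segment class probabilities before taking the argmax. The abstract gives no hyperparameters, so everything here is an assumption made for illustration: the value gamma = 2.0, the omission of any class-weighting term, the label order in `CLASSES`, and the function names `focal_loss` and `soft_vote`. It is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple

# Label order is an assumption for this sketch, not taken from the paper.
CLASSES = ("NC", "MCI", "AD")


def focal_loss(probs: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma = 0 reduces to ordinary cross-entropy; larger gamma down-weights
    samples the model already classifies confidently.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def soft_vote(segment_probs: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Average class probabilities over a speaker's segments.

    Returns the index of the winning class and the averaged probabilities.
    """
    if not segment_probs:
        raise ValueError("at least one segment is required")
    n_segments = len(segment_probs)
    n_classes = len(segment_probs[0])
    mean = [
        sum(p[c] for p in segment_probs) / n_segments for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: mean[c])
    return label, mean


if __name__ == "__main__":
    # Ten hypothetical 8-second segments from one speaker (made-up numbers).
    segments = [[0.6, 0.3, 0.1]] * 4 + [[0.2, 0.7, 0.1]] * 6
    label, mean = soft_vote(segments)
    print(CLASSES[label], [round(m, 2) for m in mean])
```

Averaging probabilities, rather than counting per-segment votes, lets confident segments carry more weight than ambiguous ones, which is one common reason to prefer soft voting for speaker-level decisions.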