Authors:
Presenting Author: Yueqian Sun, PhD – Beijing Tiantan Hospital
Yueqian Sun, PhD – Beijing Tiantan Hospital
Qun Wang, MD – Beijing Tiantan Hospital, Capital Medical University
Rationale:
Large language models (LLMs) show promise in biomedical text analysis but have rarely been applied to pre-surgical epileptogenic zone (EZ) localization. We assessed the ability of LLMs to analyze multimodal clinical text from patients with drug-resistant epilepsy (DRE), with the aim of improving surgical decision-making.
Methods:
This retrospective analysis of 157 DRE patients (2020-2024) at Beijing Tiantan Hospital included medical records, EEG, MRI, FDG-PET, and MEG data (MEG available for 64 patients). Three LLMs (GPT-4.1, DeepSeek-R1, Claude 3.7 Sonnet) were used to perform: (1) EZ laterality classification, (2) probabilistic lobar localization (top 3 lobes), and (3) SEEG recommendation scoring (0-100). Performance was benchmarked against multidisciplinary team (MDT)-defined surgical resection sites. Modality ablation and stability analyses were conducted.
Results:
GPT-4.1 and DeepSeek-R1 each achieved 98.1% laterality accuracy, not significantly different from Claude 3.7 Sonnet's 97.5% (p > 0.05). For lobar localization, GPT-4.1 and Claude 3.7 Sonnet each achieved a median score of 70% (vs 60% for DeepSeek-R1, p < 0.001). All models maintained similar performance in MRI-negative cases (n=71). MRI reports and medical records were critical for localization. SEEG recommendation scores were significantly higher for patients who underwent SEEG than for those who did not (GPT-4.1: 90.00 [IQR: 85.00-90.00] vs 25.00 [IQR: 10.00-90.00], p < 0.001).
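
As a rough arithmetic check on the laterality figures, the sketch below finds which whole-patient counts round to the reported percentages. It assumes the denominator is all 157 patients, which the abstract does not state explicitly, and the function name `counts_matching` is hypothetical, introduced only for this illustration.

```python
def counts_matching(reported_pct: float, n_patients: int) -> list[int]:
    """Return every count k such that k / n_patients, expressed as a
    percentage rounded to one decimal place, equals reported_pct."""
    matches = []
    for k in range(n_patients + 1):
        pct = round(100.0 * k / n_patients, 1)
        # Tolerance comparison avoids relying on exact float equality.
        if abs(pct - reported_pct) < 1e-9:
            matches.append(k)
    return matches


# ASSUMPTION: all 157 patients were scored for laterality.
N_PATIENTS = 157

for label, pct in [("98.1%", 98.1), ("97.5%", 97.5)]:
    print(label, "->", counts_matching(pct, N_PATIENTS))
```

Under that assumption, 98.1% corresponds to 154 of 157 patients and 97.5% to 153 of 157, so the models would differ by a single case, which is consistent with the reported non-significant difference.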