An Integrated Machine Learning and Functional Analysis Approach for Resolution of Variants of Uncertain Significance (VUS) in PCDH19
Abstract number: 3.381
Submission category: 12. Genetics / 12A. Human Studies
Year: 2022
Submission ID: 2205105
Source: www.aesnet.org
Presentation date: 12/5/2022 12:00:00 PM
Published date: Nov 22, 2022, 05:28 AM
Authors: Gemma Carvill, PhD – Northwestern University; Jeffrey Calhoun, PhD – Northwestern University; Heather Mefford, MD, PhD – St. Jude Children's Research Hospital; Santiago Schnell, DPhil (Oxon), FRSC – University of Notre Dame; Vanessa Aguiar-Pulido, PhD – University of Miami; Margaret Ross, MD, PhD – Weill Cornell Medical College; Jack Parent, MD – University of Michigan; EpiMVP Consortium; Lori Isom, PhD – University of Michigan
This abstract has been invited for presentation during the Basic Science Poster Highlights poster session.
Rationale: Variants of uncertain significance (VUS) pose a significant challenge for the genetic diagnosis of epilepsy because they can be classified as neither pathogenic nor benign. To address this challenge, we established a highly integrated Epilepsy Multiplatform Variant Predictor (EpiMVP) Center Without Walls to develop precise, single-gene in silico pathogenicity prediction tools (EpiPred) for the most common epilepsy-associated genes. We are using supervised and unsupervised machine learning algorithms to discriminate pathogenic from benign variants. This approach will be bolstered by functional characterization of a subset of variants in multiple cellular and animal models. Functional data will then be incorporated into the machine learning model to improve classification performance iteratively. Here we report our investigation of PCDH19, a member of the protocadherin gene family. Pathogenic PCDH19 variants cause PCDH19-Epilepsy, which is limited to females except in the case of mosaic males.
Methods: We first curated a robust training set for the development of our classifier. This consisted of known benign or likely benign (BLB) missense PCDH19 variants from the general population (gnomAD; n=129) and known pathogenic or likely pathogenic (PLP) missense PCDH19 variants from multiple sources, including industry partners, ClinVar, and the literature (n=90). Variants were annotated with multiple features (n=39) using in silico measures related to evolutionary conservation and protein structure and stability, among others. Our approach employed standard unsupervised and supervised machine learning algorithms to develop an optimal single-gene classifier. The classifier outputs a score ranging from BLB (low) to PLP (high) and was used to score PCDH19 VUSs (n=267) collated from ClinVar and industry partners.
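The training and scoring workflow described in Methods can be sketched roughly as follows. This is a minimal illustration and not the EpiPred implementation: the abstract does not name the algorithm, so logistic regression from scikit-learn is an assumed stand-in, and all feature values are synthetic random numbers. Only the counts (129 BLB, 90 PLP, 267 VUS, 39 features) come from the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Counts taken from the abstract; every feature value below is synthetic.
N_BLB, N_PLP, N_VUS, N_FEATURES = 129, 90, 267, 39

rng = np.random.default_rng(0)

# Synthetic stand-in for the annotated training variants. PLP rows are
# shifted so the two classes are separable; real features would come from
# conservation, structure, and stability annotations.
X_blb = rng.normal(loc=0.0, scale=1.0, size=(N_BLB, N_FEATURES))
X_plp = rng.normal(loc=0.5, scale=1.0, size=(N_PLP, N_FEATURES))
X_train = np.vstack([X_blb, X_plp])
y_train = np.concatenate([np.zeros(N_BLB, dtype=int), np.ones(N_PLP, dtype=int)])

# Assumed model: logistic regression is an illustrative choice only,
# since the abstract does not say which supervised algorithm was used.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Score synthetic VUS: values near 0 lean BLB, values near 1 lean PLP.
X_vus = rng.normal(loc=0.25, scale=1.0, size=(N_VUS, N_FEATURES))
vus_scores = model.predict_proba(X_vus)[:, 1]
```

Here `vus_scores` holds one value per VUS on a 0 to 1 scale, mirroring the low (BLB) to high (PLP) score described above. Any real use would replace the synthetic arrays with annotated variant features.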
Results: Our classifier successfully discriminates PCDH19 missense BLB and PLP variants (ROC AUC > 0.9). We selected representative variants for functional modeling in multiple cellular and animal models.
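For readers less familiar with the metric, ROC AUC can be read as the probability that a randomly chosen PLP variant receives a higher score than a randomly chosen BLB variant. The sketch below computes it directly from that definition. The function name `roc_auc` and the example scores are hypothetical and are not data from this study.

```python
def roc_auc(plp_scores, blb_scores):
    """Fraction of (PLP, BLB) pairs in which the PLP variant scores higher.

    Ties count as half a win, which matches the usual rank-based definition.
    """
    wins = 0.0
    for p in plp_scores:
        for b in blb_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(plp_scores) * len(blb_scores))


# Hypothetical classifier scores, for illustration only.
plp = [0.95, 0.80, 0.70, 0.40]
blb = [0.10, 0.30, 0.50, 0.20]
auc = roc_auc(plp, blb)  # 15 of 16 pairs are ranked correctly: 0.9375
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.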
Conclusions: The first iteration of our EpiPred classifier can distinguish between BLB and PLP variants with high discriminative performance (ROC AUC > 0.9). After functional validation in the EpiMVP cellular and animal models, we will improve the predictive power of EpiPred and release the model to the wider scientific and epilepsy community for the interpretation of VUS. Our overarching goal is to develop gene-specific classifiers for additional epilepsy-associated genes, with an emphasis on non-ion channel genes with the highest volume of VUS and PLP variants identified in routine clinical genetic testing.
Funding: NIH/NINDS Center Without Walls (CWOW): U54NS117170