Shiny-ClinVar: An Interactive Online Tool to Explore Disease Genes and Variants Aggregated in the ClinVar Database and Its Application to Epilepsy
Abstract number :
1.400
Submission category :
12. Genetics / 12A. Human Studies
Year :
2018
Submission ID :
501762
Source :
www.aesnet.org
Presentation date :
12/1/2018 6:00:00 PM
Published date :
Nov 5, 2018, 6:00 PM
Authors :
Eduardo Perez-Palma, Cologne Center for Genomics, University of Cologne; Marie Gramm, Cologne Center for Genomics, Universität zu Köln; Patrick May, University of Luxembourg; and Dennis Lal, Cleveland Clinic
Rationale: The underlying etiology of epilepsy is unknown in 40%-50% of cases. However, approximately 30% of epilepsies are estimated to have a genetic component. Clinical genetic testing for epilepsies and related neurodevelopmental disorders has expanded exponentially in recent years, leading to an overwhelming number of patient variants with highly variable pathogenicity and heterogeneous phenotypes. While variant-level data are comprehensively aggregated in public databases such as ClinVar, the ability to answer broader questions such as “How many missense variants are associated with a specific disease or gene?” or “In which part of the protein are patient variants located?” is limited, particularly for users without bioinformatics expertise. Here, we aimed to develop a web application that provides disease-, gene- and protein-level statistics based on the entire ClinVar database through an intuitive, user-friendly online interface.
Methods: The whole ClinVar database was retrieved in table format from the FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/). Only canonical transcripts according to Ensembl v92 were considered. Protein sequences and domain boundaries were retrieved from the UniProt database (May 2018 release). Annotation, interactive summary statistics, visualization and variant mapping were developed with the Shiny framework from RStudio. App deployment, hosting and updates are performed with Google Cloud services.
Results: The Shiny-ClinVar online tool initially displays summary statistics for the entire database, which at the time of abstract submission held information on 396,583 patient variants. Summary statistics are available for 18,806 genes and 10,537 unique phenotypes, including counts for “Type of Variation”, “Molecular Consequence” and “Clinical significance”, which can also be used dynamically as filters in any combination.
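The kind of counting and filtering described above can be illustrated with a minimal sketch. It is not the Shiny-ClinVar implementation (which the abstract describes as built with Shiny), only an approximation in Python under stated assumptions: the column names (GeneSymbol, MolecularConsequence, ClinicalSignificance, PhenotypeList), the placeholder gene names and the toy rows are all hypothetical and do not represent actual ClinVar records.

```python
import csv
import io
from collections import Counter

# Toy tab-delimited table. Column names, gene names and rows are
# illustrative assumptions, not real ClinVar data.
HEADER = ["GeneSymbol", "MolecularConsequence", "ClinicalSignificance", "PhenotypeList"]
TOY_ROWS = [
    ["GENE_A", "missense", "Pathogenic", "Example epilepsy syndrome 1"],
    ["GENE_A", "missense", "Uncertain significance", "Example epilepsy syndrome 2"],
    ["GENE_B", "frameshift", "Pathogenic", "Example neonatal epilepsy"],
    ["GENE_C", "nonsense", "Pathogenic", "Example cancer syndrome"],
]
TOY_TABLE = "\n".join("\t".join(fields) for fields in [HEADER] + TOY_ROWS)


def load_rows(text):
    """Parse a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_rows(rows, phenotype_term=None, **equals):
    """Keep rows whose phenotype contains the term (case-insensitive)
    and whose columns match every exact-value filter in `equals`."""
    kept = []
    for row in rows:
        if phenotype_term and phenotype_term.lower() not in row["PhenotypeList"].lower():
            continue
        if any(row[column] != value for column, value in equals.items()):
            continue
        kept.append(row)
    return kept


def summarize(rows, column):
    """Count how many rows fall into each category of the given column."""
    return Counter(row[column] for row in rows)


rows = load_rows(TOY_TABLE)
epilepsy = filter_rows(rows, phenotype_term="epilepsy")
by_consequence = summarize(epilepsy, "MolecularConsequence")
by_gene = summarize(epilepsy, "GeneSymbol")
pathogenic_epilepsy = filter_rows(
    rows, phenotype_term="epilepsy", ClinicalSignificance="Pathogenic"
)

print(by_consequence.most_common())
print(by_gene.most_common())
print(len(pathogenic_epilepsy))
```

Filters are passed as keyword arguments so that they can be combined freely, mirroring the "any combination" behavior described for the tool's filters.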
As an example, the disease term query “epilepsy” yields 4,426 genetic variants, with missense variants (n=1,587) as the most common type. A total of 96 genes related to epilepsy are shown in ascending order with SCN1A (n=441) as the most recurrent. Similarly, a gene-wise SCN1A query indicates that 30% of missense variants are classified as variants of uncertain significance (VUS, n=132), which are widely distributed across the protein and associated with 23 disorders. For all searches and at any filtering stage, the user can download results for downstream analysis.
Conclusions: We have developed a novel online tool that interactively answers basic questions about genetic variants and their known relationships to disease. The user can identify within seconds the most important clinically tested disease genes by typing just the disease term and can immediately explore where these variants are located. The tool is available online at Shiny-ClinVar.broadinstitute.org for clinical use as well as for educational purposes.
Funding: Dravet Foundation Research Grant 2017