Search | BVS CLAP/SMR-OPS/OMS

Rare disease variant curation from literature: assessing gaps with creatine transport deficiency in focus.

Lyons, Erica L; Watson, Daniel; Alodadi, Mohammad S; Haugabook, Sharie J; Tawa, Gregory J; Hannah-Shmouni, Fady; Porter, Forbes D; Collins, Jack R; Ottinger, Elizabeth A; Mudunuri, Uma S.

BMC Genomics ; 24(1): 460, 2023 Aug 16.

Article in English | MEDLINE | ID: mdl-37587458

ABSTRACT

BACKGROUND: Approximately 4-8% of the world's population suffers from a rare disease. Rare diseases are often difficult to diagnose, and many do not have approved therapies. Genetic sequencing has the potential to shorten the current diagnostic process, increase mechanistic understanding, and facilitate research on therapeutic approaches, but it is limited by the difficulty of interpreting the pathogenicity of novel variants and of communicating known causative variants. It is unknown how many published rare disease variants are currently accessible in the public domain. RESULTS: This study investigated the translation of knowledge of variants reported in published manuscripts to publicly accessible variant databases. Variants, symptoms, biochemical assay results, and protein function from literature on the SLC6A8 gene associated with X-linked Creatine Transporter Deficiency (CTD) were curated and reported as a highly annotated dataset of variants with clinical context and functional details. Variants were harmonized, their availability in existing variant databases was analyzed, and pathogenicity assignments were compared with impact algorithm predictions. Of the pathogenic variants found in PubMed articles, 24% were not captured in any database used in this analysis, while only 65% of the published variants received an accurate pathogenicity prediction from at least one impact prediction algorithm. CONCLUSIONS: Despite being published in the literature, pathogenicity data on patient variants may remain inaccessible for genetic diagnosis, therapeutic target identification, mechanistic understanding, or hypothesis generation. Clinical and functional details presented in the literature are important for making pathogenicity assessments. Impact predictions remain imperfect but are improving, especially for single nucleotide exonic variants; however, such predictions are less accurate or unavailable for intronic and multi-nucleotide variants. Text mining workflows that use natural language processing to identify diseases, genes, and variants, combined with impact prediction algorithms and integrated with details on clinical phenotypes and functional assessments, might be a promising approach to scaling literature mining of variants and assigning correct pathogenicity. The curated variant list created by this effort includes context details to improve any such efforts on variant curation for rare diseases.
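The availability analysis described in this abstract can be pictured as a set comparison between literature-curated variants and the contents of each database. The following is only a minimal sketch under stated assumptions: the function name `coverage_gap`, the database labels, and the placeholder variant identifiers are hypothetical and are not taken from the study.

```python
def coverage_gap(published, databases):
    """Return the published variants found in no database, and their fraction.

    published: iterable of variant identifiers curated from the literature.
    databases: dict mapping a database label to a set of variant identifiers.
    All identifiers are assumed to be harmonized to one notation beforehand.
    """
    published = set(published)
    captured = set()
    for variants in databases.values():
        captured |= set(variants)
    missing = published - captured
    fraction = len(missing) / len(published) if published else 0.0
    return missing, fraction


# Placeholder identifiers, purely illustrative.
published = {"v1", "v2", "v3", "v4"}
databases = {"dbA": {"v1", "v2"}, "dbB": {"v2", "v3"}}
missing, fraction = coverage_gap(published, databases)
```

In practice the harmonization step matters more than the set arithmetic, because the same variant written in two notations would be counted as missing.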

Subject(s)

Creatine , Rare Diseases , Humans , Rare Diseases/genetics , Introns , Algorithms , Nucleotides

AVIA 3.0: interactive portal for genomic variant and sample level analysis.

Reardon, Hue V; Che, Anney; Luke, Brian T; Ravichandran, Sarangan; Collins, Jack R; Mudunuri, Uma S.

Bioinformatics ; 37(16): 2467-2469, 2021 Aug 25.

Article in English | MEDLINE | ID: mdl-33289511

ABSTRACT

SUMMARY: The Annotation, Visualization and Impact Analysis (AVIA) is a web application combining multiple features to annotate and visualize genomic variant data. Users can investigate functional significance of their genetic alterations across samples, genes and pathways. Version 3.0 of AVIA offers filtering options through interactive charts and by linking disease relevant data sources. Newly incorporated services include gene, variant and sample level reporting, literature and functional correlations among impacted genes, comparative analysis across samples and against data sources such as TCGA and ClinVar, and cohort building. Sample and data management is now feasible through the application, which allows greater flexibility with sharing, reannotating and organizing data. Most importantly, AVIA's utility stems from its convenience for allowing users to upload and explore results without any a priori knowledge or the need to install, update and maintain software or databases. Together, these enhancements strengthen AVIA as a comprehensive, user-driven variant analysis portal. AVAILABILITY AND IMPLEMENTATION: AVIA is accessible online at https://avia-abcc.ncifcrf.gov.

Subject(s)

Genetic Databases , Genetic Variation , Data Management , Genome , Genomics , Humans , Internet , Software

AVIA v2.0: annotation, visualization and impact analysis of genomic variants and genes.

Vuong, Hue; Che, Anney; Ravichandran, Sarangan; Luke, Brian T; Collins, Jack R; Mudunuri, Uma S.

Bioinformatics ; 31(16): 2748-50, 2015 Aug 15.

Article in English | MEDLINE | ID: mdl-25861966

ABSTRACT

As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA's improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants. AVAILABILITY AND IMPLEMENTATION: AVIA is freely available through the web at http://avia.abcc.ncifcrf.gov. CONTACT: Hue.Vuong@fnlcr.nih.gov SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genes , Genetic Variation , Molecular Sequence Annotation , Software , Genetic Databases , Internet

Guanine holes are prominent targets for mutation in cancer and inherited disease.

Bacolla, Albino; Temiz, Nuri A; Yi, Ming; Ivanic, Joseph; Cer, Regina Z; Donohue, Duncan E; Ball, Edward V; Mudunuri, Uma S; Wang, Guliang; Jain, Aklank; Volfovsky, Natalia; Luke, Brian T; Stephens, Robert M; Cooper, David N; Collins, Jack R; Vasquez, Karen M.

PLoS Genet ; 9(9): e1003816, 2013.

Article in English | MEDLINE | ID: mdl-24086153

ABSTRACT

Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
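The figure of 64 motifs follows from fixing G at the second position of a four-base window and letting the other three positions range over four bases each (4 × 4 × 4 = 64). A minimal sketch of how such contexts might be enumerated and tallied is shown here; the function names are hypothetical, and the sketch considers only a G on the given strand, whereas the study considers G•C base pairs (a mutated C would have to be read on the complementary strand).

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def ngnn_motifs():
    """Return all 64 5'-NGNN-3' motifs (G fixed at the second position)."""
    return [f"{a}G{b}{c}" for a, b, c in product(BASES, repeat=3)]


def count_mutated_motifs(sequence, mutated_positions):
    """Tally the NGNN context of each mutated G.

    mutated_positions are 0-based indices into sequence. Positions that are
    not G, or that lack a full four-base context (one base upstream, two
    downstream), are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for pos in mutated_positions:
        if pos < 1 or pos + 2 >= len(sequence):
            continue
        context = sequence[pos - 1:pos + 3]
        if context[1] == "G" and set(context) <= set(BASES):
            counts[context] += 1
    return counts


motifs = ngnn_motifs()
counts = count_mutated_motifs("AGCTTGGA", [1, 5, 6])
```

Raw counts like these would still need to be normalized by how often each motif occurs in the reference sequence before any frequencies could be compared.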

Subject(s)

Amino Acid Substitution/genetics , Inborn Genetic Diseases/genetics , Guanine , Neoplasms/genetics , Computational Biology , Neoplasm DNA/genetics , Inborn Genetic Diseases/pathology , Germ-Line Mutation , Humans , Molecular Models , Neoplasms/pathology , Nucleotide Motifs/genetics

Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.

Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M.

Nucleic Acids Res ; 41(Database issue): D94-D100, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23125372

ABSTRACT

The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
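To illustrate what a sequence-based motif prediction can look like, the sketch below scans one strand for a putative G-quadruplex-forming pattern: four runs of at least three guanines separated by loops of one to seven bases. This is a widely used heuristic and is an assumption here; the abstract does not state the thresholds that non-B DB applies, and the name `find_g4` is hypothetical.

```python
import re

# Four G-runs (>= 3 G each) separated by loops of 1-7 bases.
# Assumed heuristic, not necessarily the database's own definition.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4(sequence):
    """Return (start, end, motif) for non-overlapping matches on one strand.

    Coordinates are 0-based and half-open. The complementary strand
    (C-rich runs) is not scanned in this sketch.
    """
    sequence = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(sequence)]


hits = find_g4("TTGGGAGGGTGGGAGGGTT")
```

A fuller scan would also search the reverse complement and report overlapping candidates, which a single left-to-right regular-expression pass does not do.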

Subject(s)

DNA/chemistry , Nucleic Acid Databases , Animals , Computer Graphics , Dogs , Humans , Internet , Mice , Molecular Sequence Annotation , Nucleotide Motifs , Rats , Nucleic Acid Repetitive Sequences , Software , User-Computer Interface

Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M.

Nucleic Acids Res ; 39(Database issue): D383-91, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21097885

ABSTRACT

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.

Subject(s)

DNA/chemistry , Nucleic Acid Databases , Animals , Base Sequence , Dogs , Genomics , Humans , Macaca , Mice , Nucleic Acid Conformation , Pan troglodytes/genetics , Nucleic Acid Repetitive Sequences

Immuno-transcriptomic profiling of extracranial pediatric solid malignancies.

Brohl, Andrew S; Sindiri, Sivasish; Wei, Jun S; Milewski, David; Chou, Hsien-Chao; Song, Young K; Wen, Xinyu; Kumar, Jeetendra; Reardon, Hue V; Mudunuri, Uma S; Collins, Jack R; Nagaraj, Sushma; Gangalapudi, Vineela; Tyagi, Manoj; Zhu, Yuelin J; Masih, Katherine E; Yohe, Marielle E; Shern, Jack F; Qi, Yue; Guha, Udayan; Catchpoole, Daniel; Orentas, Rimas J; Kuznetsov, Igor B; Llosa, Nicolas J; Ligon, John A; Turpin, Brian K; Leino, Daniel G; Iwata, Shintaro; Andrulis, Irene L; Wunder, Jay S; Toledo, Silvia R C; Meltzer, Paul S; Lau, Ching; Teicher, Beverly A; Magnan, Heather; Ladanyi, Marc; Khan, Javed.

Cell Rep ; 37(8): 110047, 2021 Nov 23.

Article in English | MEDLINE | ID: mdl-34818552

ABSTRACT

We perform an immunogenomics analysis utilizing whole-transcriptome sequencing of 657 pediatric extracranial solid cancer samples representing 14 diagnoses, and additionally utilize transcriptomes of 131 pediatric cancer cell lines and 147 normal tissue samples for comparison. We describe patterns of infiltrating immune cells, T cell receptor (TCR) clonal expansion, and translationally relevant immune checkpoints. We find that tumor-infiltrating lymphocytes and TCR counts vary widely across cancer types and within each diagnosis, and notably are significantly predictive of survival in osteosarcoma patients. We identify potential cancer-specific immunotherapeutic targets for adoptive cell therapies including cell-surface proteins, tumor germline antigens, and lineage-specific transcription factors. Using an orthogonal immunopeptidomics approach, we find several potential immunotherapeutic targets in osteosarcoma and Ewing sarcoma and validate PRAME as a bona fide multi-pediatric cancer target. Importantly, this work provides a critical framework for immune targeting of extracranial solid tumors using parallel immuno-transcriptomic and -peptidomic approaches.

Subject(s)

Neoplasms/genetics , Neoplasms/immunology , Transcriptome/genetics , Adolescent , Neoplasm Antigens , Tumor Cell Line , Child , Preschool Child , Female , Gene Expression/genetics , Gene Expression Profiling/methods , Humans , Immune Checkpoint Proteins/genetics , Immune Checkpoint Proteins/immunology , Immunogenetics/methods , Adoptive Immunotherapy , Infant , Tumor-Infiltrating Lymphocytes/immunology , Male , T-Cell Antigen Receptors/genetics , T-Cell Antigen Receptors/immunology , Transcriptome/immunology , Tumor Microenvironment , Exome Sequencing/methods

Assessment of Coagulation Homeostasis in Blunt, Penetrating, and Thermal Trauma: Guidance for a Multicenter Systems Biology Approach.

Shupp, Jeffrey W; Brummel-Ziedins, Kathleen E; Cohen, Mitchell J; Freeman, Kalev; Hammamieh, Rasha; Mudunuri, Uma S; Orfeo, Thomas; Moffatt, Lauren T; Brownstein, Bernard H; Mann, Kenneth G; Jett, Marti; Pusateri, Anthony E.

Shock ; 52(1S Suppl 1): 84-91, 2019 Oct.

Article in English | MEDLINE | ID: mdl-30339633

ABSTRACT

INTRODUCTION: Providing care for traumatically injured patients makes it difficult to conduct research very soon after injury. Such studies also have inherent regulatory barriers to overcome. Here we outline a protocol for acute-phase enrollment of traumatically injured patients into a prospective observational clinical trial with precise and comprehensive sample acquisition in support of a systems biology approach to a research study. METHODS: Experts in trauma, burn, blood coagulation, computational biology, and integrative systems biology developed a prospective study that would capture the natural history of coagulation pathology after traumatic injury. Blood was sampled at admission and at serial time points throughout hospitalization. Concurrently, demographic and outcomes data were recorded and on-site point-of-care testing was implemented. Protocols were harmonized across sites, and sampling protocols were validated through demonstration of feasibility and sample quality assurance testing. A novel data integration platform was developed to store, visualize, and enable large-scale analysis of empirical and clinical data. Regulatory considerations were also addressed in protocol development. RESULTS: A comprehensive Manual of Operations (MOO) was developed and implemented at 3 clinical sites. After regulatory approval, the MOO was followed to collect 5,348 longitudinal samples from 1,547 patients. All samples were collected, processed, and stored per the MOO. Assay results and clinical data were entered into the novel data management platform for analyses. CONCLUSION: We used an iterative, interdisciplinary process to develop a systematic and robust protocol for comprehensive assessment of coagulation in traumatically injured patients. This MOO can be a template for future studies in the acute setting.

Subject(s)

Multiple Trauma/metabolism , Systems Biology/methods , Blood Coagulation/physiology , Female , Homeostasis , Humans , Male , Multiple Trauma/physiopathology , Prospective Studies

Knowledge and theme discovery across very large biological data sets using distributed queries: a prototype combining unstructured and structured data.

Mudunuri, Uma S; Khouja, Mohamad; Repetski, Stephen; Venkataraman, Girish; Che, Anney; Luke, Brian T; Girard, F Pascal; Stephens, Robert M.

PLoS One ; 8(12): e80503, 2013.

Article in English | MEDLINE | ID: mdl-24312478

ABSTRACT

As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.
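The MapReduce model mentioned in this abstract splits a query into a map step that emits key-value pairs, a shuffle step that groups values by key, and a reduce step that summarizes each group. The sketch below imitates those three steps in memory with plain Python; it does not use Hadoop, and the example query (which documents mention which gene symbols), the function names, and the sample texts are all hypothetical illustrations rather than the query or tools used in the paper.

```python
from collections import defaultdict


def map_phase(doc_id, text, gene_symbols):
    """Emit (gene, doc_id) for every gene symbol that appears as a token."""
    tokens = {tok.strip(".,;:()").upper() for tok in text.split()}
    for gene in gene_symbols:
        if gene.upper() in tokens:
            yield gene, doc_id


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(gene, doc_ids):
    """Collapse the grouped values into a sorted, de-duplicated list."""
    return gene, sorted(set(doc_ids))


def run_job(documents, gene_symbols):
    """Run map, shuffle and reduce sequentially over an in-memory corpus."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_phase(doc_id, text, gene_symbols))
    return dict(reduce_phase(gene, ids) for gene, ids in shuffle(pairs).items())


documents = {
    "d1": "Variants in SLC6A8 cause CTD.",
    "d2": "TP53 and SLC6A8 were sequenced.",
}
index = run_job(documents, ["SLC6A8", "TP53", "BRCA1"])
```

In a distributed setting the map calls would run in parallel on separate data partitions, which is where the time savings described in the abstract would come from; this sequential version only shows the data flow.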

Subject(s)

Data Mining/methods , Factual Databases , Humans
