Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 4 de 4
Filter
Add more filters










Database
Language
Publication year range
1.
Genome Biol ; 25(1): 100, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641812

ABSTRACT

Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.


Subject(s)
Metadata , Research Design , Reproducibility of Results
2.
bioRxiv ; 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38260256

ABSTRACT

Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.

3.
ArXiv ; 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37426450

ABSTRACT

Multiplexed Assays of Variant Effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines has led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.

4.
BMC Genomics ; 22(1): 872, 2021 Dec 04.
Article in English | MEDLINE | ID: mdl-34863095

ABSTRACT

BACKGROUND: Pediatric cancers typically have a distinct genomic landscape when compared to adult cancers and frequently carry somatic gene fusion events that alter gene expression and drive tumorigenesis. Sensitive and specific detection of gene fusions through the analysis of next-generation-based RNA sequencing (RNA-Seq) data is computationally challenging and may be confounded by low tumor cellularity or underlying genomic complexity. Furthermore, numerous computational tools are available to identify fusions from supporting RNA-Seq reads, yet each algorithm demonstrates unique variability in sensitivity and precision, and no clearly superior approach currently exists. To overcome these challenges, we have developed an ensemble fusion calling approach to increase the accuracy of identifying fusions. RESULTS: Our Ensemble Fusion (EnFusion) approach utilizes seven fusion calling algorithms: Arriba, CICERO, FusionMap, FusionCatcher, JAFFA, MapSplice, and STAR-Fusion, which are packaged as a fully automated pipeline using Docker and Amazon Web Services (AWS) serverless technology. This method uses paired end RNA-Seq sequence reads as input, and the output from each algorithm is examined to identify fusions detected by a consensus of at least three algorithms. These consensus fusion results are filtered by comparison to an internal database to remove likely artifactual fusions occurring at high frequencies in our internal cohort, while a "known fusion list" prevents failure to report known pathogenic events. We have employed the EnFusion pipeline on RNA-Seq data from 229 patients with pediatric cancer or blood disorders studied under an IRB-approved protocol. The samples consist of 138 central nervous system tumors, 73 solid tumors, and 18 hematologic malignancies or disorders. The combination of an ensemble fusion-calling pipeline and a knowledge-based filtering strategy identified 67 clinically relevant fusions among our cohort (diagnostic yield of 29.3%), including RBPMS-MET, BCAN-NTRK1, and TRIM22-BRAF fusions. Following clinical confirmation and reporting in the patient's medical record, both known and novel fusions provided medically meaningful information. CONCLUSIONS: The EnFusion pipeline offers a streamlined approach to discover fusions in cancer, at higher levels of sensitivity and accuracy than single algorithm methods. Furthermore, this method accurately identifies driver fusions in pediatric cancer, providing clinical impact by contributing evidence to diagnosis and, when appropriate, indicating targeted therapies.


Subject(s)
Genome , Neoplasms , Child , Genomics , Humans , Neoplasms/genetics , Sequence Analysis, DNA , Sequence Analysis, RNA
SELECTION OF CITATIONS
SEARCH DETAIL
...