Results 1 - 20 of 77
1.
Cell ; 153(4): 919-29, 2013 May 09.
Article in English | MEDLINE | ID: mdl-23663786

ABSTRACT

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.


Subject(s)
Algorithms , Human Genome , Mutation , Neoplasms/genetics , Chromosome Aberrations , Genome-Wide Association Study , Glioblastoma/genetics , Humans , Neoplasms/pathology
2.
Nat Methods ; 20(8): 1174-1178, 2023 08.
Article in English | MEDLINE | ID: mdl-37468619

ABSTRACT

Multiplexed antibody-based imaging enables the detailed characterization of molecular and cellular organization in tissues. Advances in the field now allow high-parameter data collection (>60 targets); however, considerable expertise and capital are needed to construct the antibody panels employed by these methods. Organ mapping antibody panels are community-validated resources that save time and money, increase reproducibility, accelerate discovery and support the construction of a Human Reference Atlas.


Subject(s)
Antibodies , Community Resources , Humans , Reproducibility of Results , Diagnostic Imaging
3.
Nucleic Acids Res ; 52(D1): D61-D66, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971305

ABSTRACT

The Cistrome Data Browser is a resource of ChIP-seq, ATAC-seq and DNase-seq data from humans and mice. It provides maps of the genome-wide locations of transcription factors, cofactors, chromatin remodelers, histone post-translational modifications and regions of chromatin accessible to endonuclease activity. Cistrome DB v3.0 contains approximately 45 000 human and 44 000 mouse samples with about 32 000 newly collected datasets compared to the previous release. The Cistrome DB v3.0 user interface is implemented as a single page application that unifies menu driven and data driven search functions and provides an embedded genome browser, which allows users to find and visualize data more effectively. Users can find informative chromatin profiles through keyword, menu, and data-driven search tools. Browser search functions can predict the regulators of query genes as well as the cell type and factor dependent functionality of potential cis-regulatory elements. Cistrome DB v3.0 expands the display of quality control statistics, incorporates sequence logos into motif enrichment displays and includes more expansive sample metadata. Cistrome DB v3.0 is available at http://db3.cistrome.org/browser.


Subject(s)
Chromatin , Protein Databases , Genomics , Software , Animals , Humans , Mice , Chromatin/genetics , Histones/genetics , Histones/metabolism , DNA Sequence Analysis , Transcription Factors/genetics , Transcription Factors/metabolism , Data Visualization , Internet , Genomics/methods
4.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36688709

ABSTRACT

SUMMARY: Gos is a declarative Python library designed to create interactive multiscale visualizations of genomics and epigenomics data. It provides a consistent and simple interface to the flexible Gosling visualization grammar. Gos hides technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebooks environments to enable new interactive analysis workflows. AVAILABILITY AND IMPLEMENTATION: Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos.


Subject(s)
Computational Biology , Geese , Animals , Genomics , Genome , Gene Library , Software
5.
Bioinformatics ; 39(2), 2023 02 03.
Article in English | MEDLINE | ID: mdl-36688700

ABSTRACT

SUMMARY: The regulation of genes by cis-regulatory elements (CREs) is complex and differs between cell types. Visual analysis of large collections of chromatin profiles across diverse cell types, integrated with computational methods, can reveal meaningful biological insights. We developed Cistrome Explorer, a web-based interactive visual analytics tool for exploring thousands of chromatin profiles in diverse cell types. Integrated with the Cistrome Data Browser database which contains thousands of ChIP-seq, DNase-seq and ATAC-seq samples, Cistrome Explorer enables the discovery of patterns of CREs across cell types and the identification of transcription factor binding underlying these patterns. AVAILABILITY AND IMPLEMENTATION: Cistrome Explorer and its source code are available at http://cisvis.gehlenborglab.org/ and released under the MIT License. Documentation can be accessed via http://cisvis.gehlenborglab.org/docs/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin , Epigenomics , DNA Sequence Analysis , Chromatin Immunoprecipitation Sequencing , Software , Genetic Databases
6.
Bioinformatics ; 37(Suppl_1): i59-i66, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252935

ABSTRACT

MOTIVATION: Molecular profiling of patient tumors and liquid biopsies over time with next-generation sequencing technologies and new immuno-profile assays are becoming part of standard research and clinical practice. With the wealth of new longitudinal data, there is a critical need for visualizations for cancer researchers to explore and interpret temporal patterns not just in a single patient but across cohorts. RESULTS: To address this need we developed OncoThreads, a tool for the visualization of longitudinal clinical and cancer genomics and other molecular data in patient cohorts. The tool visualizes patient cohorts as temporal heatmaps and Sankey diagrams that support the interactive exploration and ranking of a wide range of clinical and molecular features. This allows analysts to discover temporal patterns in longitudinal data, such as the impact of mutations on response to a treatment, for example, emergence of resistant clones. We demonstrate the functionality of OncoThreads using a cohort of 23 glioma patients sampled at 2-4 timepoints. AVAILABILITY AND IMPLEMENTATION: Freely available at http://oncothreads.gehlenborglab.org. Implemented in JavaScript using the cBioPortal web API as a backend. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
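
The Sankey view mentioned in this abstract rests on counting how many patients move from one category to another between consecutive timepoints. The following is a minimal Python sketch of that counting step only. It is not OncoThreads code (the tool itself is implemented in JavaScript), and the function name, patient IDs, and state labels are hypothetical.

```python
from collections import Counter


def transition_counts(patient_states):
    """Count category-to-category flows between consecutive timepoints.

    patient_states maps a patient ID to an ordered list of categorical
    states, one per sampled timepoint. Patients may have different numbers
    of timepoints. Returns a Counter keyed by (timepoint_index, source, target).
    """
    flows = Counter()
    for states in patient_states.values():
        # Pair each state with its successor: (t0, t1), (t1, t2), ...
        for t, (src, dst) in enumerate(zip(states, states[1:])):
            flows[(t, src, dst)] += 1
    return flows


# Toy cohort (invented for illustration, not data from the study).
cohort = {
    "p1": ["wild-type", "mutant", "mutant"],
    "p2": ["wild-type", "wild-type"],
    "p3": ["wild-type", "mutant"],
}
for (t, src, dst), n in sorted(transition_counts(cohort).items()):
    print(f"timepoint {t} -> {t + 1}: {src} -> {dst}: {n} patient(s)")
```

Each count corresponds to the width of one Sankey link; a patient with a single timepoint contributes no links.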


Subject(s)
Biochemical Phenomena , Neoplasms , Genomics , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Software
8.
J Med Internet Res ; 23(3): e22219, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33600347

ABSTRACT

Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies. In addition, conventional statistical analyses cannot overcome the need for an understanding of the opportunities and limitations of EHR-derived studies. We distill here from the broader informatics literature six key considerations that are crucial for appraising studies utilizing EHR data: data completeness, data collection and handling (eg, transformation), data type (ie, codified, textual), robustness of methods against EHR variability (within and across institutions, countries, and time), transparency of data and analytic code, and the multidisciplinary approach. These considerations will inform researchers, clinicians, and other stakeholders as to the recommended best practices in reviewing manuscripts, grants, and other outputs from EHR-data derived studies, and thereby promote and foster rigor, quality, and reliability of this rapidly growing field.


Subject(s)
COVID-19/epidemiology , Data Collection/methods , Electronic Health Records , Data Collection/standards , Humans , Research Peer Review/standards , Publishing/standards , Reproducibility of Results , SARS-CoV-2/isolation & purification
9.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). 
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.
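
The random-effects synthesis described in the methods of this abstract can be illustrated with a short Python sketch. The abstract does not say which estimator was used, so the DerSimonian-Laird method shown here is an assumption, and the function name and toy inputs are hypothetical rather than taken from the study.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-site effect sizes with the DerSimonian-Laird estimator.

    effects: one effect size per health care system.
    variances: the sampling variance of each effect size.
    Returns (pooled_effect, standard_error, tau_squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures how much sites disagree beyond sampling error.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-site variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to every site's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Toy inputs (invented): two sites with equal precision but different effects.
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.01, 0.01])
print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f}")
```

When sites disagree more than their sampling variances explain, tau2 becomes positive and the pooled standard error widens, which is how the approach accounts for heterogeneity across health care systems.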


Subject(s)
COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
13.
Nature ; 512(7515): 449-52, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164756

ABSTRACT

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.


Subject(s)
Caenorhabditis elegans/cytology , Caenorhabditis elegans/genetics , Chromatin/genetics , Chromatin/metabolism , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Animals , Cell Line , Centromere/genetics , Centromere/metabolism , Chromatin/chemistry , Chromatin Assembly and Disassembly/genetics , DNA Replication/genetics , Genetic Enhancer Elements/genetics , Genetic Epigenesis , Heterochromatin/chemistry , Heterochromatin/genetics , Heterochromatin/metabolism , Histones/chemistry , Histones/metabolism , Humans , Molecular Sequence Annotation , Nuclear Lamina/metabolism , Nucleosomes/chemistry , Nucleosomes/genetics , Nucleosomes/metabolism , Genetic Promoter Regions/genetics , Species Specificity
14.
Bioinformatics ; 34(7): 1200-1207, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29186292

ABSTRACT

Motivation: The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. Results: We developed SATORI, an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. Availability and implementation: SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. Contact: nils@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Biological Ontologies , Computational Biology/methods , Metadata , Software , Animals , Humans , Internet , Semantics
15.
Bioinformatics ; 33(18): 2938-2940, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28645171

ABSTRACT

MOTIVATION: Venn and Euler diagrams are a popular yet inadequate solution for quantitative visualization of set intersections. A scalable alternative to Venn and Euler diagrams for visualizing intersecting sets and their properties is needed. RESULTS: We developed UpSetR, an open source R package that employs a scalable matrix-based visualization to show intersections of sets, their size, and other properties. AVAILABILITY AND IMPLEMENTATION: UpSetR is available at https://github.com/hms-dbmi/UpSetR/ and released under the MIT License. A Shiny app is available at https://gehlenborglab.shinyapps.io/upsetr/ . CONTACT: nils@hms.harvard.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
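
The matrix-based view described in this abstract is built on a table of set intersections and their sizes. As a rough illustration, the Python sketch below computes such a table, assuming exclusive intersections (each element is counted once, under the exact combination of sets that contain it). It does not use or reproduce the UpSetR API, which is an R package; the function name and toy sets are hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Count elements by the exact combination of named sets they belong to.

    sets maps a set name to a collection of elements. Returns a list of
    (frozenset_of_set_names, size) pairs, largest intersection first.
    """
    membership = {}
    for name, items in sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    counts = Counter(frozenset(names) for names in membership.values())
    # Sort by descending size, then by set names for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))


# Toy input (invented for illustration).
example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6}}
for combo, size in exclusive_intersections(example):
    print(" & ".join(sorted(combo)), size)
```

Each returned pair corresponds to one column of the matrix (which sets are filled in) and one bar (the size). The number of possible combinations grows as 2^n in the number of sets, but only non-empty ones are listed, which is one reason this layout scales better than overlapping circles.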


Subject(s)
Computational Biology/methods , Software , Genotyping Techniques/methods , DNA Sequence Analysis/methods
16.
BMC Bioinformatics ; 18(1): 406, 2017 Sep 12.
Article in English | MEDLINE | ID: mdl-28899361

ABSTRACT

BACKGROUND: With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. RESULTS: In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. CONCLUSIONS: Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.
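
The ambiguity problem raised in this abstract, where a record is placed in one discrete cluster although its assignment is unclear, can be made concrete with a small Python sketch. It assumes fuzzy membership scores are available per record and flags records whose best and second-best memberships are close. The margin rule, the threshold, and all names are assumptions for illustration only; this is not the scoring used in the paper or in Caleydo StratomeX.

```python
def ambiguous_records(memberships, min_margin=0.2):
    """Flag records whose top two cluster memberships are nearly tied.

    memberships maps a record ID to a dict of {cluster_name: membership},
    for example the output of a fuzzy clustering algorithm.
    Returns a list of (record_id, margin) for records below min_margin.
    """
    flagged = []
    for record, probs in memberships.items():
        ranked = sorted(probs.values(), reverse=True)
        runner_up = ranked[1] if len(ranked) > 1 else 0.0
        margin = ranked[0] - runner_up
        if margin < min_margin:
            flagged.append((record, round(margin, 3)))
    return flagged


# Toy memberships (invented for illustration).
scores = {
    "sample_1": {"cluster_a": 0.90, "cluster_b": 0.10},
    "sample_2": {"cluster_a": 0.55, "cluster_b": 0.45},
}
print(ambiguous_records(scores))
```

Records flagged this way are natural candidates for the manual review and reassignment that the abstract describes.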


Subject(s)
Algorithms , User-Computer Interface , Cluster Analysis , Genotype , Humans , Internet , Neoplasms/classification , Neoplasms/genetics , Neoplasms/pathology , Phenotype
17.
Proc Natl Acad Sci U S A ; 111(43): 15544-9, 2014 Oct 28.
Article in English | MEDLINE | ID: mdl-25313082

ABSTRACT

Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.


Subject(s)
Human Genome/genetics , Head and Neck Neoplasms/genetics , Head and Neck Neoplasms/virology , Host-Pathogen Interactions/genetics , Papillomaviridae/physiology , Base Sequence , DNA Methylation/genetics , Neoplastic Gene Expression Regulation , Neoplasm Genes , Humans , Molecular Sequence Data , Virus Integration/genetics
18.
Bioinformatics ; 29(8): 1089-91, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23419376

ABSTRACT

SUMMARY: We have developed Nozzle, an R package that provides an Application Programming Interface to generate HTML reports with dynamic user interface elements. Nozzle was designed to facilitate summarization and rapid browsing of complex results in data analysis pipelines where multiple analyses are performed frequently on big datasets. The package can be applied to any project where user-friendly reports need to be created. AVAILABILITY: The R package is available on CRAN at http://cran.r-project.org/package=Nozzle.R1. Examples and additional materials are available at http://gdac.broadinstitute.org/nozzle. The source code is also available at http://www.github.com/parklab/Nozzle. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Computational Biology/methods , Genomics , Humans , Neoplasms/genetics , Programming Languages , User-Computer Interface , Workflow
19.
medRxiv ; 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38585998

ABSTRACT

Over 30 international research studies and commercial laboratories are exploring the use of genomic sequencing to screen apparently healthy newborns for genetic disorders. These programs have individualized processes for determining which genes and genetic disorders are queried and reported in newborns. We compared lists of genes from 26 research and commercial newborn screening programs and found substantial heterogeneity among the genes included. A total of 1,750 genes were included in at least one newborn genome sequencing program, but only 74 genes were included on >80% of gene lists, 16 of which are not associated with conditions on the Recommended Uniform Screening Panel. We used a linear regression model to explore factors related to the inclusion of individual genes across programs, finding that a high evidence base as well as treatment efficacy were two of the most important factors for inclusion. We applied a machine learning model to predict how suitable a gene is for newborn sequencing. As knowledge about and treatments for genetic disorders expand, this model provides a dynamic tool to reassess genes for newborn screening implementation. This study highlights the complex landscape of gene list curation among genomic newborn screening programs and proposes an empirical path forward for determining the genes and disorders of highest priority for newborn screening programs.
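
The gene-list comparison in this abstract comes down to counting, for each gene, the share of programs that include it. The Python sketch below shows that tally under simple assumptions: each program contributes one list, duplicates within a list count once, and "more than 80%" is a strict inequality. The function name and placeholder gene names are hypothetical and not drawn from the study.

```python
from collections import Counter


def summarize_gene_lists(gene_lists, threshold=0.8):
    """Tally gene inclusion across screening programs.

    gene_lists: one iterable of gene symbols per program.
    Returns (number_of_genes_on_any_list, genes_on_more_than_threshold_of_lists).
    """
    n_programs = len(gene_lists)
    # set(genes) ensures a gene repeated within one list is counted once.
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    frequent = sorted(g for g, c in counts.items() if c / n_programs > threshold)
    return len(counts), frequent


# Toy lists with placeholder gene names (invented for illustration).
programs = [
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_B"],
    ["GENE_A"],
]
total, frequent = summarize_gene_lists(programs)
print(total, frequent)
```

In this toy input GENE_B sits on exactly 80% of lists and is therefore excluded by the strict threshold, which shows how sensitive such a summary is to where the cutoff is drawn.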

20.
Nat Commun ; 15(1): 433, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38199997

ABSTRACT

There is a need to define regions of gene activation or repression that control human kidney cells in states of health, injury, and repair to understand the molecular pathogenesis of kidney disease and design therapeutic strategies. Comprehensive integration of gene expression with epigenetic features that define regulatory elements remains a significant challenge. We measure dual single nucleus RNA expression and chromatin accessibility, DNA methylation, and H3K27ac, H3K4me1, H3K4me3, and H3K27me3 histone modifications to decipher the chromatin landscape and gene regulation of the kidney in reference and adaptive injury states. We establish a spatially-anchored epigenomic atlas to define the kidney's active, silent, and regulatory accessible chromatin regions across the genome. Using this atlas, we note distinct control of adaptive injury in different epithelial cell types. A proximal tubule cell transcription factor network of ELF3, KLF6, and KLF10 regulates the transition between health and injury, while in thick ascending limb cells this transition is regulated by NR2F1. Further, combined perturbation of ELF3, KLF6, and KLF10 distinguishes two adaptive proximal tubular cell subtypes, one of which manifested a repair trajectory after knockout. This atlas will serve as a foundation to facilitate targeted cell-specific therapeutics by reprogramming gene regulatory networks.


Subject(s)
Chromatin , Kidney , Humans , Chromatin/genetics , Proximal Kidney Tubules , Health Status , Cell Count