Results 1 - 20 of 77
1.
Cell ; 153(4): 919-29, 2013 May 09.
Article in English | MEDLINE | ID: mdl-23663786

ABSTRACT

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ~20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.


Subjects
Algorithms, Human Genome, Mutation, Neoplasms/genetics, Chromosome Aberrations, Genome-Wide Association Study, Glioblastoma/genetics, Humans, Neoplasms/pathology
2.
Nat Methods ; 20(8): 1174-1178, 2023 08.
Article in English | MEDLINE | ID: mdl-37468619

ABSTRACT

Multiplexed antibody-based imaging enables the detailed characterization of molecular and cellular organization in tissues. Advances in the field now allow high-parameter data collection (>60 targets); however, considerable expertise and capital are needed to construct the antibody panels employed by these methods. Organ mapping antibody panels are community-validated resources that save time and money, increase reproducibility, accelerate discovery and support the construction of a Human Reference Atlas.


Subjects
Antibodies, Community Resources, Humans, Reproducibility of Results, Diagnostic Imaging
3.
Nucleic Acids Res ; 52(D1): D61-D66, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971305

ABSTRACT

The Cistrome Data Browser is a resource of ChIP-seq, ATAC-seq and DNase-seq data from humans and mice. It provides maps of the genome-wide locations of transcription factors, cofactors, chromatin remodelers, histone post-translational modifications and regions of chromatin accessible to endonuclease activity. Cistrome DB v3.0 contains approximately 45 000 human and 44 000 mouse samples with about 32 000 newly collected datasets compared to the previous release. The Cistrome DB v3.0 user interface is implemented as a single page application that unifies menu driven and data driven search functions and provides an embedded genome browser, which allows users to find and visualize data more effectively. Users can find informative chromatin profiles through keyword, menu, and data-driven search tools. Browser search functions can predict the regulators of query genes as well as the cell type and factor dependent functionality of potential cis-regulatory elements. Cistrome DB v3.0 expands the display of quality control statistics, incorporates sequence logos into motif enrichment displays and includes more expansive sample metadata. Cistrome DB v3.0 is available at http://db3.cistrome.org/browser.


Subjects
Chromatin, Protein Databases, Genomics, Software, Animals, Humans, Mice, Chromatin/genetics, Histones/genetics, Histones/metabolism, DNA Sequence Analysis, Transcription Factors/genetics, Transcription Factors/metabolism, Data Visualization, Internet, Genomics/methods
4.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36688709

ABSTRACT

SUMMARY: Gos is a declarative Python library designed to create interactive multiscale visualizations of genomics and epigenomics data. It provides a consistent and simple interface to the flexible Gosling visualization grammar. Gos hides technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebooks environments to enable new interactive analysis workflows. AVAILABILITY AND IMPLEMENTATION: Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos.


Subjects
Computational Biology, Geese, Animals, Genomics, Genome, Gene Library, Software
5.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36688700

ABSTRACT

SUMMARY: The regulation of genes by cis-regulatory elements (CREs) is complex and differs between cell types. Visual analysis of large collections of chromatin profiles across diverse cell types, integrated with computational methods, can reveal meaningful biological insights. We developed Cistrome Explorer, a web-based interactive visual analytics tool for exploring thousands of chromatin profiles in diverse cell types. Integrated with the Cistrome Data Browser database which contains thousands of ChIP-seq, DNase-seq and ATAC-seq samples, Cistrome Explorer enables the discovery of patterns of CREs across cell types and the identification of transcription factor binding underlying these patterns. AVAILABILITY AND IMPLEMENTATION: Cistrome Explorer and its source code are available at http://cisvis.gehlenborglab.org/ and released under the MIT License. Documentation can be accessed via http://cisvis.gehlenborglab.org/docs/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin, Epigenomics, DNA Sequence Analysis, Chromatin Immunoprecipitation Sequencing, Software, Genetic Databases
6.
Bioinformatics ; 37(Suppl_1): i59-i66, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252935

ABSTRACT

MOTIVATION: Molecular profiling of patient tumors and liquid biopsies over time with next-generation sequencing technologies and new immuno-profile assays is becoming part of standard research and clinical practice. With the wealth of new longitudinal data, there is a critical need for visualizations for cancer researchers to explore and interpret temporal patterns not just in a single patient but across cohorts. RESULTS: To address this need we developed OncoThreads, a tool for the visualization of longitudinal clinical and cancer genomics and other molecular data in patient cohorts. The tool visualizes patient cohorts as temporal heatmaps and Sankey diagrams that support the interactive exploration and ranking of a wide range of clinical and molecular features. This allows analysts to discover temporal patterns in longitudinal data, such as the impact of mutations on response to a treatment, for example, emergence of resistant clones. We demonstrate the functionality of OncoThreads using a cohort of 23 glioma patients sampled at 2-4 timepoints. AVAILABILITY AND IMPLEMENTATION: Freely available at http://oncothreads.gehlenborglab.org. Implemented in JavaScript using the cBioPortal web API as a backend. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Biochemical Phenomena, Neoplasms, Genomics, High-Throughput Nucleotide Sequencing, Humans, Neoplasms/genetics, Software
8.
J Med Internet Res ; 23(3): e22219, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33600347

ABSTRACT

Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies. In addition, conventional statistical analyses cannot overcome the need for an understanding of the opportunities and limitations of EHR-derived studies. We distill here from the broader informatics literature six key considerations that are crucial for appraising studies utilizing EHR data: data completeness, data collection and handling (eg, transformation), data type (ie, codified, textual), robustness of methods against EHR variability (within and across institutions, countries, and time), transparency of data and analytic code, and the multidisciplinary approach. These considerations will inform researchers, clinicians, and other stakeholders as to the recommended best practices in reviewing manuscripts, grants, and other outputs from EHR-data derived studies, and thereby promote and foster rigor, quality, and reliability of this rapidly growing field.


Subjects
COVID-19/epidemiology, Data Collection/methods, Electronic Health Records, Data Collection/standards, Humans, Research Peer Review/standards, Publishing/standards, Reproducibility of Results, SARS-CoV-2/isolation & purification
9.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). 
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.
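
The pooling step described in the methods above (per-system effect sizes combined by random-effects meta-analysis) can be sketched as follows. The abstract does not say which estimator was used, so the DerSimonian-Laird estimator here is an assumption, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-site effect sizes (assumed DerSimonian-Laird estimator).

    effects   -- one effect size per health care system (hypothetical input)
    variances -- the within-site sampling variance of each effect size
    Returns (pooled effect, standard error, between-site variance tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of between-site variance, truncated at 0.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to every within-site variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Toy numbers for illustration only, not data from the study.
pooled, se, tau2 = random_effects_pool([-0.12, -0.08, -0.10], [0.0004, 0.0009, 0.0006])
print(pooled, se, tau2)
```

Because each site contributes only an effect size and its variance, this kind of pooling fits a federated setup in which patient-level data stay at the contributing system.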


Subjects
COVID-19, Pandemics, Adult, Aged, Female, Hospitalization, Hospitals, Humans, Male, Middle Aged, Retrospective Studies, SARS-CoV-2
13.
Nature ; 512(7515): 449-52, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164756

ABSTRACT

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.


Subjects
Caenorhabditis elegans/cytology, Caenorhabditis elegans/genetics, Chromatin/genetics, Chromatin/metabolism, Drosophila melanogaster/cytology, Drosophila melanogaster/genetics, Animals, Cell Line, Centromere/genetics, Centromere/metabolism, Chromatin/chemistry, Chromatin Assembly and Disassembly/genetics, DNA Replication/genetics, Genetic Enhancer Elements/genetics, Genetic Epigenesis, Heterochromatin/chemistry, Heterochromatin/genetics, Heterochromatin/metabolism, Histones/chemistry, Histones/metabolism, Humans, Molecular Sequence Annotation, Nuclear Lamina/metabolism, Nucleosomes/chemistry, Nucleosomes/genetics, Nucleosomes/metabolism, Genetic Promoter Regions/genetics, Species Specificity
14.
Bioinformatics ; 34(7): 1200-1207, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29186292

ABSTRACT

Motivation: The ever-increasing number of biomedical datasets provides tremendous opportunities for re-use but current data repositories provide limited means of exploration apart from text-based search. Ontological metadata annotations provide context by semantically relating datasets. Visualizing this rich network of relationships can improve the explorability of large data repositories and help researchers find datasets of interest. Results: We developed SATORI, an integrative search and visual exploration interface for the exploration of biomedical data repositories. The design is informed by a requirements analysis through a series of semi-structured interviews. We evaluated the implementation of SATORI in a field study on a real-world data collection. SATORI enables researchers to seamlessly search, browse and semantically query data repositories via two visualizations that are highly interconnected with a powerful search interface. Availability and implementation: SATORI is an open-source web application, which is freely available at http://satori.refinery-platform.org and integrated into the Refinery Platform. Contact: nils@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Biological Ontologies, Computational Biology/methods, Metadata, Software, Animals, Humans, Internet, Semantics
15.
Bioinformatics ; 33(18): 2938-2940, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28645171

ABSTRACT

MOTIVATION: Venn and Euler diagrams are a popular yet inadequate solution for quantitative visualization of set intersections. A scalable alternative to Venn and Euler diagrams for visualizing intersecting sets and their properties is needed. RESULTS: We developed UpSetR, an open source R package that employs a scalable matrix-based visualization to show intersections of sets, their size, and other properties. AVAILABILITY AND IMPLEMENTATION: UpSetR is available at https://github.com/hms-dbmi/UpSetR/ and released under the MIT License. A Shiny app is available at https://gehlenborglab.shinyapps.io/upsetr/ . CONTACT: nils@hms.harvard.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
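
The quantity a matrix-based set visualization displays, the size of each exact combination of sets, can be illustrated with a short sketch. This is written in Python for illustration only and is not UpSetR's code or API; the function name and the toy sets are hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Count elements by the exact combination of named sets they belong to.

    sets -- dict mapping a set name to a Python set of elements.
    Returns a Counter keyed by frozenset of set names. Each key corresponds
    to one column of a matrix-style intersection plot, and its count to the
    height of the bar drawn above that column.
    """
    counts = Counter()
    universe = set().union(*sets.values())
    for item in universe:
        # The combination of sets that contain this item, and no others.
        combo = frozenset(name for name, members in sets.items() if item in members)
        counts[combo] += 1
    return counts


# Toy sets for illustration only.
example = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
for combo, size in sorted(exclusive_intersections(example).items(),
                          key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(combo), size)
```

Each element is counted in exactly one combination, so the bar heights sum to the size of the union. A Venn or Euler diagram has to encode the same numbers as overlapping areas, which becomes hard to read as the number of sets grows.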


Subjects
Computational Biology/methods, Software, Genotyping Techniques/methods, DNA Sequence Analysis/methods
16.
BMC Bioinformatics ; 18(1): 406, 2017 Sep 12.
Article in English | MEDLINE | ID: mdl-28899361

ABSTRACT

BACKGROUND: With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. RESULTS: In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. CONCLUSIONS: Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.


Subjects
Algorithms, User-Computer Interface, Cluster Analysis, Genotype, Humans, Internet, Neoplasms/classification, Neoplasms/genetics, Neoplasms/pathology, Phenotype
17.
Proc Natl Acad Sci U S A ; 111(43): 15544-9, 2014 Oct 28.
Article in English | MEDLINE | ID: mdl-25313082

ABSTRACT

Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.


Subjects
Human Genome/genetics, Head and Neck Neoplasms/genetics, Head and Neck Neoplasms/virology, Host-Pathogen Interactions/genetics, Papillomaviridae/physiology, Base Sequence, DNA Methylation/genetics, Neoplastic Gene Expression Regulation, Neoplasm Genes, Humans, Molecular Sequence Data, Virus Integration/genetics
18.
Bioinformatics ; 29(8): 1089-91, 2013 Apr 15.
Article in English | MEDLINE | ID: mdl-23419376

ABSTRACT

SUMMARY: We have developed Nozzle, an R package that provides an Application Programming Interface to generate HTML reports with dynamic user interface elements. Nozzle was designed to facilitate summarization and rapid browsing of complex results in data analysis pipelines where multiple analyses are performed frequently on big datasets. The package can be applied to any project where user-friendly reports need to be created. AVAILABILITY: The R package is available on CRAN at http://cran.r-project.org/package=Nozzle.R1. Examples and additional materials are available at http://gdac.broadinstitute.org/nozzle. The source code is also available at http://www.github.com/parklab/Nozzle. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software, Computational Biology/methods, Genomics, Humans, Neoplasms/genetics, Programming Languages, User-Computer Interface, Workflow
19.
medRxiv ; 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38585998

ABSTRACT

Over 30 international research studies and commercial laboratories are exploring the use of genomic sequencing to screen apparently healthy newborns for genetic disorders. These programs have individualized processes for determining which genes and genetic disorders are queried and reported in newborns. We compared lists of genes from 26 research and commercial newborn screening programs and found substantial heterogeneity among the genes included. A total of 1,750 genes were included in at least one newborn genome sequencing program, but only 74 genes were included on >80% of gene lists, 16 of which are not associated with conditions on the Recommended Uniform Screening Panel. We used a linear regression model to explore factors related to the inclusion of individual genes across programs, finding that a high evidence base as well as treatment efficacy were two of the most important factors for inclusion. We applied a machine learning model to predict how suitable a gene is for newborn sequencing. As knowledge about and treatments for genetic disorders expand, this model provides a dynamic tool to reassess genes for newborn screening implementation. This study highlights the complex landscape of gene list curation among genomic newborn screening programs and proposes an empirical path forward for determining the genes and disorders of highest priority for newborn screening programs.
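
The comparison behind a figure such as "included on >80% of gene lists" amounts to counting, for each gene, the fraction of program lists that contain it. A minimal sketch follows, assuming each program's list is a plain collection of gene symbols; the function name and the placeholder symbols are hypothetical and do not come from the study.

```python
from collections import Counter


def genes_above_threshold(gene_lists, threshold=0.8):
    """Return genes present on more than `threshold` of the gene lists.

    gene_lists -- one iterable of gene symbols per screening program.
    A gene is counted at most once per list, even if a list repeats it.
    """
    n_lists = len(gene_lists)
    counts = Counter(gene for genes in gene_lists for gene in set(genes))
    return sorted(g for g, c in counts.items() if c / n_lists > threshold)


# Placeholder symbols for illustration only, not real program content.
lists = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_B"],
    ["GENE_A"],
]
print(genes_above_threshold(lists))
```

The strict inequality mirrors the ">80%" wording, so a gene found on exactly 80% of the lists is excluded.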

20.
Nat Commun ; 15(1): 433, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38199997

ABSTRACT

There is a need to define regions of gene activation or repression that control human kidney cells in states of health, injury, and repair to understand the molecular pathogenesis of kidney disease and design therapeutic strategies. Comprehensive integration of gene expression with epigenetic features that define regulatory elements remains a significant challenge. We measure dual single nucleus RNA expression and chromatin accessibility, DNA methylation, and H3K27ac, H3K4me1, H3K4me3, and H3K27me3 histone modifications to decipher the chromatin landscape and gene regulation of the kidney in reference and adaptive injury states. We establish a spatially-anchored epigenomic atlas to define the kidney's active, silent, and regulatory accessible chromatin regions across the genome. Using this atlas, we note distinct control of adaptive injury in different epithelial cell types. A proximal tubule cell transcription factor network of ELF3, KLF6, and KLF10 regulates the transition between health and injury, while in thick ascending limb cells this transition is regulated by NR2F1. Further, combined perturbation of ELF3, KLF6, and KLF10 distinguishes two adaptive proximal tubular cell subtypes, one of which manifested a repair trajectory after knockout. This atlas will serve as a foundation to facilitate targeted cell-specific therapeutics by reprogramming gene regulatory networks.


Subjects
Chromatin, Kidney, Humans, Chromatin/genetics, Proximal Kidney Tubules, Health Status, Cell Count