Results 1 - 20 of 173
1.
Cell ; 184(26): 6281-6298.e23, 2021 12 22.
Article in English | MEDLINE | ID: mdl-34875227

ABSTRACT

While intestinal Th17 cells are critical for maintaining tissue homeostasis, recent studies have implicated their roles in the development of extra-intestinal autoimmune diseases including multiple sclerosis. However, the mechanisms by which tissue Th17 cells mediate these dichotomous functions remain unknown. Here, we characterized the heterogeneity, plasticity, and migratory phenotypes of tissue Th17 cells in vivo by combined fate mapping with profiling of the transcriptomes and TCR clonotypes of over 84,000 Th17 cells at homeostasis and during CNS autoimmune inflammation. Inter- and intra-organ single-cell analyses revealed a homeostatic, stem-like TCF1+ IL-17+ SLAMF6+ population that traffics to the intestine where it is maintained by the microbiota, providing a ready reservoir for the IL-23-driven generation of encephalitogenic GM-CSF+ IFN-γ+ CXCR6+ T cells. Our study defines a direct in vivo relationship between IL-17+ non-pathogenic and GM-CSF+ and IFN-γ+ pathogenic Th17 populations and provides a mechanism by which homeostatic intestinal Th17 cells direct extra-intestinal autoimmune disease.


Subjects
Autoimmunity , Intestines/immunology , Stem Cells/metabolism , Th17 Cells/immunology , Animals , Cell Movement , Clone Cells , Encephalomyelitis, Autoimmune, Experimental/immunology , Granulocyte-Macrophage Colony-Stimulating Factor/metabolism , Homeostasis , Humans , Interferon-gamma/metabolism , Interleukin-17/metabolism , Mice, Inbred C57BL , Organ Specificity , RNA/metabolism , RNA-Seq , Receptors, Antigen, T-Cell/metabolism , Receptors, CXCR6/metabolism , Receptors, Interleukin/metabolism , Reproducibility of Results , Signaling Lymphocytic Activation Molecule Family/metabolism , Single-Cell Analysis , Spleen/metabolism
2.
Cell ; 182(6): 1474-1489.e23, 2020 09 17.
Article in English | MEDLINE | ID: mdl-32841603

ABSTRACT

Widespread changes to DNA methylation and chromatin are well documented in cancer, but the fate of higher-order chromosomal structure remains obscure. Here we integrated topological maps for colon tumors and normal colons with epigenetic, transcriptional, and imaging data to characterize alterations to chromatin loops, topologically associated domains, and large-scale compartments. We found that spatial partitioning of the open and closed genome compartments is profoundly compromised in tumors. This reorganization is accompanied by compartment-specific hypomethylation and chromatin changes. Additionally, we identify a compartment at the interface between the canonical A and B compartments that is reorganized in tumors. Remarkably, similar shifts were evident in non-malignant cells that have accumulated excess divisions. Our analyses suggest that these topological changes repress stemness and invasion programs while inducing anti-tumor immunity genes and may therefore restrain malignant progression. Our findings call into question the conventional view that tumor-associated epigenomic alterations are primarily oncogenic.


Subjects
Chromatin/metabolism , Chromosomes/metabolism , Colorectal Neoplasms/genetics , Colorectal Neoplasms/metabolism , DNA Methylation , Epigenesis, Genetic , Gene Expression Regulation, Neoplastic/genetics , Cell Division , Cellular Senescence/genetics , Chromatin Immunoprecipitation Sequencing , Chromosomes/genetics , Cohort Studies , Colorectal Neoplasms/mortality , Colorectal Neoplasms/pathology , Computational Biology , DNA Methylation/genetics , Epigenomics , HCT116 Cells , Humans , In Situ Hybridization, Fluorescence , Microscopy, Electron, Transmission , Molecular Dynamics Simulation , RNA-Seq , Spatial Analysis , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism
3.
Mol Cell ; 83(15): 2753-2767.e10, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37478846

ABSTRACT

Nuclear hormone receptors (NRs) are ligand-binding transcription factors that are widely targeted therapeutically. Agonist binding triggers NR activation and subsequent degradation by unknown ligand-dependent ubiquitin ligase machinery. NR degradation is critical for therapeutic efficacy in malignancies that are driven by retinoic acid and estrogen receptors. Here, we demonstrate the ubiquitin ligase UBR5 drives degradation of multiple agonist-bound NRs, including the retinoic acid receptor alpha (RARA), retinoid X receptor alpha (RXRA), glucocorticoid, estrogen, liver-X, progesterone, and vitamin D receptors. We present the high-resolution cryo-EM structure of full-length human UBR5 and a negative stain model representing its interaction with RARA/RXRA. Agonist ligands induce sequential, mutually exclusive recruitment of nuclear coactivators (NCOAs) and UBR5 to chromatin to regulate transcriptional networks. Other pharmacological ligands such as selective estrogen receptor degraders (SERDs) degrade their receptors through differential recruitment of UBR5 or RNF111. We establish the UBR5 transcriptional regulatory hub as a common mediator and regulator of NR-induced transcription.


Subjects
Chromatin , Transcription Factors , Humans , Ligands , Chromatin/genetics , Transcription Factors/metabolism , Receptors, Cytoplasmic and Nuclear/genetics , Ubiquitins , Ubiquitin-Protein Ligases/genetics
4.
Nat Methods ; 20(8): 1196-1202, 2023 08.
Article in English | MEDLINE | ID: mdl-37429993

ABSTRACT

Unsupervised clustering of single-cell RNA-sequencing data enables the identification of distinct cell populations. However, the most widely used clustering algorithms are heuristic and do not formally account for statistical uncertainty. We find that not addressing known sources of variability in a statistically rigorous manner can lead to overconfidence in the discovery of novel cell types. Here we extend a previous method, significance of hierarchical clustering, to propose a model-based hypothesis testing approach that incorporates significance analysis into the clustering algorithm and permits statistical evaluation of clusters as distinct cell populations. We also adapt this approach to permit statistical assessment on the clusters reported by any algorithm. Finally, we extend these approaches to account for batch structure. We benchmarked our approach against popular clustering workflows, demonstrating improved performance. To show practical utility, we applied our approach to the Human Lung Cell Atlas and an atlas of the mouse cerebellar cortex, identifying several cases of over-clustering and recapitulating experimentally validated cell type definitions.


Subjects
Algorithms , Benchmarking , Humans , Animals , Mice , Cluster Analysis , RNA , Single-Cell Analysis/methods , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods
5.
Proc Natl Acad Sci U S A ; 120(1): e2206751120, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36574667

ABSTRACT

Although antibodies targeting specific tumor-expressed antigens are the standard of care for some cancers, the identification of cancer-specific targets amenable to antibody binding has remained a bottleneck in development of new therapeutics. To overcome this challenge, we developed a high-throughput platform that allows for the unbiased, simultaneous discovery of antibodies and targets based on phenotypic binding profiles. Applying this platform to ovarian cancer, we identified a wide diversity of cancer targets including receptor tyrosine kinases, adhesion and migration proteins, proteases and proteins regulating angiogenesis in a single round of screening using genomics, flow cytometry, and mass spectrometry. In particular, we identified BCAM as a promising candidate for targeted therapy in high-grade serous ovarian cancers. More generally, this approach provides a rapid and flexible framework to identify cancer targets and antibodies.


Subjects
Ovarian Neoplasms , Peptide Library , Humans , Female , Cell Line, Tumor , Antibodies , Ovarian Neoplasms/genetics , Antigens, Neoplasm
6.
Nat Methods ; 19(9): 1076-1087, 2022 09.
Article in English | MEDLINE | ID: mdl-36050488

ABSTRACT

A central problem in spatial transcriptomics is detecting differentially expressed (DE) genes within cell types across tissue context. Challenges to learning DE include changing cell type composition across space and measurement pixels detecting transcripts from multiple cell types. Here, we introduce a statistical method, cell type-specific inference of differential expression (C-SIDE), that identifies cell type-specific DE in spatial transcriptomics, accounting for localization of other cell types. We model gene expression as an additive mixture across cell types of log-linear cell type-specific expression functions. C-SIDE's framework applies to many contexts: DE due to pathology, anatomical regions, cell-to-cell interactions and cellular microenvironment. Furthermore, C-SIDE enables statistical inference across multiple replicates. Simulations and validation experiments on Slide-seq, MERFISH and Visium datasets demonstrate that C-SIDE accurately identifies DE with valid uncertainty quantification. Last, we apply C-SIDE to identify plaque-dependent immune activity in Alzheimer's disease and cellular interactions between tumor and immune cells. We distribute C-SIDE within the R package https://github.com/dmcable/spacexr.
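
The modeling sentence in this abstract (expression as an additive mixture across cell types of log-linear cell type-specific expression functions) can be written schematically as follows. The notation is an assumption made for illustration and is not taken from the abstract: $Y_{ij}$ is the observed count of gene $j$ at pixel $i$, $N_i$ is a per-pixel total count, $\beta_{ik}$ is the proportion of cell type $k$ at pixel $i$, $x_i$ is a vector of covariates (for example, proximity to a pathology), and $\alpha_{kj}$ holds the cell type-specific coefficients. The Poisson observation model is likewise an assumption.

```latex
\begin{aligned}
Y_{ij} \mid \lambda_{ij} &\sim \operatorname{Poisson}\!\left(N_i \lambda_{ij}\right), \\
\lambda_{ij} &= \sum_{k=1}^{K} \beta_{ik}\, \mu_{ikj}, \\
\log \mu_{ikj} &= x_i^{\top} \alpha_{kj}.
\end{aligned}
```

Under this reading, cell type-specific DE for gene $j$ in cell type $k$ corresponds to a nonzero covariate coefficient in $\alpha_{kj}$, while the mixture weights $\beta_{ik}$ account for the other cell types present at the same pixel.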


Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods
7.
Biostatistics ; 24(4): 901-921, 2023 10 18.
Article in English | MEDLINE | ID: mdl-35277956

ABSTRACT

Pharmacogenomic experiments allow for the systematic testing of drugs, at varying dosage concentrations, to study how genomic markers correlate with cell sensitivity to treatment. The first step in the analysis is to quantify the response of cell lines to variable dosage concentrations of the drugs being tested. The signal-to-noise ratio in these measurements can be low due to biological and experimental variability. However, the increasing availability of pharmacogenomic studies provides replicated data sets that can be leveraged to gain power. To do this, we formulate a hierarchical mixture model to estimate the drug-specific mixture distributions for estimating cell sensitivity and for assessing drug effect type as either broad or targeted effect. We use this formulation to propose a unified approach that can yield posterior probability of a cell being susceptible to a drug conditional on being a targeted effect or relative effect sizes conditioned on the cell being broad. We demonstrate the usefulness of our approach via case studies. First, we assess pairwise agreements for cell lines/drugs within the intersection of two data sets and confirm the moderate pairwise agreement between many publicly available pharmacogenomic data sets. We then present an analysis that identifies sensitivity to the drug crizotinib for cells harboring EML4-ALK or NPM1-ALK gene fusions, as well as significantly down-regulated cell-matrix pathways associated with crizotinib sensitivity.


Subjects
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Crizotinib/therapeutic use , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/genetics , Pharmacogenetics , Models, Statistical , Receptor Protein-Tyrosine Kinases/genetics , Receptor Protein-Tyrosine Kinases/therapeutic use
8.
Biostatistics ; 23(4): 1150-1164, 2022 10 14.
Article in English | MEDLINE | ID: mdl-35770795

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) quantifies gene expression for individual cells in a sample, which allows distinct cell-type populations to be identified and characterized. An important step in many scRNA-seq analysis pipelines is the annotation of cells into known cell types. While this can be achieved using experimental techniques, such as fluorescence-activated cell sorting, these approaches are impractical for large numbers of cells. This motivates the development of data-driven cell-type annotation methods. We find limitations with current approaches due to the reliance on known marker genes or from overfitting because of systematic differences, or batch effects, between studies. Here, we present a statistical approach that leverages public data sets to combine information across thousands of genes, uses a latent variable model to define cell-type-specific barcodes and account for batch effect variation, and probabilistically annotates cell-type identity from a reference of known cell types. The barcoding approach also provides a new way to discover marker genes. Using a range of data sets, including those generated to represent imperfect real-world reference data, we demonstrate that our approach substantially outperforms current reference-based methods, particularly when predicting across studies.


Subjects
Gene Expression Profiling , Single-Cell Analysis , Gene Expression , Gene Expression Profiling/methods , Humans , RNA-Seq , Sequence Analysis, RNA/methods , Software
9.
Biostatistics ; 24(1): 1-16, 2022 12 12.
Article in English | MEDLINE | ID: mdl-34467372

ABSTRACT

High-dimensional biological data collection across heterogeneous groups of samples has become increasingly common, creating high demand for dimensionality reduction techniques that capture underlying structure of the data. Discovering low-dimensional embeddings that describe the separation of any underlying discrete latent structure in data is an important motivation for applying these techniques since these latent classes can represent important sources of unwanted variability, such as batch effects, or interesting sources of signal such as unknown cell types. The features that define this discrete latent structure are often hard to identify in high-dimensional data. Principal component analysis (PCA) is one of the most widely used methods as an unsupervised step for dimensionality reduction. This reduction technique finds linear transformations of the data which explain total variance. When the goal is detecting discrete structure, PCA is applied with the assumption that classes will be separated in directions of maximum variance. However, PCA will fail to accurately find discrete latent structure if this assumption does not hold. Visualization techniques, such as t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP), attempt to mitigate these problems with PCA by creating a low-dimensional space where similar objects are modeled by nearby points in the low-dimensional embedding and dissimilar objects are modeled by distant points with high probability. However, since t-SNE and UMAP are computationally expensive, often a PCA reduction is done before applying them, which makes them sensitive to PCA's downfalls. Also, t-SNE is limited to only two or three dimensions as a visualization tool, which may not be adequate for retaining discriminatory information. The linear transformations of PCA are preferable to non-linear transformations provided by methods like t-SNE and UMAP for interpretable feature weights. Here, we propose iterative discriminant analysis (iDA), a dimensionality reduction technique designed to mitigate these limitations. iDA produces an embedding that carries discriminatory information which optimally separates latent clusters using linear transformations that permit post hoc analysis to determine features that define these latent structures.
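
The failure mode described in this abstract, PCA missing discrete structure when the classes are not separated along the direction of maximum variance, can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration (the simulated data, the variable names, and the separation score), and it does not implement iDA.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_class = 250

# Two latent classes that differ only along a low-variance feature.
labels = np.repeat([0, 1], n_per_class)
noise_axis = rng.normal(0.0, 10.0, size=labels.size)  # high variance, no class signal
signal_axis = np.where(labels == 0, -1.0, 1.0) + rng.normal(0.0, 0.1, size=labels.size)
X = np.column_stack([noise_axis, signal_axis])

# PCA via SVD of the centered data matrix; rows of Vt are the principal axes.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T


def separation(z, y):
    """Absolute difference in class means divided by the pooled within-class SD."""
    a, b = z[y == 0], z[y == 1]
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return abs(a.mean() - b.mean()) / pooled_sd


sep_pc1 = separation(scores[:, 0], labels)
sep_pc2 = separation(scores[:, 1], labels)
print(f"PC1 axis: {Vt[0]}, class separation: {sep_pc1:.2f}")
print(f"PC2 axis: {Vt[1]}, class separation: {sep_pc2:.2f}")
```

In this toy setting the first principal component follows the high-variance noise feature, so the class structure only appears on the second component. A pipeline that keeps just the top component would discard it.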


Subjects
Algorithms , Humans , Principal Component Analysis
10.
Proc Natl Acad Sci U S A ; 117(51): 32772-32778, 2020 12 22.
Article in English | MEDLINE | ID: mdl-33293417

ABSTRACT

Population displacement may occur after natural disasters, permanently altering the demographic composition of the affected regions. Measuring this displacement is vital for both optimal postdisaster resource allocation and calculation of measures of public health interest such as mortality estimates. Here, we analyzed data generated by mobile phones and social media to estimate the weekly island-wide population at risk and within-island geographic heterogeneity of migration in Puerto Rico after Hurricane Maria. We compared these two data sources with population estimates derived from air travel records and census data. We observed a loss of population across all data sources throughout the study period; however, the magnitude and dynamics differ by the data source. Census data predict a population loss of just over 129,000 from July 2017 to July 2018, a 4% decrease; air travel data predict a population loss of 168,295 for the same period, a 5% decrease; mobile phone-based estimates predict a loss of 235,375 from July 2017 to May 2018, an 8% decrease; and social media-based estimates predict a loss of 476,779 from August 2017 to August 2018, a 17% decrease. On average, municipalities with a smaller population size lost a bigger proportion of their population. Moreover, we infer that these municipalities experienced greater infrastructure damage as measured by the proportion of unknown locations stemming from these regions. Finally, our analysis measures a general shift of population from rural to urban centers within the island. Passively collected data provide a promising supplement to current at-risk population estimation procedures; however, each data source has its own biases and limitations.

11.
BMC Genomics ; 23(1): 439, 2022 Jun 13.
Article in English | MEDLINE | ID: mdl-35698050

ABSTRACT

We introduce mirTarRnaSeq, an R/Bioconductor package for quantitative assessment of miRNA-mRNA relationships within sample cohorts. mirTarRnaSeq is a statistical package to explore predicted or pre-hypothesized miRNA-mRNA relationships following target prediction. We present two use cases applying mirTarRnaSeq. First, to identify miRNA targets, we examined EBV miRNAs for interaction with human and virus transcriptomes of stomach adenocarcinoma. This revealed enrichment of mRNA targets highly expressed in CD105+ endothelial cells, monocytes, CD4+ T cells, NK cells, CD19+ B cells, and CD34 cells. Next, to investigate miRNA-mRNA relationships in SARS-CoV-2 (COVID-19) infection across time, we used paired miRNA and RNA sequenced datasets of SARS-CoV-2 infected lung epithelial cells across three time points (4, 12, and 24 hours post-infection). mirTarRnaSeq identified evidence for human miRNAs targeting cytokine signaling and neutrophil regulation immune pathways from 4 to 24 hours after SARS-CoV-2 infection. Confirming the clinical relevance of these predictions, three of the immune specific mRNA-miRNA relationships identified in human lung epithelial cells after SARS-CoV-2 infection were also observed to be differentially expressed in blood from patients with COVID-19. Overall, mirTarRnaSeq is a robust tool that can address a wide range of biological questions, providing improved prediction of miRNA-mRNA interactions.


Subjects
COVID-19 , MicroRNAs , COVID-19/genetics , Endothelial Cells , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , SARS-CoV-2
12.
EMBO J ; 37(6)2018 03 15.
Article in English | MEDLINE | ID: mdl-29335281

ABSTRACT

In the post-genomic era, thousands of putative noncoding regulatory regions have been identified, such as enhancers, promoters, long noncoding RNAs (lncRNAs), and a cadre of small peptides. These ever-growing catalogs require high-throughput assays to test their functionality at scale. Massively parallel reporter assays have greatly enhanced the understanding of noncoding DNA elements en masse. Here, we present a massively parallel RNA assay (MPRNA) that can assay 10,000 or more RNA segments for RNA-based functionality. We applied MPRNA to identify RNA-based nuclear localization domains harbored in lncRNAs. We examined a pool of 11,969 oligos densely tiling 38 human lncRNAs that were fused to a cytosolic transcript. After cell fractionation and barcode sequencing, we identified 109 unique RNA regions that significantly enriched this cytosolic transcript in the nucleus including a cytosine-rich motif. These nuclear enrichment sequences are highly conserved and over-represented in global nuclear fractionation sequencing. Importantly, many of these regions were independently validated by single-molecule RNA fluorescence in situ hybridization. Overall, we demonstrate the utility of MPRNA for future investigation of RNA-based functionalities.


Subjects
RNA, Long Noncoding/genetics , Cell Nucleus/genetics , HeLa Cells , High-Throughput Nucleotide Sequencing , Humans , In Situ Hybridization, Fluorescence , Sequence Analysis, RNA
13.
Development ; 146(6)2019 03 28.
Article in English | MEDLINE | ID: mdl-30923056

ABSTRACT

Cell type specification during early nervous system development in Drosophila melanogaster requires precise regulation of gene expression in time and space. Resolving the programs driving neurogenesis has been a major challenge owing to the complexity and rapidity with which distinct cell populations arise. To resolve the cell type-specific gene expression dynamics in early nervous system development, we have sequenced the transcriptomes of purified neurogenic cell types across consecutive time points covering crucial events in neurogenesis. The resulting gene expression atlas comprises a detailed resource of global transcriptome dynamics that permits systematic analysis of how cells in the nervous system acquire distinct fates. We resolve known gene expression dynamics and uncover novel expression signatures for hundreds of genes among diverse neurogenic cell types, most of which remain unstudied. We also identified a set of conserved long noncoding RNAs (lncRNAs) that are regulated in a tissue-specific manner and exhibit spatiotemporal expression during neurogenesis with exquisite specificity. lncRNA expression is highly dynamic and demarcates specific subpopulations within neurogenic cell types. Our spatiotemporal transcriptome atlas provides a comprehensive resource for investigating the function of coding genes and noncoding RNAs during crucial stages of early neurogenesis.


Subjects
Drosophila melanogaster/genetics , Gene Expression Regulation, Developmental , Nervous System/embryology , Neurogenesis/genetics , RNA, Long Noncoding/genetics , Animals , Cell Lineage , Drosophila melanogaster/metabolism , Flow Cytometry , Gene Expression Profiling , Gene Regulatory Networks , In Situ Hybridization, Fluorescence , Neuroglia/physiology , Phylogeny , Transcriptome
14.
Epidemiology ; 33(3): 346-353, 2022 05 01.
Article in English | MEDLINE | ID: mdl-35383642

ABSTRACT

Quantifying the impact of natural disasters or epidemics is critical for guiding policy decisions and interventions. When the effects of an event are long-lasting and difficult to detect in the short term, the accumulated effects can be devastating. Mortality is one of the most reliably measured health outcomes, partly due to its unambiguous definition. As a result, excess mortality estimates are an increasingly effective approach for quantifying the effect of an event. However, the fact that indirect effects are often characterized by small, but enduring, increases in mortality rates presents a statistical challenge. This is compounded by sources of variability introduced by demographic changes, secular trends, seasonal and day-of-the-week effects, and natural variation. Here, we present a model that accounts for these sources of variability and characterizes concerning increases in mortality rates with smooth functions of time that provide statistical power. The model permits discontinuities in the smooth functions to model sudden increases due to direct effects. We implement a flexible estimation approach that permits both surveillance of concerning increases in mortality rates and careful characterization of the effect of a past event. We demonstrate our tools' utility by estimating excess mortality after hurricanes in the United States and Puerto Rico. We use Hurricane Maria as a case study to show appealing properties that are unique to our method compared with current approaches. Finally, we show the flexibility of our approach by detecting and quantifying the 2014 Chikungunya outbreak in Puerto Rico and the COVID-19 pandemic in the United States. We make our tools available through the excessmort R package available from https://cran.r-project.org/web/packages/excessmort/.


Subjects
COVID-19 , Cyclonic Storms , Humans , Pandemics , Puerto Rico/epidemiology , United States/epidemiology
15.
N Engl J Med ; 379(2): 162-170, 2018 Jul 12.
Article in English | MEDLINE | ID: mdl-29809109

ABSTRACT

BACKGROUND: Quantifying the effect of natural disasters on society is critical for recovery of public health services and infrastructure. The death toll can be difficult to assess in the aftermath of a major disaster. In September 2017, Hurricane Maria caused massive infrastructural damage to Puerto Rico, but its effect on mortality remains contentious. The official death count is 64. METHODS: Using a representative, stratified sample, we surveyed 3299 randomly chosen households across Puerto Rico to produce an independent estimate of all-cause mortality after the hurricane. Respondents were asked about displacement, infrastructure loss, and causes of death. We calculated excess deaths by comparing our estimated post-hurricane mortality rate with official rates for the same period in 2016. RESULTS: From the survey data, we estimated a mortality rate of 14.3 deaths (95% confidence interval [CI], 9.8 to 18.9) per 1000 persons from September 20 through December 31, 2017. This rate yielded a total of 4645 excess deaths during this period (95% CI, 793 to 8498), equivalent to a 62% increase in the mortality rate as compared with the same period in 2016. However, this number is likely to be an underestimate because of survivor bias. The mortality rate remained high through the end of December 2017, and one third of the deaths were attributed to delayed or interrupted health care. Hurricane-related migration was substantial. CONCLUSIONS: This household-based survey suggests that the number of excess deaths related to Hurricane Maria in Puerto Rico is more than 70 times the official estimate. (Funded by the Harvard T.H. Chan School of Public Health and others.).
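
The headline figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract. The variable names are illustrative, and the implied 2016 baseline rate is an inference from the stated 62% increase rather than a figure the abstract reports.

```python
official_death_count = 64
excess_deaths = 4645
excess_ci = (793, 8498)      # 95% CI for excess deaths
post_hurricane_rate = 14.3   # deaths per 1000 persons, Sept 20 to Dec 31, 2017
relative_increase = 0.62     # 62% increase over the same period in 2016

# Ratio of the survey-based excess deaths to the official count.
ratio = excess_deaths / official_death_count
ratio_ci = tuple(bound / official_death_count for bound in excess_ci)

# Baseline rate implied by "a 62% increase" (an inference, not a reported value).
implied_baseline_rate = post_hurricane_rate / (1 + relative_increase)

print(f"Excess deaths / official count: {ratio:.1f}")
print(f"Same ratio at the CI bounds: {ratio_ci[0]:.1f} to {ratio_ci[1]:.1f}")
print(f"Implied 2016 rate: {implied_baseline_rate:.1f} per 1000 persons")
```

The point ratio agrees with the abstract's "more than 70 times the official estimate". At the lower confidence bound the multiple is far smaller, which shows how wide the stated interval is.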


Subjects
Cyclonic Storms , Disasters/statistics & numerical data , Health Services Accessibility/statistics & numerical data , Mortality , Adolescent , Adult , Age Distribution , Aged , Aged, 80 and over , Cause of Death , Child , Child, Preschool , Female , Humans , Male , Middle Aged , Mortality, Premature , Puerto Rico/epidemiology , Surveys and Questionnaires , Young Adult
16.
Ann Intern Med ; 173(12): 1004-1007, 2020 12 15.
Article in English | MEDLINE | ID: mdl-32915654

ABSTRACT

As of mid-August 2020, more than 170 000 U.S. residents have died of coronavirus disease 2019 (COVID-19); however, the true number of deaths resulting from COVID-19, both directly and indirectly, is likely to be much higher. The proper attribution of deaths to this pandemic has a range of societal, legal, mortuary, and public health consequences. This article discusses the current difficulties of disaster death attribution and describes the strengths and limitations of relying on death counts from death certificates, estimations of indirect deaths, and estimations of excess mortality. Improving the tabulation of direct and indirect deaths on death certificates will require concerted efforts and consensus across medical institutions and public health agencies. In addition, actionable estimates of excess mortality will require timely access to standardized and structured vital registry data, which should be shared directly at the state level to ensure rapid response for local governments. Correct attribution of direct and indirect deaths and estimation of excess mortality are complementary goals that are critical to our understanding of the pandemic and its effect on human life.


Subjects
COVID-19/mortality , Pandemics , Registries , SARS-CoV-2 , Cause of Death/trends , Humans , Survival Rate/trends
17.
Genome Res ; 27(11): 1930-1938, 2017 11.
Article in English | MEDLINE | ID: mdl-29025895

ABSTRACT

The main application of ChIP-seq technology is the detection of genomic regions that bind to a protein of interest. A large part of functional genomics' public catalogs is based on ChIP-seq data. These catalogs rely on peak calling algorithms that infer protein-binding sites by detecting genomic regions associated with more mapped reads (coverage) than expected by chance, as a result of the experimental protocol's lack of perfect specificity. We find that GC-content bias accounts for substantial variability in the observed coverage for ChIP-seq experiments and that this variability leads to false-positive peak calls. More concerning is that the GC effect varies across experiments, with the effect strong enough to result in a substantial number of peaks called differently when different laboratories perform experiments on the same cell line. However, accounting for GC-content bias in ChIP-seq is challenging because the binding sites of interest tend to be more common in high GC-content regions, which confounds real biological signals with unwanted variability. To account for this challenge, we introduce a statistical approach that accounts for GC effects on both nonspecific noise and signal induced by the binding site. The method can be used to account for this bias in binding quantification as well as to improve existing peak calling algorithms. We use this approach to show a reduction in false-positive peaks as well as improved consistency across laboratories.


Subjects
Base Composition , DNA/metabolism , Sequence Analysis, DNA/methods , Algorithms , Binding Sites , Chromatin Immunoprecipitation , DNA/chemistry , False Positive Reactions , Genomics , High-Throughput Nucleotide Sequencing
18.
Biostatistics ; 20(3): 367-383, 2019 07 01.
Article in English | MEDLINE | ID: mdl-29481604

ABSTRACT

With recent advances in sequencing technology, it is now feasible to measure DNA methylation at tens of millions of sites across the entire genome. In most applications, biologists are interested in detecting differentially methylated regions, composed of multiple sites with differing methylation levels among populations. However, current computational approaches for detecting such regions do not provide accurate statistical inference. A major challenge in reporting uncertainty is that a genome-wide scan is involved in detecting these regions, which needs to be accounted for. A further challenge is that sample sizes are limited due to the costs associated with the technology. We have developed a new approach that overcomes these challenges and assesses uncertainty for differentially methylated regions in a rigorous manner. Region-level statistics are obtained by fitting a generalized least squares regression model with a nested autoregressive correlated error structure for the effect of interest on transformed methylation proportions. We develop an inferential approach, based on a pooled null distribution, that can be implemented even when as few as two samples per population are available. Here, we demonstrate the advantages of our method using both experimental data and Monte Carlo simulation. We find that the new method improves the specificity and sensitivity of lists of regions and accurately controls the false discovery rate.


Subjects
DNA Methylation , Genomics/methods , Models, Statistical , Sequence Analysis, DNA/methods , Animals , Computer Simulation , Genomics/standards , Humans , Sequence Analysis, DNA/standards , Uncertainty
19.
Nat Methods ; 14(4): 417-419, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28263959

ABSTRACT

We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.


Subjects
Algorithms , Sequence Analysis, RNA/methods , Base Composition , Bayes Theorem , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data
20.
Biostatistics ; 19(4): 562-578, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29121214

ABSTRACT

Until recently, high-throughput gene expression technology, such as RNA-Sequencing (RNA-seq), required hundreds of thousands of cells to produce reliable measurements. Recent technical advances permit genome-wide gene expression measurement at the single-cell level. Single-cell RNA-Seq (scRNA-seq) is the most widely used and numerous publications are based on data produced with this technology. However, RNA-seq and scRNA-seq data are markedly different. In particular, unlike RNA-seq, the majority of reported expression levels in scRNA-seq are zeros, which could be either biologically driven (genes not expressing RNA at the time of measurement) or technically driven (genes expressing RNA, but not at a sufficient level to be detected by sequencing technology). Another difference is that the proportion of genes reporting the expression level to be zero varies substantially across single cells compared to RNA-seq samples. However, it remains unclear to what extent this cell-to-cell variation is being driven by technical rather than biological variation. Furthermore, while systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies, these issues have received minimal attention in published studies based on scRNA-seq technology. Here, we use an assessment experiment to examine data from published studies and demonstrate that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we present evidence that some of these reported zeros are driven by technical variation by demonstrating that scRNA-seq produces more zeros than expected and that this bias is greater for lower expressed genes. In addition, this missing data problem is exacerbated by the fact that this technical variation varies cell-to-cell. Then, we show how this technical cell-to-cell variability can be confused with novel biological results. Finally, we demonstrate and discuss how batch effects and confounded experiments can intensify the problem.


Subjects
Gene Expression Profiling/standards , High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, RNA/standards , Single-Cell Analysis/standards , Transcriptome , Animals , Humans