ABSTRACT
Investing in documenting your bioinformatics software well can increase its impact and save you time. To maximize the effectiveness of your documentation, we propose a few guidelines. We recommend providing multiple avenues for users to use your research software, including a navigable HTML interface with a quick start, useful help messages with detailed explanations, and thorough examples for each feature of your software. By following these guidelines, you can ensure that your hard work maximally benefits yourself and others.
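As a minimal sketch of what "useful help messages with detailed explanation and thorough examples" could look like in practice, the following uses Python's standard `argparse` module. The tool name `mytool`, its arguments, and the file name `regions.bed` are hypothetical and chosen only for illustration; they do not come from the abstract.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser whose help text explains each option."""
    parser = argparse.ArgumentParser(
        prog="mytool",  # hypothetical tool name, for illustration only
        description="Filter genomic intervals by length.",
        # The epilog gives users a copy-and-paste starting point.
        epilog="Example: mytool regions.bed --min-length 50",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "intervals",
        help="input file of intervals in BED format",
    )
    parser.add_argument(
        "--min-length",
        type=int,
        default=100,
        metavar="BP",
        help="discard intervals shorter than BP base pairs (default: %(default)s)",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["regions.bed", "--min-length", "50"])
help_text = build_parser().format_help()
print(help_text)
```

Stating the default value and a complete example invocation in the help output means a user can get started without leaving the terminal.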
Subjects
Computational Biology/methods, Documentation/standards, Guidelines as Topic, Software/standards, Humans
ABSTRACT
Short-read sequencing enables assessment of genetic and biochemical traits of individual genomic regions, such as the location of genetic variation, protein binding and chemical modifications. Every region in a genome assembly has a property called 'mappability', which measures the extent to which it can be uniquely mapped by sequence reads. In regions of lower mappability, estimates of genomic and epigenomic characteristics from sequencing assays are less reliable. These regions have increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Bisulfite sequencing approaches used to identify DNA methylation exacerbate these problems by introducing large numbers of reads that map to multiple regions. Both to correct assumptions of uniformity in downstream analysis and to identify regions where the analysis is less reliable, it is necessary to know the mappability of both ordinary and bisulfite-converted genomes. We introduce the Umap software for identifying uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite-converted genome. A Umap and Bismap track hub for human genome assemblies GRCh37/hg19 and GRCh38/hg38, and mouse assemblies GRCm37/mm9 and GRCm38/mm10 is available at https://bismap.hoffmanlab.org for use with genome browsers.
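The idea of mappability can be illustrated with a toy sketch. This is not Umap's or Bismap's implementation: it assumes a single short forward-strand sequence, ignores reverse complements and sequencing errors, and models bisulfite conversion simply as reading every C as T. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def unique_kmer_starts(genome: str, k: int) -> list[bool]:
    """For each k-mer start position, report whether that k-mer occurs exactly once."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return [counts[kmer] == 1 for kmer in kmers]


def bisulfite_convert(genome: str) -> str:
    """Toy forward-strand conversion: every C is read as T."""
    return genome.replace("C", "T")


genome = "ACGTACGTTTGCA"  # made-up sequence for illustration
k = 4

plain = unique_kmer_starts(genome, k)
converted = unique_kmer_starts(bisulfite_convert(genome), k)

print(f"unique {k}-mers, ordinary genome:  {sum(plain)} of {len(plain)}")
print(f"unique {k}-mers, converted genome: {sum(converted)} of {len(converted)}")
```

In this toy case the converted sequence has fewer unique k-mers than the original, which mirrors the abstract's point that reducing sequence complexity increases the number of reads mapping to multiple regions.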
Subjects
Chromosome Mapping/methods, Computational Biology/methods, DNA Methylation, Genome, Human/genetics, CpG Islands/genetics, Epigenomics/methods, Genomics/methods, Humans, Reproducibility of Results, Sequence Analysis, DNA/methods
ABSTRACT
Large-scale sequencing efforts have been undertaken to understand the mutational landscape of the coding genome. However, the vast majority of variants occur within non-coding genomic regions. We designed an integrative computational and experimental framework to identify recurrently mutated non-coding regulatory regions that drive tumor progression. Applying this framework to sequencing data from a large prostate cancer patient cohort revealed a large set of candidate drivers. We used (1) in silico analyses, (2) massively parallel reporter assays, and (3) in vivo CRISPR interference screens to systematically validate metastatic castration-resistant prostate cancer (mCRPC) drivers. One identified enhancer region, GH22I030351, acts on a bidirectional promoter to simultaneously modulate expression of the U2-associated splicing factor SF3A1 and chromosomal protein CCDC157. SF3A1 and CCDC157 promote tumor growth in vivo. We nominated a number of transcription factors, notably SOX6, to regulate expression of SF3A1 and CCDC157. Our integrative approach enables the systematic detection of non-coding regulatory regions that drive human cancers.
Subjects
RNA Splicing Factors, Male, Humans, RNA Splicing Factors/metabolism, RNA Splicing Factors/genetics, Gene Expression Regulation, Neoplastic, Neoplasm Metastasis, Cell Line, Tumor, Prostatic Neoplasms/genetics, Prostatic Neoplasms/pathology, Prostatic Neoplasms/metabolism, Animals, Prostatic Neoplasms, Castration-Resistant/genetics, Prostatic Neoplasms, Castration-Resistant/pathology, Prostatic Neoplasms, Castration-Resistant/metabolism, Regulatory Sequences, Nucleic Acid/genetics, Mice, Enhancer Elements, Genetic/genetics, Mutation/genetics
ABSTRACT
BACKGROUND: Human papillomavirus (HPV) drives almost all cervical cancers and up to 70% of head and neck cancers. Frequent integration into the host genome occurs predominantly in tumorigenic types of HPV. We hypothesize that changes in chromatin state at the location of integration can result in changes in gene expression that contribute to the tumorigenicity of HPV. RESULTS: We find that viral integration events often occur along with changes in chromatin state and expression of genes near the integration site. We investigate whether introduction of new transcription factor binding sites due to HPV integration could invoke these changes. Some regions within the HPV genome, particularly the position of a conserved CTCF binding site, show enriched chromatin accessibility signal. ChIP-seq reveals that the conserved CTCF binding site within the HPV genome binds CTCF in 4 HPV+ cancer cell lines. Significant changes in CTCF binding pattern and increases in chromatin accessibility occur exclusively within 100 kbp of HPV integration sites. The chromatin changes co-occur with out-sized changes in transcription and alternative splicing of local genes. Analysis of The Cancer Genome Atlas (TCGA) HPV+ tumors indicates that HPV integration upregulates genes which have significantly higher essentiality scores compared to randomly selected upregulated genes from the same tumors. CONCLUSIONS: Our results suggest that introduction of a new CTCF binding site due to HPV integration reorganizes chromatin state and upregulates genes essential for tumor viability in some HPV+ tumors. These findings emphasize a newly recognized role of HPV integration in oncogenesis.
Subjects
Head and Neck Neoplasms, Papillomavirus Infections, Humans, Chromatin, Human Papillomavirus Viruses, Carcinogenesis
ABSTRACT
Large-scale sequencing efforts of thousands of tumor samples have been undertaken to understand the mutational landscape of the coding genome. However, the vast majority of germline and somatic variants occur within non-coding portions of the genome. These genomic regions do not directly encode specific proteins, but can play key roles in cancer progression, for example by driving aberrant gene expression control. Here, we designed an integrative computational and experimental framework to identify recurrently mutated non-coding regulatory regions that drive tumor progression. Application of this approach to whole-genome sequencing (WGS) data from a large cohort of metastatic castration-resistant prostate cancer (mCRPC) revealed a large set of recurrently mutated regions. We used (i) in silico prioritization of functional non-coding mutations, (ii) massively parallel reporter assays, and (iii) in vivo CRISPR-interference (CRISPRi) screens in xenografted mice to systematically identify and validate regulatory regions that drive mCRPC. We discovered that one of these enhancer regions, GH22I030351, acts on a bidirectional promoter to simultaneously modulate expression of U2-associated splicing factor SF3A1 and chromosomal protein CCDC157. We found that both SF3A1 and CCDC157 promote tumor growth in xenograft models of prostate cancer. We nominated a number of transcription factors, including SOX6, as responsible for higher expression of SF3A1 and CCDC157. Collectively, we have established and confirmed an integrative computational and experimental approach that enables the systematic detection of non-coding regulatory regions that drive the progression of human cancers.
ABSTRACT
Estrogen and progesterone have been extensively studied in the mammary gland, but the molecular effects of androgen remain largely unexplored. Transgender men are recorded as female at birth but identify as male and may undergo gender-affirming androgen therapy to align their physical characteristics and gender identity. Here we perform single-cell-resolution transcriptome, chromatin, and spatial profiling of breast tissues from transgender men following androgen therapy. We find canonical androgen receptor gene targets are upregulated in cells expressing the androgen receptor and that paracrine signaling likely drives sex-relevant androgenic effects in other cell types. We also observe involution of the epithelium and a spatial reconfiguration of immune, fibroblast, and vascular cells, and identify a gene regulatory network associated with androgen-induced fat loss. This work elucidates the molecular consequences of androgen activity in the human breast at single-cell resolution.
ABSTRACT
Existing methods for computational prediction of transcription factor (TF) binding sites evaluate genomic regions with similarity to known TF sequence preferences. Most TF binding sites, however, do not resemble known TF sequence motifs, and many TFs are not sequence-specific. We developed Virtual ChIP-seq, which predicts binding of individual TFs in new cell types, integrating learned associations with gene expression and binding, TF binding sites from other cell types, and chromatin accessibility data in the new cell type. This approach outperforms methods that predict TF binding solely based on sequence preference, predicting binding for 36 TFs (MCC>0.3).
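The performance threshold quoted above is stated in terms of the Matthews correlation coefficient (MCC). As a reminder of how that metric is computed from a binary confusion matrix, here is a minimal sketch; the counts in the example are invented for illustration and are not results from Virtual ChIP-seq.

```python
from math import sqrt


def matthews_corrcoef(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented counts: 6 true positives, 2 false positives,
# 2 false negatives, 10 true negatives.
score = matthews_corrcoef(tp=6, fp=2, fn=2, tn=10)
print(f"MCC = {score:.3f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when bound and unbound sites are heavily imbalanced.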
Subjects
Chromatin Immunoprecipitation Sequencing, Transcriptome, Binding Sites, Chromatin Immunoprecipitation, Protein Binding, Transcription Factors/metabolism
ABSTRACT
Transplantation of Neural Stem/Progenitor Cells (NPCs) is a promising regenerative strategy to promote neural repair following injury and degeneration because of the ability of these cells to proliferate, migrate, and integrate with the host tissue. Precise in vitro control of NPC proliferation without compromising multipotency and differentiation ability is critical in stem cell maintenance. This idea was highlighted in recent clinical trials, where discrepancies in NPC culturing protocols produced inconsistent therapeutic benefits. Of note, cell density plays an important role in regulating the survival, proliferation, differentiation, and fate choice of stem cells. To determine the extent of variability produced by inconsistent culturing densities, the present study cultured human induced pluripotent stem cell-derived NPCs (hiPSC-NPCs) at either a low or high plating density. hiPSC-NPCs were then isolated for transcriptomic analysis or differentiation in vitro. Sequencing analysis showed that genes involved in cell-cell contact-mediated pathways, including Hippo signaling, NOTCH, and WNT, were differentially expressed. Modulation of these pathways was highly associated with the regulation of pro-neuronal transcription factors, which were also upregulated in response to higher-density hiPSC-NPC culture. Moreover, higher plating density translated into greater neuronal and less astrocytic differentiation in vitro. This study highlights the importance of precisely controlling culture conditions during the development of NPC transplantation therapies.
Subjects
Induced Pluripotent Stem Cells, Neural Stem Cells, Cell Differentiation/genetics, Gene Expression, Humans, Neurogenesis/physiology
ABSTRACT
Knowledge of the transcriptional programs underpinning the functions of human kidney cell populations at homeostasis is limited. We present a single-cell perspective of healthy human kidney from 19 living donors, with equal contribution from males and females, profiling the transcriptome of 27,677 cells to map human kidney at high resolution. Sex-based differences in gene expression within proximal tubular cells were observed, specifically increased expression of anti-oxidant metallothionein genes in females and of aerobic metabolism-related genes in males. Functional differences in metabolism were confirmed in proximal tubular cells, with male cells exhibiting higher oxidative phosphorylation and higher levels of energy precursor metabolites. We identified kidney-specific lymphocyte populations with unique transcriptional profiles indicative of kidney-adapted functions. Significant heterogeneity in myeloid cells was observed, with an MRC1+LYVE1+FOLR2+C1QC+ population representing a predominant population in healthy kidney. This study provides a detailed cellular map of healthy human kidney, and explores the complexity of parenchymal and kidney-resident immune cells.
Subjects
Folate Receptor 2, Kidney, Female, Humans, Male, Kidney/metabolism, Transcriptome, Metallothionein/genetics, Metallothionein/metabolism, Myeloid Cells/metabolism, Gene Expression Profiling, Single-Cell Analysis, Folate Receptor 2/metabolism
ABSTRACT
The COVID-19 pandemic has highlighted the urgent need for the identification of new antiviral drug therapies for a variety of diseases. COVID-19 is caused by infection with the human coronavirus SARS-CoV-2, while other related human coronaviruses cause diseases ranging from severe respiratory infections to the common cold. We developed a computational approach to identify new antiviral drug targets and repurpose clinically-relevant drug compounds for the treatment of a range of human coronavirus diseases. Our approach is based on graph convolutional networks (GCN) and involves multiscale host-virus interactome analysis coupled to off-target drug predictions. Cell-based experimental assessment reveals several clinically-relevant drug repurposing candidates predicted by the in silico analyses to have antiviral activity against human coronavirus infection. In particular, we identify the MET inhibitor capmatinib as having potent and broad antiviral activity against several coronaviruses in a MET-independent manner, as well as novel roles for host cell proteins such as IRAK1/4 in supporting human coronavirus infection, which can inform further drug discovery studies.
Subjects
Antiviral Agents/pharmacology, Coronavirus/drug effects, Coronavirus/metabolism, Drug Development/methods, Drug Repositioning/methods, Benzamides/pharmacology, Cell Line, Computer Simulation, Coronavirus/chemistry, Databases, Pharmaceutical, Drug Discovery/methods, Host-Pathogen Interactions, Humans, Imidazoles/pharmacology, Interleukin-1 Receptor-Associated Kinases/metabolism, SARS-CoV-2/chemistry, SARS-CoV-2/drug effects, SARS-CoV-2/metabolism, SARS-CoV-2/physiology, Triazines/pharmacology, COVID-19 Drug Treatment
ABSTRACT
Type I interferons (IFNs) are our first line of defense against virus infection. Recent studies have suggested the ability of SARS-CoV-2 proteins to inhibit IFN responses. Emerging data also suggest that the timing and extent of IFN production are associated with manifestation of COVID-19 severity. In spite of progress in understanding how SARS-CoV-2 activates antiviral responses, mechanistic studies into wild-type SARS-CoV-2-mediated induction and inhibition of human type I IFN responses are scarce. Here we demonstrate that SARS-CoV-2 infection induces a type I IFN response in vitro and in moderate cases of COVID-19. In vitro stimulation of type I IFN expression and signaling in human airway epithelial cells is associated with activation of canonical transcription factors, and SARS-CoV-2 is unable to inhibit exogenous induction of these responses. Furthermore, we show that physiological levels of IFNα detected in patients with moderate COVID-19 are sufficient to suppress SARS-CoV-2 replication in human airway cells.
ABSTRACT
Our understanding of the beads-on-a-string arrangement of nucleosomes has been built largely on high-resolution sequence-agnostic imaging methods and sequence-resolved bulk biochemical techniques. To bridge the divide between these approaches, we present the single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA). SAMOSA is a high-throughput single-molecule sequencing method that combines adenine methyltransferase footprinting and single-molecule real-time DNA sequencing to natively and nondestructively measure nucleosome positions on individual chromatin fibres. SAMOSA data allow unbiased classification of single-molecular 'states' of nucleosome occupancy on individual chromatin fibres. We leverage this to estimate nucleosome regularity and spacing on single chromatin fibres genome-wide, at predicted transcription factor binding motifs, and across human epigenomic domains. Our analyses suggest that chromatin is composed of both regular and irregular single-molecular oligonucleosome patterns that differ subtly in their relative abundance across epigenomic domains. This irregularity is particularly striking in constitutive heterochromatin, which has typically been viewed as a conformationally static entity. Our proof-of-concept study provides a powerful new methodology for studying nucleosome organization at a previously intractable resolution and offers new avenues for modeling and visualizing higher-order chromatin structure.
Subjects
Chromatin/genetics, DNA/genetics, High-Throughput Nucleotide Sequencing, Nucleosomes/genetics, Single Molecule Imaging, Acetylation, Binding Sites, Chromatin/chemistry, Chromatin/metabolism, DNA/chemistry, DNA/metabolism, Epigenesis, Genetic, Histones/chemistry, Histones/genetics, Histones/metabolism, Humans, K562 Cells, Nucleic Acid Conformation, Nucleosomes/chemistry, Nucleosomes/metabolism, Proof of Concept Study, Protein Conformation, Protein Processing, Post-Translational, Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism
ABSTRACT
Despite efforts toward extensive molecular characterization of cancer patients, such as the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), the heterogeneous nature of cancer and our limited knowledge of the contextual function of proteins have complicated the identification of targetable genes. Here, we present Aberration Hub Analysis for Cancer (AbHAC) as a novel integrative approach to pinpoint aberration hubs, i.e. individual proteins that interact extensively with genes that show aberrant mutation or expression. Our analysis of the breast cancer data of the TCGA and the renal cancer data from the ICGC shows that aberration hubs are involved in relevant cancer pathways, including factors promoting cell cycle and DNA replication in basal-like breast tumors, and Src kinase and VEGF signaling in renal carcinoma. Moreover, our analysis uncovers novel functionally relevant and actionable targets, among which we have experimentally validated abnormal splicing of spleen tyrosine kinase as a key factor for cell proliferation in renal cancer. Thus, AbHAC provides an effective strategy to uncover novel disease factors that are only identifiable by examining mutational and expression data in the context of biological networks.
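One way to make the notion of an aberration hub concrete is to ask, for each protein, whether its interaction partners include more aberrant genes than expected by chance. The sketch below does this with a one-sided hypergeometric test on a toy network. The choice of test, the function names, and the network itself are assumptions made for illustration; the abstract does not specify how AbHAC scores hubs.

```python
from math import comb


def enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n neighbours are drawn from N proteins, K of them aberrant."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def aberration_hub_scores(
    network: dict[str, set[str]], aberrant: set[str]
) -> dict[str, float]:
    """Score each protein by enrichment of aberrant genes among its neighbours.

    Assumes every neighbour also appears as a key of `network`.
    """
    proteins = set(network)
    scores = {}
    for protein, neighbours in network.items():
        others = proteins - {protein}  # candidates this protein could interact with
        k = len(aberrant & neighbours)  # aberrant neighbours observed
        K = len(aberrant & others)      # aberrant candidates available
        scores[protein] = enrichment_p(k, len(neighbours), K, len(others))
    return scores


# Toy network (hypothetical names): HUB interacts with three aberrant genes.
network = {
    "HUB": {"G1", "G2", "G3"},
    "G1": {"HUB"},
    "G2": {"HUB"},
    "G3": {"HUB"},
    "X": {"Y"},
    "Y": {"X"},
}
aberrant = {"G1", "G2", "G3"}

scores = aberration_hub_scores(network, aberrant)
for protein in sorted(scores, key=scores.get):
    print(f"{protein}: p = {scores[protein]:.2f}")
```

In this toy example `HUB` receives the smallest p-value even though it is not itself aberrant, which is the property the abstract highlights: such proteins are only identifiable when mutation and expression data are read in the context of the network.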
ABSTRACT
Widespread remodeling of the transcriptome is a signature of cancer; however, little is known about the post-transcriptional regulatory factors, including RNA-binding proteins (RBPs) that regulate mRNA stability, and the extent to which RBPs contribute to cancer-associated pathways. Here, by modeling the global change in gene expression based on the effect of sequence-specific RBPs on mRNA stability, we show that RBP-mediated stability programs are recurrently deregulated in cancerous tissues. In particular, we uncovered several RBPs that contribute to the abnormal transcriptome of renal cell carcinoma (RCC), including PCBP2, ESRP2, and MBNL2. Modulation of these proteins in cancer cell lines alters the expression of pathways that are central to the disease and highlights RBPs as master regulators of the RCC transcriptome. This study presents a framework for the screening of RBP activities based on computational modeling of mRNA stability programs in cancer and highlights the role of post-transcriptional gene dysregulation in RCC.
Subjects
Neoplasms/genetics, RNA Stability/genetics, RNA-Binding Proteins/metabolism, Transcriptome/genetics, Carcinoma, Renal Cell/genetics, Cell Cycle/genetics, Cell Proliferation/genetics, Gene Expression Regulation, Neoplastic, Humans, Kidney Neoplasms/genetics, Neoplasm Proteins/metabolism, Protein Biosynthesis, Transcription, Genetic, Up-Regulation/genetics
ABSTRACT
The incidence of renal cell carcinoma (RCC) is increasing worldwide, and its prevalence is particularly high in some parts of Central Europe. Here we undertake whole-genome and transcriptome sequencing of clear cell RCC (ccRCC), the most common form of the disease, in patients from four different European countries with contrasting disease incidence to explore the underlying genomic architecture of RCC. Our findings support previous reports on frequent aberrations in the epigenetic machinery and PI3K/mTOR signalling, and uncover novel pathways and genes affected by recurrent mutations and abnormal transcriptome patterns including focal adhesion, components of extracellular matrix (ECM) and genes encoding FAT cadherins. Furthermore, a large majority of patients from Romania have an unexpectedly high frequency of A:T>T:A transversions, consistent with exposure to aristolochic acid (AA). These results show that the processes underlying ccRCC tumorigenesis may vary in different populations and suggest that AA may be an important ccRCC carcinogen in Romania, a finding with major public health implications.