1.
J Biosci ; 47, 2022.
Article in English | MEDLINE | ID: mdl-36222139

ABSTRACT

An organism's genome contains many sequence regions that perform diverse functions. Examples of such regions include genes, promoters, enhancers, and binding sites for regulatory proteins and RNAs. One of biology's most important open problems is how to take a genome sequence and predict which regions within it perform different functions. In recent years, deep learning has enabled dramatic advances across many fields by modeling complex relationships between entities. Several deep learning models have also proven successful in predicting the biological function of a portion of DNA from its sequence, revealing new insights into the complex rules underlying genome regulation and opening new possibilities in disease modeling and synthetic biology.


Subjects
Deep Learning , DNA/genetics , Enhancer Elements, Genetic , Genomics , Promoter Regions, Genetic , Regulatory Sequences, Nucleic Acid/genetics
2.
Viruses ; 14(7), 2022 06 29.
Article in English | MEDLINE | ID: mdl-35891416

ABSTRACT

Viruses have evolved numerous mechanisms to exploit the molecular machinery of their host cells, including the broad spectrum of host RNA-binding proteins (RBPs). However, the RBP interactomes of most viruses are largely unknown. To shed light on the interaction landscape of RNA viruses with human host cell RBPs, we have analysed 197 single-stranded RNA (ssRNA) viral genome sequences and found that the majority of ssRNA virus genomes are significantly enriched or depleted in motifs for specific human RBPs, suggesting selection pressure on these interactions. To facilitate tailored investigations and the analysis of genomes sequenced in future, we have released our methodology as a fast and user-friendly computational toolbox named SMEAGOL. Our resources will contribute to future studies of specific ssRNA virus-host cell interactions and support the identification of antiviral drug targets.


Subjects
RNA Viruses , Viruses , Base Sequence , Genome, Viral , Humans , RNA , RNA Viruses/metabolism , RNA, Viral/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Viruses/genetics
3.
STAR Protoc ; 3(3): 101513, 2022 09 16.
Article in English | MEDLINE | ID: mdl-35779264

ABSTRACT

We outline the features of the R package SparseSignatures and its application to determine the signatures contributing to mutation profiles of tumor samples. We describe installation details and illustrate a step-by-step approach to (1) prepare the data for signature analysis, (2) determine the optimal parameters, and (3) employ them to determine the signatures and related exposure levels in the point mutation dataset. For complete details on the use and execution of this protocol, please refer to Lal et al. (2021).


Subjects
Neoplasms , Algorithms , Humans , Mutation , Neoplasms/diagnosis
4.
PLoS Comput Biol ; 17(6): e1009119, 2021 06.
Article in English | MEDLINE | ID: mdl-34181655

ABSTRACT

Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or "mutational signatures". Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates a user-specified background signature, employs regularization to reduce noise in non-background signatures, uses cross-validation to identify the number of signatures, and is scalable to large datasets. We show that SparseSignatures outperforms current state-of-the-art methods on simulated data using a variety of standard metrics. We then apply SparseSignatures to whole genome sequences of pancreatic and breast tumors, discovering well-differentiated signatures that are linked to known mutagenic mechanisms and are strongly associated with patient clinical features.


Subjects
DNA Mutational Analysis/statistics & numerical data , Neoplasms/genetics , Point Mutation , Algorithms , Biomarkers, Tumor/genetics , Breast Neoplasms/classification , Breast Neoplasms/genetics , Computational Biology , Computer Simulation , Databases, Genetic/statistics & numerical data , Female , Genes, BRCA1 , Genes, BRCA2 , Genome, Human , Humans , Pancreatic Neoplasms/classification , Pancreatic Neoplasms/genetics , Software
5.
Commun Biol ; 4(1): 590, 2021 05 17.
Article in English | MEDLINE | ID: mdl-34002013

ABSTRACT

The novel betacoronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a worldwide pandemic (COVID-19) after emerging in Wuhan, China. Here we analyzed public host and viral RNA sequencing data to better understand how SARS-CoV-2 interacts with human respiratory cells. We identified genes, isoforms and transposable element families that are specifically altered in SARS-CoV-2-infected respiratory cells. Well-known immunoregulatory genes including CSF2, IL32, IL-6 and SERPINA3 were differentially expressed, while immunoregulatory transposable element families were upregulated. We predicted conserved interactions between the SARS-CoV-2 genome and human RNA-binding proteins such as the heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) and eukaryotic initiation factor 4 (eIF4b). We also identified a viral sequence variant with a statistically significant skew associated with age of infection, that may contribute to intracellular host-pathogen interactions. These findings can help identify host mechanisms that can be targeted by prophylactics and/or therapeutics to reduce the severity of COVID-19.


Subjects
COVID-19/genetics , Computational Biology/methods , Host-Pathogen Interactions/genetics , Pandemics , SARS-CoV-2/genetics , Binding Sites , COVID-19/virology , Cytokines/genetics , Databases, Genetic , Gene Expression Regulation , Genome, Viral , Humans , RNA, Viral/genetics , RNA, Viral/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA-Seq , Serpins/genetics , Signal Transduction/genetics , Transcriptome , Virus Replication/genetics
6.
Nat Commun ; 12(1): 1507, 2021 03 08.
Article in English | MEDLINE | ID: mdl-33686069

ABSTRACT

ATAC-seq is a widely applied assay used to measure genome-wide chromatin accessibility; however, its ability to detect active regulatory regions can depend on the depth of sequencing coverage and the signal-to-noise ratio. Here we introduce AtacWorks, a deep learning toolkit to denoise sequencing coverage and identify regulatory peaks at base-pair resolution from low cell count, low-coverage, or low-quality ATAC-seq data. Models trained by AtacWorks can detect peaks from cell types not seen in the training data, and are generalizable across diverse sample preparations and experimental platforms. We demonstrate that AtacWorks enhances the sensitivity of single-cell experiments by producing results on par with those of conventional methods using ~10 times as many cells, and further show that this framework can be adapted to enable cross-modality inference of protein-DNA interactions. Finally, we establish that AtacWorks can enable new biological discoveries by identifying active regulatory regions associated with lineage priming in rare subpopulations of hematopoietic stem cells.


Subjects
Chromatin Immunoprecipitation Sequencing/methods , Deep Learning , Epigenomics/methods , Animals , Brain , Chromatin , Humans , Leukocytes , Mice , Regulatory Sequences, Nucleic Acid
7.
BMC Med Genomics ; 12(1): 84, 2019 06 10.
Article in English | MEDLINE | ID: mdl-31182087

ABSTRACT

BACKGROUND: Germline mutations in the BRCA1 and BRCA2 genes predispose carriers to breast and ovarian cancer, and there remains a need to identify the specific genomic mechanisms by which cancer evolves in these patients. Here we present a systematic genomic analysis of breast tumors with BRCA1 and BRCA2 mutations. METHODS: We analyzed genomic data from breast tumors, with a focus on comparing tumors with BRCA1/BRCA2 gene mutations with common classes of sporadic breast tumors. RESULTS: We identify differences between BRCA-mutated and sporadic breast tumors in patterns of point mutation, DNA methylation and structural variation. We show that structural variation disproportionately affects tumor suppressor genes and identify specific driver gene candidates that are enriched for structural variation. CONCLUSIONS: Compared to sporadic tumors, BRCA-mutated breast tumors show signals of reduced DNA methylation, more ancestral cell divisions, and elevated rates of structural variation that tend to disrupt highly expressed protein-coding genes and known tumor suppressors. Our analysis suggests that BRCA-mutated tumors are more aggressive than sporadic breast cancers because loss of the BRCA pathway causes multiple processes of mutagenesis and gene dysregulation.


Subjects
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/genetics , Genomics , Mutation , CpG Islands/genetics , DNA Methylation , Humans
8.
Nat Commun ; 9(1): 4453, 2018 10 26.
Article in English | MEDLINE | ID: mdl-30367051

ABSTRACT

Outcomes for cancer patients vary greatly even within the same tumor type, and characterization of molecular subtypes of cancer holds important promise for improving prognosis and personalized treatment. This promise has motivated recent efforts to produce large amounts of multidimensional genomic (multi-omic) data, but current algorithms still face challenges in the integrated analysis of such data. Here we present Cancer Integration via Multikernel Learning (CIMLR), a new cancer subtyping method that integrates multi-omic data to reveal molecular subtypes of cancer. We apply CIMLR to multi-omic data from 36 cancer types and show significant improvements in both computational efficiency and ability to extract biologically meaningful cancer subtypes. The discovered subtypes exhibit significant differences in patient survival for 27 of 36 cancer types. Our analysis reveals integrated patterns of gene expression, methylation, point mutations, and copy number changes in multiple cancers and highlights patterns specifically associated with poor patient outcomes.


Subjects
Computational Biology , Genomics , Neoplasms/genetics , Neoplasms/mortality , Algorithms , Cluster Analysis , DNA Copy Number Variations , DNA Methylation , Gene Expression Profiling , Humans , Neoplasms/classification , Neoplasms/therapy , Point Mutation , Prognosis , Survival Analysis
9.
PLoS Biol ; 16(9): e2005895, 2018 09.
Article in English | MEDLINE | ID: mdl-30212465

ABSTRACT

Malaria parasites (Plasmodium spp.) and related apicomplexan pathogens contain a nonphotosynthetic plastid called the apicoplast. Derived from an unusual secondary eukaryote-eukaryote endosymbiosis, the apicoplast is a fascinating organelle whose function and biogenesis rely on a complex amalgamation of bacterial and algal pathways. Because these pathways are distinct from the human host, the apicoplast is an excellent source of novel antimalarial targets. Despite its biomedical importance and evolutionary significance, the absence of a reliable apicoplast proteome has limited most studies to the handful of pathways identified by homology to bacteria or primary chloroplasts, precluding our ability to study the most novel apicoplast pathways. Here, we combine proximity biotinylation-based proteomics (BioID) and a new machine learning algorithm to generate a high-confidence apicoplast proteome consisting of 346 proteins. Critically, the high accuracy of this proteome significantly outperforms previous prediction-based methods and extends beyond other BioID studies of unique parasite compartments. Half of identified proteins have unknown function, and 77% are predicted to be important for normal blood-stage growth. We validate the apicoplast localization of a subset of novel proteins and show that an ATP-binding cassette protein ABCF1 is essential for blood-stage survival and plays a previously unknown role in apicoplast biogenesis. These findings indicate critical organellar functions for newly discovered apicoplast proteins. The apicoplast proteome will be an important resource for elucidating unique pathways derived from secondary endosymbiosis and prioritizing antimalarial drug targets.


Subjects
Apicoplasts/metabolism , Computational Biology/methods , Malaria/metabolism , Malaria/parasitology , Parasites/metabolism , Proteome/metabolism , Proteomics/methods , Protozoan Proteins/metabolism , Algorithms , Animals , Databases, Protein , Endoplasmic Reticulum/metabolism , Plasmodium falciparum/metabolism
10.
G3 (Bethesda) ; 8(6): 2079-2089, 2018 05 31.
Article in English | MEDLINE | ID: mdl-29686109

ABSTRACT

In Escherichia coli, the sigma factor σ70 directs RNA polymerase to transcribe growth-related genes, while σ38 directs transcription of stress response genes during stationary phase. Two molecules hypothesized to regulate RNA polymerase are the protein Rsd, which binds to σ70, and the non-coding 6S RNA which binds to the RNA polymerase-σ70 holoenzyme. Despite multiple studies, the functions of Rsd and 6S RNA remain controversial. Here we use RNA-Seq in five phases of growth to elucidate their function on a genome-wide scale. We show that Rsd and 6S RNA facilitate σ38 activity throughout bacterial growth, while 6S RNA also regulates widely different genes depending upon growth phase. We discover novel interactions between 6S RNA and Rsd and show widespread expression changes in a strain lacking both regulators. Finally, we present a mathematical model of transcription which highlights the crosstalk between Rsd and 6S RNA as a crucial factor in controlling sigma factor competition and global gene expression.


Subjects
Escherichia coli Proteins/metabolism , Escherichia coli/genetics , Gene Expression Regulation, Bacterial , RNA, Bacterial/genetics , RNA, Untranslated/genetics , Repressor Proteins/metabolism , Transcription, Genetic , Bacterial Proteins/metabolism , Computer Simulation , Escherichia coli/growth & development , Gene Knockout Techniques , Genes, Bacterial , Models, Genetic , Regulon/genetics , Sequence Analysis, RNA , Sigma Factor/metabolism
11.
Nat Commun ; 7: 11055, 2016 Mar 30.
Article in English | MEDLINE | ID: mdl-27025941

ABSTRACT

DNA in bacterial cells primarily exists in a negatively supercoiled state. The extent of supercoiling differs between regions of the chromosome, changes in response to external conditions and regulates gene expression. Here we report the use of trimethylpsoralen intercalation to map the extent of supercoiling across the Escherichia coli chromosome during exponential and stationary growth phases. We find that stationary phase E. coli cells display a gradient of negative supercoiling, with the terminus being more negatively supercoiled than the origin of replication, and that such a gradient is absent in exponentially growing cells. This stationary phase pattern is correlated with the binding of the nucleoid-associated protein HU, and we show that it is lost in an HU deletion strain. We suggest that HU establishes higher supercoiling near the terminus of the chromosome during stationary phase, whereas during exponential growth DNA gyrase and/or transcription equalizes supercoiling across the chromosome.


Subjects
Chromosomes, Bacterial/genetics , DNA, Superhelical/genetics , Genome, Bacterial , Bacterial Proteins/metabolism , DNA Gyrase/metabolism , Escherichia coli/drug effects , Escherichia coli/genetics , Ficusin/pharmacology , Hydroxyurea/pharmacology , Protein Binding/drug effects , Transcription, Genetic/drug effects