Results 1 - 20 of 79
1.
Development ; 149(5), 2022 03 01.
Article in English | MEDLINE | ID: mdl-35245348

ABSTRACT

The hypothalamus displays staggering cellular diversity, chiefly established during embryogenesis by the interplay of several signalling pathways and a battery of transcription factors. However, the contribution of epigenetic cues to hypothalamus development remains unclear. We mutated the polycomb repressor complex 2 gene Eed in the developing mouse hypothalamus, which resulted in the loss of H3K27me3, a fundamental epigenetic repressor mark. This triggered ectopic expression of posteriorly expressed regulators (e.g. Hox homeotic genes), upregulation of cell cycle inhibitors and reduced proliferation. Surprisingly, despite these effects, single cell transcriptomic analysis revealed that most neuronal subtypes were still generated in Eed mutants. However, we observed an increase in glutamatergic/GABAergic double-positive cells, as well as loss/reduction of dopamine, hypocretin and Tac2-Pax6 neurons. These findings indicate that many aspects of the hypothalamic gene regulatory flow can proceed without the key H3K27me3 epigenetic repressor mark, but point to a unique sensitivity of particular neuronal subtypes to a disrupted epigenomic landscape.


Subjects
Embryonic Development/physiology , Hypothalamus/physiology , Neurons/physiology , Polycomb Repressive Complex 2/genetics , Polycomb-Group Proteins/genetics , Animals , Cell Proliferation/genetics , Epigenetic Repression/genetics , Female , Male , Mice , Mutation/genetics , Transcriptome/genetics
2.
Bioinformatics ; 40(Supplement_1): i277-i286, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940131

ABSTRACT

MOTIVATION: Insertions and deletions (indels) influence the genetic code in fundamentally distinct ways from substitutions, significantly impacting gene product structure and function. Despite their influence, the evolutionary history of indels is often neglected in phylogenetic tree inference and ancestral sequence reconstruction, hindering efforts to comprehend biological diversity determinants and engineer variants for medical and industrial applications. RESULTS: We frame determining the optimal history of indel events as a single Mixed-Integer Programming (MIP) problem, across all branch points in a phylogenetic tree adhering to topological constraints, and all sites implied by a given set of aligned, extant sequences. By disentangling the impact on ancestral sequences at each branch point, this approach identifies the minimal indel events that jointly explain the diversity in sequences mapped to the tips of that tree. MIP can recover alternate optimal indel histories, if available. We evaluated MIP for indel inference on a dataset comprising 15 real phylogenetic trees associated with protein families ranging from 165 to 2000 extant sequences, and on 60 synthetic trees at comparable scales of data and reflecting realistic rates of mutation. Across relevant metrics, MIP outperformed alternative parsimony-based approaches and reported the fewest indel events, on par or below their occurrence in synthetic datasets. MIP offers a rational justification for indel patterns in extant sequences; importantly, it uniquely identifies global optima on complex protein data sets without making unrealistic assumptions of independence or evolutionary underpinnings, promising a deeper understanding of molecular evolution and aiding novel protein design. AVAILABILITY AND IMPLEMENTATION: The implementation is available via GitHub at https://github.com/santule/indelmip.


Subjects
INDEL Mutation , Phylogeny , Evolution, Molecular , Algorithms , Computational Biology/methods
3.
Nucleic Acids Res ; 51(11): e62, 2023 06 23.
Article in English | MEDLINE | ID: mdl-37125641

ABSTRACT

Methods for cell clustering and gene expression from single-cell RNA sequencing (scRNA-seq) data are essential for biological interpretation of cell processes. Here, we present TRIAGE-Cluster which uses genome-wide epigenetic data from diverse bio-samples to identify genes demarcating cell diversity in scRNA-seq data. By integrating patterns of repressive chromatin deposited across diverse cell types with weighted density estimation, TRIAGE-Cluster determines cell type clusters in a 2D UMAP space. We then present TRIAGE-ParseR, a machine learning method which evaluates gene expression rank lists to define gene groups governing the identity and function of cell types. We demonstrate the utility of this two-step approach using atlases of in vivo and in vitro cell diversification and organogenesis. We also provide a web accessible dashboard for analysis and download of data and software. Collectively, genome-wide epigenetic repression provides a versatile strategy to define cell diversity and study gene regulation of scRNA-seq data.


Subjects
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software , Cluster Analysis , Epigenesis, Genetic , Algorithms
4.
Bioinformatics ; 39(7), 2023 07 01.
Article in English | MEDLINE | ID: mdl-37449901

ABSTRACT

MOTIVATION: Identification of cell types using single-cell RNA-seq is revolutionizing the study of multicellular organisms. However, typical single-cell RNA-seq analysis often involves post hoc manual curation to ensure clusters are transcriptionally distinct, which is time-consuming, error-prone, and irreproducible. RESULTS: To overcome these obstacles, we developed Cytocipher, a bioinformatics method and scverse compatible software package that statistically determines significant clusters. Application of Cytocipher to normal tissue, development, disease, and large-scale atlas data reveals the broad applicability and power of Cytocipher to generate biological insights in numerous contexts. This included the identification of cell types not previously described in the datasets analysed, such as CD8+ T cell subtypes in human peripheral blood mononuclear cells; cell lineage intermediate states during mouse pancreas development; and subpopulations of luminal epithelial cells over-represented in prostate cancer. Cytocipher also scales to large datasets with high-test performance, as shown by application to the Tabula Sapiens Atlas representing >480 000 cells. Cytocipher is a novel and generalizable method that statistically determines transcriptionally distinct and programmatically reproducible clusters from single-cell data. AVAILABILITY AND IMPLEMENTATION: The software version used for this manuscript has been deposited on Zenodo (https://doi.org/10.5281/zenodo.8089546), and is also available via github (https://github.com/BradBalderson/Cytocipher).


Subjects
Algorithms , Gene Expression Profiling , Animals , Mice , Humans , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Leukocytes, Mononuclear , Single-Cell Gene Expression Analysis , Single-Cell Analysis , Software
5.
Nucleic Acids Res ; 50(3): 1280-1296, 2022 02 22.
Article in English | MEDLINE | ID: mdl-35048973

ABSTRACT

A prominent aspect of most, if not all, central nervous systems (CNSs) is that anterior regions (brain) are larger than posterior ones (spinal cord). Studies in Drosophila and mouse have revealed that Polycomb Repressor Complex 2 (PRC2), a protein complex responsible for applying key repressive histone modifications, acts by several mechanisms to promote anterior CNS expansion. However, it is unclear what the full spectrum of PRC2 action is during embryonic CNS development and how PRC2 intersects with the epigenetic landscape. We removed PRC2 function from the developing mouse CNS, by mutating the key gene Eed, and generated spatio-temporal transcriptomic data. To decode the role of PRC2, we developed a method that incorporates standard statistical analyses with probabilistic deep learning to integrate the transcriptomic response to PRC2 inactivation with epigenetic data. This multi-variate analysis corroborates the central involvement of PRC2 in anterior CNS expansion, and also identifies several unanticipated cohorts of genes, such as proliferation and immune response genes. Furthermore, the analysis reveals specific profiles of regulation via PRC2 upon these gene cohorts. These findings uncover a differential logic for the role of PRC2 upon functionally distinct gene cohorts that drive CNS anterior expansion. To support the analysis of emerging multi-modal datasets, we provide a novel bioinformatics package that integrates transcriptomic and epigenetic datasets to identify regulatory underpinnings of heterogeneous biological processes.


Subjects
Central Nervous System/embryology , Polycomb Repressive Complex 2 , Animals , Embryo, Mammalian/metabolism , Histones/genetics , Histones/metabolism , Mice , Polycomb Repressive Complex 2/genetics , Polycomb Repressive Complex 2/metabolism
6.
Mol Biol Evol ; 39(6), 2022 06 02.
Article in English | MEDLINE | ID: mdl-35639613

ABSTRACT

The cytochrome P450 family 1 enzymes (CYP1s) are a diverse family of hemoprotein monooxygenases, which metabolize many xenobiotics including numerous environmental carcinogens. However, their historical function and evolution remain largely unstudied. Here we investigate CYP1 evolution via the reconstruction and characterization of the vertebrate CYP1 ancestors. Younger ancestors and extant forms generally demonstrated higher activity toward typical CYP1 xenobiotic and steroid substrates than older ancestors, suggesting significant diversification away from the original CYP1 function. Caffeine metabolism appears to be a recently evolved trait of the CYP1A subfamily, observed in the mammalian CYP1A lineage, and may parallel the recent evolution of caffeine synthesis in multiple separate plant species. Likewise, the aryl hydrocarbon receptor agonist, 6-formylindolo[3,2-b]carbazole (FICZ) was metabolized to a greater extent by certain younger ancestors and extant forms, suggesting that activity toward FICZ increased in specific CYP1 evolutionary branches, a process that may have occurred in parallel to the exploitation of land where UV-exposure was higher than in aquatic environments. As observed with previous reconstructions of P450 enzymes, thermostability correlated with evolutionary age; the oldest ancestor was up to 35 °C more thermostable than the extant forms, with a 10T50 (temperature at which 50% of the hemoprotein remains intact after 10 min) of 71 °C. This robustness may have facilitated evolutionary diversification of the CYP1s by buffering the destabilizing effects of mutations that conferred novel functions, a phenomenon which may also be useful in exploiting the catalytic versatility of these ancestral enzymes for commercial application as biocatalysts.


Subjects
Caffeine , Xenobiotics , Animals , Cytochrome P-450 CYP1A1/genetics , Cytochrome P-450 CYP1A1/metabolism , Cytochrome P-450 Enzyme System/genetics , Mammals/metabolism , Vertebrates/genetics , Vertebrates/metabolism
7.
Chemistry ; 29(9): e202203140, 2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36385513

ABSTRACT

Enzyme-catalyzed reaction cascades play an increasingly important role in the sustainable manufacture of diverse chemicals from renewable feedstocks. For instance, dehydratases from the ilvD/EDD superfamily have been embedded into a cascade to convert glucose via pyruvate to isobutanol, a platform chemical for the production of aviation fuels and other valuable materials. These dehydratases depend on the presence of both a Fe-S cluster and a divalent metal ion for their function. However, they also represent the rate-limiting step in the cascade. Here, catalytic parameters and the crystal structure of the dehydratase from Paralcaligenes ureilyticus (PuDHT, both in the presence of Mg2+ and Mn2+) were investigated. Rate measurements demonstrate that the presence of stoichiometric concentrations of Mn2+ promotes higher activity than Mg2+, but at high concentrations the former inhibits the activity of PuDHT. Molecular dynamics simulations identify the position of a second binding site for the divalent metal ion. Only binding of Mn2+ (not Mg2+) to this site affects the ligand environment of the catalytically essential divalent metal binding site, thus providing insight into an inhibitory mechanism of Mn2+ at higher concentrations. Furthermore, in silico docking identified residues that play a role in determining substrate binding and selectivity. The combined data inform engineering approaches to design an optimal dehydratase for the cascade.


Subjects
Hydro-Lyases , Amino Acid Sequence , Hydro-Lyases/chemistry , Binding Sites , Catalysis
8.
PLoS Comput Biol ; 18(10): e1010633, 2022 10.
Article in English | MEDLINE | ID: mdl-36279274

ABSTRACT

Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.


Subjects
Evolution, Molecular , INDEL Mutation , INDEL Mutation/genetics , Proteins/genetics , Biological Evolution , Phylogeny
9.
Genomics ; 113(4): 1855-1866, 2021 07.
Article in English | MEDLINE | ID: mdl-33878366

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the primary protocol for detecting genome-wide DNA-protein interactions, and therefore a key tool for understanding transcriptional regulation. A number of factors, including low specificity of antibody and cellular heterogeneity of sample, may cause "peak" callers to output noise and experimental artefacts. Statistically combining multiple experimental replicates from the same condition could significantly enhance our ability to distinguish actual transcription factor binding events, even when peak caller accuracy and consistency of detection are compromised. We adapted the rank-product test to statistically evaluate the reproducibility from any number of ChIP-seq experimental replicates. We demonstrate over a number of benchmarks that our adaptation "ChIP-R" (pronounced 'chipper') performs as well as or better than comparable approaches on recovering transcription factor binding sites in ChIP-seq peak data. We also show ChIP-R extends to evaluate ATAC-seq peaks, finding reproducible peak sets even at low sequencing depth. ChIP-R decomposes peaks across replicates into "fragments" which either form part of a peak in a replicate, or not. We show that by re-analysing existing data sets, ChIP-R reconstructs reproducible peaks from fragments with enhanced biological enrichment relative to current strategies.


Subjects
Algorithms , Chromatin Immunoprecipitation Sequencing , Binding Sites , Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing/methods , Reproducibility of Results , Sequence Analysis, DNA/methods
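The rank-product idea named in the ChIP-R abstract above can be illustrated with a minimal sketch. This is not the ChIP-R implementation: the function name `rank_products`, the toy scores, and the tie-breaking rule are assumptions made for illustration, and the significance test that the abstract describes is not reproduced here. The sketch only shows how per-replicate ranks combine into one statistic per item.

```python
from math import prod


def rank_products(scores_by_replicate):
    """Return the rank product of each item across replicates.

    scores_by_replicate: one inner list per replicate, each holding a
    score for the same items in the same order. Rank 1 is the highest
    score within a replicate. Ties are broken by item order, which is
    an arbitrary simplification for this sketch.
    """
    n_items = len(scores_by_replicate[0])
    ranks_per_item = [[] for _ in range(n_items)]
    for scores in scores_by_replicate:
        # Sort item indices by descending score, then assign ranks 1..n.
        order = sorted(range(n_items), key=lambda i: -scores[i])
        for rank, item in enumerate(order, start=1):
            ranks_per_item[item].append(rank)
    # Multiply the ranks of each item over all replicates.
    return [prod(ranks) for ranks in ranks_per_item]


# Hypothetical scores for three candidate regions in three replicates.
replicates = [
    [9.0, 2.0, 5.0],  # replicate 1
    [8.0, 1.0, 7.5],  # replicate 2
    [7.0, 3.0, 6.0],  # replicate 3
]
rp = rank_products(replicates)
```

A small rank product means the item ranked consistently near the top in every replicate, which is the intuition behind using the statistic to assess reproducibility.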
10.
Antimicrob Agents Chemother ; 65(10): e0093621, 2021 09 17.
Article in English | MEDLINE | ID: mdl-34310207

ABSTRACT

The structural diversity in metallo-β-lactamases (MBLs), especially in the vicinity of the active site, has been a major hurdle in the development of clinically effective inhibitors. Representatives from three variants of the B3 MBL subclass, containing either the canonical HHH/DHH active-site motif (present in the majority of MBLs in this subclass) or the QHH/DHH (B3-Q) or HRH/DQK (B3-RQK) variations, were reported previously. Here, we describe the structure and kinetic properties of the first example (SIE-1) of a fourth variant containing the EHH/DHH active-site motif (B3-E). SIE-1 was identified in the hexachlorocyclohexane-degrading bacterium Sphingobium indicum, and kinetic analyses demonstrate that although it is active against a wide range of antibiotics, its efficiency is lower than that of other B3 MBLs; however, it shows increased efficiency toward cephalosporins relative to other β-lactam substrates. The overall fold of SIE-1 is characteristic of the MBLs; the notable variation is observed in the Zn1 site due to the replacement of the canonical His116 by a glutamate. The unusual preference of SIE-1 for cephalosporins and its occurrence in a widespread environmental organism suggest the scope for increased MBL-mediated β-lactam resistance. Thus, it is relevant to include SIE-1 in MBL inhibitor design studies to widen the therapeutic scope of much needed antiresistance drugs.


Subjects
Sphingomonadaceae , beta-Lactamases , Anti-Bacterial Agents/pharmacology , Catalytic Domain , Glutamic Acid , Sphingomonadaceae/metabolism , beta-Lactamase Inhibitors/pharmacology , beta-Lactamases/genetics , beta-Lactamases/metabolism
11.
Bioinformatics ; 36(12): 3902-3904, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32246829

ABSTRACT

MOTIVATION: Identifying the genes regulated by a given transcription factor (TF) (its 'target genes') is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. RESULTS: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. AVAILABILITY AND IMPLEMENTATION: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Binding Sites , Chromatin Immunoprecipitation , Gene Expression Regulation , Transcription Factors/genetics , Transcription Factors/metabolism
12.
Proteins ; 88(9): 1251-1259, 2020 09.
Article in English | MEDLINE | ID: mdl-32394426

ABSTRACT

Ancestral sequence reconstruction has had recent success in decoding the origins and the determinants of complex protein functions. However, phylogenetic analyses of remote homologues must handle extreme amino acid sequence diversity resulting from extended periods of evolutionary change. We exploited the wealth of protein structures to develop an evolutionary model based on protein secondary structure. The approach follows the differences between discrete secondary structure states observed in modern proteins and those hypothesized in their immediate ancestors. We implemented maximum likelihood-based phylogenetic inference to reconstruct ancestral secondary structure. The predictive accuracy from the use of the evolutionary model surpasses that of comparative modeling and sequence-based prediction; the reconstruction extracts information not available from modern structures or the ancestral sequences alone. Based on a phylogenetic analysis of a sequence-diverse protein family, we showed that the model can highlight relationships that are evolutionarily rooted in structure and not evident in amino acid-based analysis.


Subjects
Adaptor Proteins, Vesicular Transport/chemistry , Bacterial Proteins/chemistry , Evolution, Molecular , Models, Statistical , Adaptor Proteins, Vesicular Transport/history , Animals , Bacteria/chemistry , Bacteria/classification , Bacteria/metabolism , Bacterial Proteins/history , Computer Simulation , History, 21st Century , History, Ancient , Humans , Mammals/classification , Mammals/metabolism , Phylogeny , Plants/chemistry , Plants/classification , Plants/metabolism , Protein Structure, Secondary
13.
Cerebellum ; 19(1): 89-101, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31838646

ABSTRACT

Transcriptional regulation plays a central role in controlling neural stem and progenitor cell proliferation and differentiation during neurogenesis. For instance, transcription factors from the nuclear factor I (NFI) family have been shown to co-ordinate neural stem and progenitor cell differentiation within multiple regions of the embryonic nervous system, including the neocortex, hippocampus, spinal cord and cerebellum. Knockout of individual Nfi genes culminates in similar phenotypes, suggestive of common target genes for these transcription factors. However, whether or not the NFI family regulates common suites of genes remains poorly defined. Here, we use granule neuron precursors (GNPs) of the postnatal murine cerebellum as a model system to analyse regulatory targets of three members of the NFI family: NFIA, NFIB and NFIX. By integrating transcriptomic profiling (RNA-seq) of Nfia- and Nfix-deficient GNPs with epigenomic profiling (ChIP-seq against NFIA, NFIB and NFIX, and DNase I hypersensitivity assays), we reveal that these transcription factors share a large set of potential transcriptional targets, suggestive of complementary roles for these NFI family members in promoting neural development.


Subjects
Cerebellum/growth & development , Cerebellum/metabolism , NFI Transcription Factors/metabolism , Animals , Animals, Newborn , Cerebellum/cytology , Chromatin Immunoprecipitation Sequencing/methods , Female , Male , Mice , Mice, Inbred C57BL , Mice, Knockout , NFI Transcription Factors/genetics , Neurogenesis/physiology , Pregnancy
14.
Nucleic Acids Res ; 46(D1): D503-D508, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106588

ABSTRACT

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/.


Subjects
Active Transport, Cell Nucleus/genetics , Databases, Genetic , Molecular Sequence Annotation , Nuclear Export Signals/genetics , Nuclear Localization Signals/chemistry , User-Computer Interface , Amino Acid Sequence , Animals , Arabidopsis/genetics , Arabidopsis/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Cell Nucleus/metabolism , Datasets as Topic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/metabolism , Humans , Internet , Mice , Nuclear Localization Signals/genetics , Nuclear Localization Signals/metabolism , Oryza/genetics , Oryza/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism
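The specificity filter (step 1) and the coverage figure quoted in the NLSdb abstract above can be sketched in a few lines. This is an illustrative toy, not the NLSdb protocol: the motifs are written as regular expressions by assumption, and the function names, motifs, and sequences are all hypothetical. Step 2 (broadening motifs to raise sensitivity) and the iteration between the two steps are not shown.

```python
import re


def filter_specific(motifs, non_nuclear):
    """Keep only motifs that match none of the non-nuclear sequences."""
    return [
        m for m in motifs
        if not any(re.search(m, seq) for seq in non_nuclear)
    ]


def coverage(motifs, nuclear):
    """Fraction of nuclear sequences matched by at least one motif."""
    hits = sum(
        1 for seq in nuclear
        if any(re.search(m, seq) for m in motifs)
    )
    return hits / len(nuclear)


# Toy data (hypothetical): two candidate motifs and a few short sequences.
motifs = ["KKKRK", "K[KR].K"]
nuclear = ["MAPKKKRKVED", "MSTKRAKLL", "MGGGSSS"]
non_nuclear = ["MKKLKAAA", "MLLLVVV"]

kept = filter_specific(motifs, non_nuclear)  # drops motifs hitting non-nuclear
cov = coverage(kept, nuclear)                # share of nuclear proteins matched
```

In this toy set the looser motif is discarded because it matches a non-nuclear sequence, so coverage is computed from the stricter motif alone. That trade-off between specificity and sensitivity is what the two alternating steps in the abstract are balancing.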
15.
BMC Bioinformatics ; 20(1): 727, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31861997

ABSTRACT

Following publication of the original article [1], the author reported that an incorrect figure has been published as Figure 2. The correct Figure 2 is shown below.

16.
BMC Bioinformatics ; 20(1): 205, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31014229

ABSTRACT

BACKGROUND: Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important to understand the interior nuclear mechanisms. Despite advances in high-throughput methods, experimental protein annotations remain limited. Predictions of cellular compartments have become very accurate, largely at the expense of leaving out substructures inside the nucleus, making a fine-grained analysis impossible. RESULTS: Here, we present a new method (LocNuclei) that predicts nuclear substructures from sequence alone. LocNuclei used a string-based Profile Kernel with Support Vector Machines (SVMs). It distinguishes sub-nuclear localization in 13 distinct substructures and distinguishes between nuclear proteins confined to the nucleus and those that are also native to other compartments (traveler proteins). High performance was achieved by implicitly leveraging a large biological knowledge-base in creating predictions by homology-based inference through BLAST. Using this approach, the performance reached AUC = 0.70-0.74 and Q13 = 59-65%. Travelling proteins (nucleus and other) were identified at Q2 = 70-74%. A Gene Ontology (GO) analysis of the enrichment of biological processes revealed that the predicted sub-nuclear compartments matched the expected functionality. Analysis of protein-protein interactions (PPI) shows that formation of compartments and functionality of proteins in these compartments rely heavily on interactions between proteins. This suggested that the LocNuclei predictions carry important information about function. The source code and data sets are available through GitHub: https://github.com/Rostlab/LocNuclei. CONCLUSIONS: LocNuclei predicts subnuclear compartments and traveler proteins accurately. These predictions carry important information about functionality and PPIs.


Subjects
Cell Nucleus/chemistry , Computational Biology/methods , Nuclear Proteins , Sequence Analysis, Protein/methods , Nuclear Proteins/chemistry , Nuclear Proteins/classification , Nuclear Proteins/physiology , Proteins/chemistry , Proteins/classification , Proteins/physiology , Support Vector Machine
17.
Bioinformatics ; 34(15): 2670-2672, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29554210

ABSTRACT

Summary: Small RNAs play key roles in gene regulation, defense against viral pathogens and maintenance of genome stability, though many aspects of their biogenesis and function remain to be elucidated. SCRAM (Small Complementary RNA Mapper) is a novel, simple-to-use short read aligner and visualization suite that enhances exploration of small RNA datasets. Availability and implementation: The SCRAM pipeline is implemented in Go and Python, and is freely available under MIT license. Source code, multiplatform binaries and a Docker image can be accessed via https://sfletc.github.io/scram/. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Sequence Analysis, RNA/methods , Software
18.
Nucleic Acids Res ; 45(4): e19, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28204599

ABSTRACT

Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF.


Subjects
Chromatin Immunoprecipitation/methods , Regulatory Elements, Transcriptional , Sequence Analysis, DNA/methods , Software , Transcription Factors/metabolism , Algorithms , Binding Sites , Histone Code , Promoter Regions, Genetic , Transcription Initiation Site
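The core signal described in the CisMapper abstract above, a correlation across tissues between a histone mark at a bound site and the expression of each candidate gene, can be sketched as follows. This is a hedged illustration only: the use of plain Pearson correlation, the gene names, and the toy values are assumptions, and the actual CisMapper scoring and statistics are not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical histone-mark signal at one TF-bound site in four tissues.
histone_at_site = [1.0, 2.0, 3.0, 4.0]

# Hypothetical expression of two candidate genes in the same four tissues.
expression = {
    "geneA": [2.0, 4.0, 6.0, 8.0],
    "geneB": [5.0, 1.0, 4.0, 2.0],
}

# Score each candidate gene by its correlation with the histone signal.
scores = {gene: pearson(histone_at_site, expr)
          for gene, expr in expression.items()}
best = max(scores, key=scores.get)
```

Because the score depends on co-variation across tissues rather than on genomic distance, a distal gene can outrank a nearby one, which is the advantage the abstract claims for long-range interactions.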
19.
Bioinformatics ; 33(12): 1773-1781, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28186228

ABSTRACT

MOTIVATION: Genome-wide association studies are identifying single nucleotide variants (SNVs) linked to various diseases; however, the functional effect caused by these variants is often unknown. One potential functional effect, the loss or gain of protein phosphorylation sites, can be induced through variations in key amino acids that disrupt or introduce valid kinase binding patterns. Current methods for predicting the effect of SNVs on phosphorylation operate on the sequence content of reference and variant proteins. However, consideration of the amino acid sequence alone is insufficient for predicting phosphorylation change, as context factors determine kinase-substrate selection. RESULTS: We present here a method for quantifying the effect of SNVs on protein phosphorylation through an integrated system of motif analysis and context-based assessment of kinase targets. By predicting the effect that known variants across the proteome have on phosphorylation, we are able to use this background of proteome-wide variant effects to quantify the significance of novel variants for modifying phosphorylation. We validate our method on a manually curated set of phosphorylation change-causing variants from the primary literature, showing that the method predicts known examples of phosphorylation change at high levels of specificity. We apply our approach to data-sets of variants in phosphorylation site regions, showing that variants causing predicted phosphorylation loss are over-represented among disease-associated variants. AVAILABILITY AND IMPLEMENTATION: The method is freely available as a web-service at the website http://bioinf.scmb.uq.edu.au/phosphopick/snp. CONTACT: m.boden@uq.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Phosphorylation , Phosphotransferases/metabolism , Polymorphism, Single Nucleotide , Protein Processing, Post-Translational/genetics , Software , Amino Acid Sequence , Humans , Protein Binding
20.
J Chem Inf Model ; 58(3): 630-640, 2018 03 26.
Article in English | MEDLINE | ID: mdl-29424533

ABSTRACT

Molecular dynamics simulations and free energy calculations have been used to investigate the effect of ligand binding on the enantioselectivity of an epoxide hydrolase (EH) from Aspergillus niger. Despite sharing a common mechanism, a wide range of alternative mechanisms have been proposed to explain the origin of enantiomeric selectivity in EHs. By comparing the interactions of (R)- and (S)-glycidyl phenyl ether (GPE) with both the wild type (WT, E = 3) and a mutant showing enhanced enantioselectivity to GPE (LW202, E = 193), we have examined whether enantioselectivity is due to differences in the binding pose, the affinity for the (R)- or (S)-enantiomers, or a kinetic effect. The two enantiomers were easily accommodated within the binding pockets of the WT enzyme and LW202. Free energy calculations suggested that neither enzyme had a preference for a given enantiomer. The two substrates sampled a wide variety of conformations in the simulations with the sterically hindered and unhindered carbon atoms of the GPE epoxide ring both coming in close proximity to the nucleophilic aspartic acid residue. This suggests that alternative pathways could lead to the formation of a (S)- and (R)-diol product. Together, the calculations suggest that the enantioselectivity is due to kinetic rather than thermodynamic effects and that the assumption that one substrate results in one product when interpreting the available experimental data and deriving E-values may be inappropriate in the case of EHs.


Subjects
Aspergillus niger/enzymology , Epoxide Hydrolases/metabolism , Phenyl Ethers/metabolism , Aspergillus niger/chemistry , Aspergillus niger/metabolism , Epoxide Hydrolases/chemistry , Kinetics , Molecular Docking Simulation , Molecular Dynamics Simulation , Phenyl Ethers/chemistry , Protein Binding , Stereoisomerism , Substrate Specificity , Thermodynamics