1.
Bioinformatics ; 40(Supplement_1): i277-i286, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940131

ABSTRACT

MOTIVATION: Insertions and deletions (indels) influence the genetic code in fundamentally distinct ways from substitutions, significantly impacting gene product structure and function. Despite their influence, the evolutionary history of indels is often neglected in phylogenetic tree inference and ancestral sequence reconstruction, hindering efforts to comprehend biological diversity determinants and engineer variants for medical and industrial applications. RESULTS: We frame determining the optimal history of indel events as a single Mixed-Integer Programming (MIP) problem, across all branch points in a phylogenetic tree adhering to topological constraints, and all sites implied by a given set of aligned, extant sequences. By disentangling the impact on ancestral sequences at each branch point, this approach identifies the minimal indel events that jointly explain the diversity in sequences mapped to the tips of that tree. MIP can recover alternate optimal indel histories, if available. We evaluated MIP for indel inference on a dataset comprising 15 real phylogenetic trees associated with protein families ranging from 165 to 2000 extant sequences, and on 60 synthetic trees at comparable scales of data and reflecting realistic rates of mutation. Across relevant metrics, MIP outperformed alternative parsimony-based approaches and reported the fewest indel events, on par or below their occurrence in synthetic datasets. MIP offers a rational justification for indel patterns in extant sequences; importantly, it uniquely identifies global optima on complex protein data sets without making unrealistic assumptions of independence or evolutionary underpinnings, promising a deeper understanding of molecular evolution and aiding novel protein design. AVAILABILITY AND IMPLEMENTATION: The implementation is available via GitHub at https://github.com/santule/indelmip.
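
The objective described above (the fewest indel events that jointly explain the extant sequences, with possible alternate optima) can be illustrated on a toy case. The sketch below is not the authors' MIP formulation: it assumes a single alignment column, a hypothetical four-leaf tree, and brute-force enumeration in place of a solver. All node names and leaf states are invented for illustration.

```python
from itertools import product

# Hypothetical toy tree (child -> parent). Leaves carry observed states at
# one alignment column: 1 = residue present, 0 = gap.
PARENT = {"A": "N1", "B": "N1", "C": "N2", "D": "N2", "N1": "root", "N2": "root"}
LEAF_STATE = {"A": 1, "B": 1, "C": 0, "D": 0}
INTERNAL = ["N1", "N2", "root"]

def min_indel_changes(parent, leaf_state, internal):
    """Try every 0/1 assignment to the ancestors and keep the assignments
    that minimise the number of branches whose two ends differ."""
    best_cost, best_assignments = None, []
    for states in product((0, 1), repeat=len(internal)):
        ancestors = dict(zip(internal, states))
        state = dict(leaf_state, **ancestors)
        cost = sum(state[child] != state[par] for child, par in parent.items())
        if best_cost is None or cost < best_cost:
            best_cost, best_assignments = cost, [ancestors]
        elif cost == best_cost:
            best_assignments.append(ancestors)
    return best_cost, best_assignments

cost, optima = min_indel_changes(PARENT, LEAF_STATE, INTERNAL)
print(cost, optima)
```

In this toy case one change suffices, and the root may be either state, so two equally parsimonious histories exist. Exhaustive search grows exponentially with the number of ancestors, which is why a real problem of this kind needs an optimisation formulation rather than enumeration.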


Subjects
INDEL Mutation, Phylogeny, Molecular Evolution, Algorithms, Computational Biology/methods
2.
Brief Funct Genomics ; 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38183207

ABSTRACT

Metastatic melanoma originates from melanocytes of the skin. Melanoma metastasis results in poor treatment prognosis for patients and is associated with epigenetic and transcriptional changes that reflect the developmental program of melanocyte differentiation from neural crest stem cells. Several studies have explored melanoma transcriptional heterogeneity using microarray, bulk and single-cell RNA-sequencing technologies to derive data-driven models of the transcriptional-state change which occurs during melanoma progression. No study has systematically examined how different models of melanoma progression derived from different data types, technologies and biological conditions compare. Here, we perform a cross-sectional study to identify averaging effects of bulk-based studies that mask and distort apparent melanoma transcriptional heterogeneity; we describe new transcriptionally distinct melanoma cell states, identify differential co-expression of genes between studies and examine the effects of predicted drug susceptibilities of different cell states between studies. Importantly, we observe considerable variability in drug-target gene expression between studies, indicating potential transcriptional plasticity of melanoma to down-regulate these drug targets and thereby circumvent treatment. Overall, observed differences in gene co-expression and predicted drug susceptibility between studies suggest bulk-based transcriptional measurements do not reliably gauge heterogeneity and that melanoma transcriptional plasticity is greater than described when studies are considered in isolation.

3.
ChemSusChem ; 17(4): e202301132, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-37872118

ABSTRACT

Anthropogenic climate change has been caused by over-exploitation of fossil fuels and CO2 emissions. To counteract this, the chemical industry has shifted its focus to sustainable chemical production and the valorization of renewable resources. However, the biggest challenges in biomanufacturing are technical efficiency and profitability. In our minimal cell-free enzyme cascade generating pyruvate as the central intermediate, the NAD+-dependent, selective oxidation of D-glyceraldehyde was identified as a key reaction step to improve the overall cascade flux. Successive genome mining identified one candidate enzyme with 24-fold enhanced activity and another whose stability is unaffected in 10 % (v/v) ethanol, the final product of our model cascade. Semi-rational engineering improved the substrate selectivity of the enzyme up to 21-fold, thus minimizing side reactions in the one-pot enzyme cascade. The final biotransformation of D-glucose showed a continuous linear production of ethanol (via pyruvate) to a final titer of 4.9 % (v/v) with a molar product yield of 98.7 %. Due to the central role of pyruvate in diverse biotransformations, the optimized production module has great potential for broad biomanufacturing applications.
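
To put the reported titer in molar terms, a volume fraction can be converted to an approximate concentration. The sketch below assumes the density of pure ethanol at about 20 °C (0.789 g/mL) and its molar mass (46.07 g/mol), and it ignores the non-ideal volume change on mixing. The function name is hypothetical and the calculation is not taken from the paper.

```python
# Physical constants for pure ethanol (assumed conditions: about 20 degrees C).
ETHANOL_DENSITY_G_PER_ML = 0.789
ETHANOL_MOLAR_MASS_G_PER_MOL = 46.07

def ethanol_molarity_from_percent_vv(percent_vv):
    """Approximate mol/L of ethanol for a given % (v/v) titer."""
    ml_per_litre = percent_vv * 10.0          # 1 % (v/v) = 10 mL per litre
    grams_per_litre = ml_per_litre * ETHANOL_DENSITY_G_PER_ML
    return grams_per_litre / ETHANOL_MOLAR_MASS_G_PER_MOL

molarity = ethanol_molarity_from_percent_vv(4.9)
print(f"{molarity:.3f} mol/L")
```

Under these assumptions, 4.9 % (v/v) corresponds to roughly 0.84 mol/L of ethanol.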


Subjects
Glyceraldehyde, NAD, Glyceraldehyde/metabolism, NAD/metabolism, Pyruvic Acid, Ethanol, Oxidoreductases
4.
Bioinformatics ; 39(7)2023 07 01.
Article in English | MEDLINE | ID: mdl-37449901

ABSTRACT

MOTIVATION: Identification of cell types using single-cell RNA-seq is revolutionizing the study of multicellular organisms. However, typical single-cell RNA-seq analysis often involves post hoc manual curation to ensure clusters are transcriptionally distinct, which is time-consuming, error-prone, and irreproducible. RESULTS: To overcome these obstacles, we developed Cytocipher, a bioinformatics method and scverse compatible software package that statistically determines significant clusters. Application of Cytocipher to normal tissue, development, disease, and large-scale atlas data reveals the broad applicability and power of Cytocipher to generate biological insights in numerous contexts. This included the identification of cell types not previously described in the datasets analysed, such as CD8+ T cell subtypes in human peripheral blood mononuclear cells; cell lineage intermediate states during mouse pancreas development; and subpopulations of luminal epithelial cells over-represented in prostate cancer. Cytocipher also scales to large datasets with high test performance, as shown by application to the Tabula Sapiens Atlas representing >480 000 cells. Cytocipher is a novel and generalizable method that statistically determines transcriptionally distinct and programmatically reproducible clusters from single-cell data. AVAILABILITY AND IMPLEMENTATION: The software version used for this manuscript has been deposited on Zenodo (https://doi.org/10.5281/zenodo.8089546), and is also available via GitHub (https://github.com/BradBalderson/Cytocipher).


Subjects
Algorithms, Gene Expression Profiling, Animals, Mice, Humans, RNA Sequence Analysis/methods, Gene Expression Profiling/methods, Mononuclear Leukocytes, Single-Cell Gene Expression Analysis, Single-Cell Analysis, Software
5.
Nucleic Acids Res ; 51(11): e62, 2023 06 23.
Article in English | MEDLINE | ID: mdl-37125641

ABSTRACT

Methods for cell clustering and gene expression from single-cell RNA sequencing (scRNA-seq) data are essential for biological interpretation of cell processes. Here, we present TRIAGE-Cluster which uses genome-wide epigenetic data from diverse bio-samples to identify genes demarcating cell diversity in scRNA-seq data. By integrating patterns of repressive chromatin deposited across diverse cell types with weighted density estimation, TRIAGE-Cluster determines cell type clusters in a 2D UMAP space. We then present TRIAGE-ParseR, a machine learning method which evaluates gene expression rank lists to define gene groups governing the identity and function of cell types. We demonstrate the utility of this two-step approach using atlases of in vivo and in vitro cell diversification and organogenesis. We also provide a web accessible dashboard for analysis and download of data and software. Collectively, genome-wide epigenetic repression provides a versatile strategy to define cell diversity and study gene regulation of scRNA-seq data.


Subjects
Gene Expression Profiling, Single-Cell Analysis, Gene Expression Profiling/methods, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Software, Cluster Analysis, Genetic Epigenesis, Algorithms
6.
Biochem Biophys Rep ; 33: 101420, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36654922

ABSTRACT

Epigenetic repression has been linked to the regulation of different cell states. In this study, we focus on the influence of this repression, mainly by H3K27me3, over gene expression in muscle cells, which may affect mineral content, a phenotype that is relevant to muscle function and beef quality. Based on the inverse relationship between H3K27me3 and gene expression (i.e., epigenetic repression) and on contrasting sample groups, we computationally predicted regulatory genes that affect muscle mineral content. To this end, we applied the TRIAGE predictive method followed by a rank product analysis. This methodology can predict regulatory genes that might be affected by repressive epigenetic regulation related to mineral concentration. Annotation of orthologous genes, between human and bovine, enabled our investigation of gene expression in the Longissimus thoracis muscle of Bos indicus cattle. The animals under study had a contrasting mineral content in their muscle cells. We identified candidate regulatory genes influenced by repressive epigenetic mechanisms, linking histone modification to mineral content in beef samples. The discovered candidate genes take part in multiple biological pathways, i.e., impulse transmission, cell signalling, immunological, and developmental pathways. Some of these genes were previously associated with mineral content or regulatory mechanisms. Our findings indicate that epigenetic repression can partially explain the gene expression profiles observed in muscle samples with contrasting mineral content through the candidate regulators here identified.

7.
Chemistry ; 29(9): e202203140, 2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36385513

ABSTRACT

Enzyme-catalyzed reaction cascades play an increasingly important role for the sustainable manufacture of diverse chemicals from renewable feedstocks. For instance, dehydratases from the ilvD/EDD superfamily have been embedded into a cascade to convert glucose via pyruvate to isobutanol, a platform chemical for the production of aviation fuels and other valuable materials. These dehydratases depend on the presence of both a Fe-S cluster and a divalent metal ion for their function. However, they also represent the rate-limiting step in the cascade. Here, catalytic parameters and the crystal structure of the dehydratase from Paralcaligenes ureilyticus (PuDHT, in the presence of both Mg2+ and Mn2+) were investigated. Rate measurements demonstrate that the presence of stoichiometric concentrations of Mn2+ promotes higher activity than Mg2+, but at high concentrations the former inhibits the activity of PuDHT. Molecular dynamics simulations identify the position of a second binding site for the divalent metal ion. Only binding of Mn2+ (not Mg2+) to this site affects the ligand environment of the catalytically essential divalent metal binding site, thus providing insight into an inhibitory mechanism of Mn2+ at higher concentrations. Furthermore, in silico docking identified residues that play a role in determining substrate binding and selectivity. The combined data inform engineering approaches to design an optimal dehydratase for the cascade.


Subjects
Hydro-Lyases, Amino Acid Sequence, Hydro-Lyases/chemistry, Binding Sites, Catalysis
8.
PLoS Comput Biol ; 18(10): e1010633, 2022 10.
Article in English | MEDLINE | ID: mdl-36279274

ABSTRACT

Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.


Subjects
Molecular Evolution, INDEL Mutation, INDEL Mutation/genetics, Proteins/genetics, Biological Evolution, Phylogeny
9.
Mol Biol Evol ; 39(6)2022 06 02.
Article in English | MEDLINE | ID: mdl-35639613

ABSTRACT

The cytochrome P450 family 1 enzymes (CYP1s) are a diverse family of hemoprotein monooxygenases, which metabolize many xenobiotics including numerous environmental carcinogens. However, their historical function and evolution remain largely unstudied. Here we investigate CYP1 evolution via the reconstruction and characterization of the vertebrate CYP1 ancestors. Younger ancestors and extant forms generally demonstrated higher activity toward typical CYP1 xenobiotic and steroid substrates than older ancestors, suggesting significant diversification away from the original CYP1 function. Caffeine metabolism appears to be a recently evolved trait of the CYP1A subfamily, observed in the mammalian CYP1A lineage, and may parallel the recent evolution of caffeine synthesis in multiple separate plant species. Likewise, the aryl hydrocarbon receptor agonist, 6-formylindolo[3,2-b]carbazole (FICZ) was metabolized to a greater extent by certain younger ancestors and extant forms, suggesting that activity toward FICZ increased in specific CYP1 evolutionary branches, a process that may have occurred in parallel to the exploitation of land where UV-exposure was higher than in aquatic environments. As observed with previous reconstructions of P450 enzymes, thermostability correlated with evolutionary age; the oldest ancestor was up to 35 °C more thermostable than the extant forms, with a 10T50 (temperature at which 50% of the hemoprotein remains intact after 10 min) of 71 °C. This robustness may have facilitated evolutionary diversification of the CYP1s by buffering the destabilizing effects of mutations that conferred novel functions, a phenomenon which may also be useful in exploiting the catalytic versatility of these ancestral enzymes for commercial application as biocatalysts.


Subjects
Caffeine, Xenobiotics, Animals, Cytochrome P-450 CYP1A1/genetics, Cytochrome P-450 CYP1A1/metabolism, Cytochrome P-450 Enzyme System/genetics, Mammals/metabolism, Vertebrates/genetics, Vertebrates/metabolism
10.
Development ; 149(5)2022 03 01.
Article in English | MEDLINE | ID: mdl-35245348

ABSTRACT

The hypothalamus displays staggering cellular diversity, chiefly established during embryogenesis by the interplay of several signalling pathways and a battery of transcription factors. However, the contribution of epigenetic cues to hypothalamus development remains unclear. We mutated the polycomb repressor complex 2 gene Eed in the developing mouse hypothalamus, which resulted in the loss of H3K27me3, a fundamental epigenetic repressor mark. This triggered ectopic expression of posteriorly expressed regulators (e.g. Hox homeotic genes), upregulation of cell cycle inhibitors and reduced proliferation. Surprisingly, despite these effects, single cell transcriptomic analysis revealed that most neuronal subtypes were still generated in Eed mutants. However, we observed an increase in glutamatergic/GABAergic double-positive cells, as well as loss/reduction of dopamine, hypocretin and Tac2-Pax6 neurons. These findings indicate that many aspects of the hypothalamic gene regulatory flow can proceed without the key H3K27me3 epigenetic repressor mark, but point to a unique sensitivity of particular neuronal subtypes to a disrupted epigenomic landscape.


Subjects
Embryonic Development/physiology, Hypothalamus/physiology, Neurons/physiology, Polycomb Repressive Complex 2/genetics, Polycomb-Group Proteins/genetics, Animals, Cell Proliferation/genetics, Epigenetic Repression/genetics, Female, Male, Mice, Mutation/genetics, Transcriptome/genetics
11.
Nucleic Acids Res ; 50(3): 1280-1296, 2022 02 22.
Article in English | MEDLINE | ID: mdl-35048973

ABSTRACT

A prominent aspect of most, if not all, central nervous systems (CNSs) is that anterior regions (brain) are larger than posterior ones (spinal cord). Studies in Drosophila and mouse have revealed that Polycomb Repressor Complex 2 (PRC2), a protein complex responsible for applying key repressive histone modifications, acts by several mechanisms to promote anterior CNS expansion. However, it is unclear what the full spectrum of PRC2 action is during embryonic CNS development and how PRC2 intersects with the epigenetic landscape. We removed PRC2 function from the developing mouse CNS, by mutating the key gene Eed, and generated spatio-temporal transcriptomic data. To decode the role of PRC2, we developed a method that incorporates standard statistical analyses with probabilistic deep learning to integrate the transcriptomic response to PRC2 inactivation with epigenetic data. This multi-variate analysis corroborates the central involvement of PRC2 in anterior CNS expansion, and also identifies several unanticipated cohorts of genes, such as proliferation and immune response genes. Furthermore, the analysis reveals specific profiles of regulation via PRC2 upon these gene cohorts. These findings uncover a differential logic for the role of PRC2 upon functionally distinct gene cohorts that drive CNS anterior expansion. To support the analysis of emerging multi-modal datasets, we provide a novel bioinformatics package that integrates transcriptomic and epigenetic datasets to identify regulatory underpinnings of heterogeneous biological processes.


Subjects
Central Nervous System/embryology, Polycomb Repressive Complex 2, Animals, Mammalian Embryo/metabolism, Histones/genetics, Histones/metabolism, Mice, Polycomb Repressive Complex 2/genetics, Polycomb Repressive Complex 2/metabolism
12.
Methods Mol Biol ; 2397: 85-110, 2022.
Article in English | MEDLINE | ID: mdl-34813061

ABSTRACT

Analyzing the natural evolution of proteins by ancestral sequence reconstruction (ASR) can provide valuable information about the changes in sequence and structure that drive the development of novel protein functions. However, ASR has also been used as a protein engineering tool, as it often generates thermostable proteins which can serve as robust and evolvable templates for enzyme engineering. Importantly, ASR has the potential to provide an insight into the history of insertions and deletions that have occurred in the evolution of a protein family. Indels are strongly associated with functional change during enzyme evolution and represent a largely unexplored source of genetic diversity for designing proteins with novel or improved properties. Current ASR methods differ in the way they handle indels; inclusion or exclusion of indels is often managed subjectively, based on assumptions the user makes about the likelihood of each recombination event, yet most currently available ASR tools provide limited, if any, opportunities for evaluating indel placement in a reconstructed sequence. Graphical Representation of Ancestral Sequence Predictions (GRASP) is an ASR tool that maps indel evolution throughout a reconstruction and enables the evaluation of indel variants. This chapter provides a general protocol for performing a reconstruction using GRASP and using the results to create indel variants. The method addresses protein template selection, sequence curation, alignment refinement, tree building, ancestor reconstruction, evaluation of indel variants and approaches to library development.


Subjects
INDEL Mutation, Molecular Evolution, Phylogeny, Probability, Proteins/genetics
13.
Antimicrob Agents Chemother ; 65(10): e0093621, 2021 09 17.
Article in English | MEDLINE | ID: mdl-34310207

ABSTRACT

The structural diversity in metallo-β-lactamases (MBLs), especially in the vicinity of the active site, has been a major hurdle in the development of clinically effective inhibitors. Representatives from three variants of the B3 MBL subclass, containing either the canonical HHH/DHH active-site motif (present in the majority of MBLs in this subclass) or the QHH/DHH (B3-Q) or HRH/DQK (B3-RQK) variations, were reported previously. Here, we describe the structure and kinetic properties of the first example (SIE-1) of a fourth variant containing the EHH/DHH active-site motif (B3-E). SIE-1 was identified in the hexachlorocyclohexane-degrading bacterium Sphingobium indicum, and kinetic analyses demonstrate that although it is active against a wide range of antibiotics, its efficiency is lower than that of other B3 MBLs, although it is more efficient toward cephalosporins than toward other β-lactam substrates. The overall fold of SIE-1 is characteristic of the MBLs; the notable variation is observed in the Zn1 site due to the replacement of the canonical His116 by a glutamate. The unusual preference of SIE-1 for cephalosporins and its occurrence in a widespread environmental organism suggest the scope for increased MBL-mediated β-lactam resistance. Thus, it is relevant to include SIE-1 in MBL inhibitor design studies to widen the therapeutic scope of much needed antiresistance drugs.


Subjects
Sphingomonadaceae, beta-Lactamases, Anti-Bacterial Agents/pharmacology, Catalytic Domain, Glutamic Acid, Sphingomonadaceae/metabolism, beta-Lactamase Inhibitors/pharmacology, beta-Lactamases/genetics, beta-Lactamases/metabolism
14.
Nat Commun ; 12(1): 2678, 2021 05 11.
Article in English | MEDLINE | ID: mdl-33976153

ABSTRACT

Intellectual disability (ID) and autism spectrum disorder (ASD) are the most common neurodevelopmental disorders and are characterized by substantial impairment in intellectual and adaptive functioning, with their genetic and molecular basis remaining largely unknown. Here, we identify biallelic variants in the gene encoding one of the Elongator complex subunits, ELP2, in patients with ID and ASD. Modelling the variants in mice recapitulates the patient features, with brain imaging and tractography analysis revealing microcephaly, loss of white matter tract integrity and an aberrant functional connectome. We show that the Elp2 mutations negatively impact the activity of the complex and its function in translation via tRNA modification. Further, we elucidate that the mutations perturb protein homeostasis leading to impaired neurogenesis, myelin loss and neurodegeneration. Collectively, our data demonstrate an unexpected role for tRNA modification in the pathogenesis of monogenic ID and ASD and define Elp2 as a key regulator of brain development.


Subjects
Autism Spectrum Disorder/genetics, Intellectual Disability/genetics, Intracellular Signaling Peptides and Proteins/genetics, Mutation, Neurodevelopmental Disorders/genetics, Transcriptome/genetics, Animals, Autism Spectrum Disorder/metabolism, Autism Spectrum Disorder/physiopathology, Animal Disease Models, Genetic Epigenesis, Animal Grooming/physiology, Humans, Intellectual Disability/metabolism, Intellectual Disability/physiopathology, Intracellular Signaling Peptides and Proteins/metabolism, Inbred C57BL Mice, Inbred DBA Mice, Knockout Mice, Neurodevelopmental Disorders/metabolism, Neurodevelopmental Disorders/physiopathology, Phenotype, Sf9 Cells, Spodoptera
15.
Genomics ; 113(4): 1855-1866, 2021 07.
Article in English | MEDLINE | ID: mdl-33878366

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the primary protocol for detecting genome-wide DNA-protein interactions, and therefore a key tool for understanding transcriptional regulation. A number of factors, including low specificity of antibody and cellular heterogeneity of sample, may cause "peak" callers to output noise and experimental artefacts. Statistically combining multiple experimental replicates from the same condition could significantly enhance our ability to distinguish actual transcription factor binding events, even when peak caller accuracy and consistency of detection are compromised. We adapted the rank-product test to statistically evaluate the reproducibility from any number of ChIP-seq experimental replicates. We demonstrate over a number of benchmarks that our adaptation "ChIP-R" (pronounced 'chipper') performs as well as or better than comparable approaches on recovering transcription factor binding sites in ChIP-seq peak data. We also show ChIP-R extends to evaluate ATAC-seq peaks, finding reproducible peak sets even at low sequencing depth. ChIP-R decomposes peaks across replicates into "fragments" which either form part of a peak in a replicate, or not. We show that by re-analysing existing data sets, ChIP-R reconstructs reproducible peaks from fragments with enhanced biological enrichment relative to current strategies.
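
The rank-product idea mentioned above can be sketched in a few lines. This is a generic rank-product statistic, not the ChIP-R implementation: the decomposition of peaks into fragments and the significance calculation are omitted, and the function name and example scores are hypothetical.

```python
import math

def rank_product(scores_by_replicate):
    """Geometric mean of per-replicate ranks for each item.

    scores_by_replicate: list of equal-length score lists, one per replicate.
    Rank 1 is the strongest score within a replicate, so a small rank
    product means the item ranks consistently highly across replicates.
    """
    n_items = len(scores_by_replicate[0])
    ranks_per_replicate = []
    for scores in scores_by_replicate:
        order = sorted(range(n_items), key=lambda i: -scores[i])
        ranks = [0] * n_items
        for rank, item in enumerate(order, start=1):
            ranks[item] = rank
        ranks_per_replicate.append(ranks)
    k = len(scores_by_replicate)
    return [
        math.prod(ranks[i] for ranks in ranks_per_replicate) ** (1.0 / k)
        for i in range(n_items)
    ]

# Invented signal values for three candidate regions in three replicates.
replicates = [[9.0, 2.0, 5.0], [8.0, 1.0, 6.0], [7.0, 3.0, 2.5]]
rp = rank_product(replicates)
print(rp)
```

The first region ranks top in every replicate and so receives the smallest rank product. Regions that score well in only some replicates are pushed down, which is the property that makes the statistic useful for assessing reproducibility.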


Subjects
Algorithms, Chromatin Immunoprecipitation Sequencing, Binding Sites, Chromatin Immunoprecipitation/methods, High-Throughput Nucleotide Sequencing/methods, Reproducibility of Results, DNA Sequence Analysis/methods
16.
Cell Syst ; 11(6): 625-639.e13, 2020 12 16.
Article in English | MEDLINE | ID: mdl-33278344

ABSTRACT

Determining genes that orchestrate cell differentiation in development and disease remains a fundamental goal of cell biology. This study establishes a genome-wide metric based on the gene-repressive trimethylation of histone H3 at lysine 27 (H3K27me3) across hundreds of diverse cell types to identify genetic regulators of cell differentiation. We introduce a computational method, TRIAGE, which uses discordance between gene-repressive tendency and expression to identify genetic drivers of cell identity. We apply TRIAGE to millions of genome-wide single-cell transcriptomes, diverse omics platforms, and eukaryotic cells and tissue types. Using a wide range of data, we validate the performance of TRIAGE in identifying cell-type-specific regulatory factors across diverse species including human, mouse, boar, bird, fish, and tunicate. Using CRISPR gene editing, we use TRIAGE to experimentally validate RNF220 as a regulator of Ciona cardiopharyngeal development and SIX3 as required for differentiation of endoderm in human pluripotent stem cells. A record of this paper's transparent peer review process is included in the Supplemental Information.
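
One simple way to express "discordance between gene-repressive tendency and expression" is to weight each gene's expression by a repressive-tendency score, so that a gene that is usually repressed but is highly expressed here ranks first. The sketch below assumes a log-transformed expression multiplied by a precomputed score in the range 0 to 1. This form, the function name, the gene names and the values are all illustrative assumptions and should not be read as the published TRIAGE implementation.

```python
import math

def discordance_scores(expression, repressive_tendency):
    """Weight log-expression by a per-gene repressive-tendency score.

    expression: dict gene -> non-negative expression value in one sample.
    repressive_tendency: dict gene -> score in [0, 1]; genes without a
    score are treated as having no repressive tendency (assumption).
    """
    return {
        gene: math.log(value + 1.0) * repressive_tendency.get(gene, 0.0)
        for gene, value in expression.items()
    }

# Invented toy values: geneA and geneB are equally expressed, but only
# geneA is frequently repressed across cell types.
expression = {"geneA": 100.0, "geneB": 100.0, "geneC": 5.0}
tendency = {"geneA": 0.9, "geneB": 0.01, "geneC": 0.9}

scores = discordance_scores(expression, tendency)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under these assumptions, abundant genes with little repressive tendency fall to the bottom of the ranking even when their expression is high.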


Subjects
Epigenomics/methods, Cell Differentiation, Humans
17.
Proteins ; 88(9): 1251-1259, 2020 09.
Article in English | MEDLINE | ID: mdl-32394426

ABSTRACT

Ancestral sequence reconstruction has had recent success in decoding the origins and the determinants of complex protein functions. However, phylogenetic analyses of remote homologues must handle extreme amino acid sequence diversity resulting from extended periods of evolutionary change. We exploited the wealth of protein structures to develop an evolutionary model based on protein secondary structure. The approach follows the differences between discrete secondary structure states observed in modern proteins and those hypothesized in their immediate ancestors. We implemented maximum likelihood-based phylogenetic inference to reconstruct ancestral secondary structure. The predictive accuracy from the use of the evolutionary model surpasses that of comparative modeling and sequence-based prediction; the reconstruction extracts information not available from modern structures or the ancestral sequences alone. Based on a phylogenetic analysis of a sequence-diverse protein family, we showed that the model can highlight relationships that are evolutionarily rooted in structure and not evident in amino acid-based analysis.


Subjects
Vesicular Transport Adaptor Proteins/chemistry, Bacterial Proteins/chemistry, Molecular Evolution, Statistical Models, Vesicular Transport Adaptor Proteins/history, Animals, Bacteria/chemistry, Bacteria/classification, Bacteria/metabolism, Bacterial Proteins/history, Computer Simulation, 21st Century History, Ancient History, Humans, Mammals/classification, Mammals/metabolism, Phylogeny, Plants/chemistry, Plants/classification, Plants/metabolism, Secondary Protein Structure
18.
Bioinformatics ; 36(12): 3902-3904, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32246829

ABSTRACT

MOTIVATION: Identifying the genes regulated by a given transcription factor (TF) (its 'target genes') is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. RESULTS: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. AVAILABILITY AND IMPLEMENTATION: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Software, Binding Sites, Chromatin Immunoprecipitation, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism
19.
Cerebellum ; 19(1): 89-101, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31838646

ABSTRACT

Transcriptional regulation plays a central role in controlling neural stem and progenitor cell proliferation and differentiation during neurogenesis. For instance, transcription factors from the nuclear factor I (NFI) family have been shown to co-ordinate neural stem and progenitor cell differentiation within multiple regions of the embryonic nervous system, including the neocortex, hippocampus, spinal cord and cerebellum. Knockout of individual Nfi genes culminates in similar phenotypes, suggestive of common target genes for these transcription factors. However, whether or not the NFI family regulates common suites of genes remains poorly defined. Here, we use granule neuron precursors (GNPs) of the postnatal murine cerebellum as a model system to analyse regulatory targets of three members of the NFI family: NFIA, NFIB and NFIX. By integrating transcriptomic profiling (RNA-seq) of Nfia- and Nfix-deficient GNPs with epigenomic profiling (ChIP-seq against NFIA, NFIB and NFIX, and DNase I hypersensitivity assays), we reveal that these transcription factors share a large set of potential transcriptional targets, suggestive of complementary roles for these NFI family members in promoting neural development.


Subjects
Cerebellum/growth & development, Cerebellum/metabolism, NFI Transcription Factors/metabolism, Animals, Newborn Animals, Cerebellum/cytology, Chromatin Immunoprecipitation Sequencing/methods, Female, Male, Mice, Inbred C57BL Mice, Knockout Mice, NFI Transcription Factors/genetics, Neurogenesis/physiology, Pregnancy
20.
BMC Bioinformatics ; 20(1): 727, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31861997

ABSTRACT

Following publication of the original article [1], the author reported that an incorrect figure has been published as Figure 2. The correct Figure 2 is shown below.
