Results 1 - 15 of 15
1.
J Chem Inf Model ; 64(7): 2705-2719, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38258978

ABSTRACT

Bacterial promoters play a crucial role in gene expression by serving as docking sites for the transcription initiation machinery. However, accurately identifying promoter regions in bacterial genomes remains a challenge due to their diverse architecture and variations. In this study, we propose MLDSPP (Machine Learning and Duplex Stability based Promoter prediction in Prokaryotes), a machine learning-based promoter prediction tool, to comprehensively screen bacterial promoter regions in 12 diverse genomes. We leveraged biologically relevant and informative DNA structural properties, such as DNA duplex stability and base stacking, and state-of-the-art machine learning (ML) strategies to gain insights into promoter characteristics. We evaluated several machine learning models, including Support Vector Machines, Random Forests, and XGBoost, and assessed their performance using accuracy, precision, recall, specificity, F1 score, and MCC metrics. Our findings reveal that XGBoost outperformed other models and current state-of-the-art promoter prediction tools, namely Sigma70pred and iPromoter2L, achieving F1-scores >95% in most systems. Significantly, the use of one-hot encoding for representing nucleotide sequences complements these structural features, enhancing our XGBoost model's predictive capabilities. To address the challenge of model interpretability, we incorporated explainable AI techniques using Shapley values. This enhancement allows for a better understanding and interpretation of the predictions of our model. In conclusion, our study presents MLDSPP as a novel, generic tool for predicting promoter regions in bacteria, utilizing original downstream sequences as nonpromoter controls. This tool has the potential to significantly advance the field of bacterial genomics and contribute to our understanding of gene regulation in diverse bacterial systems.


Subjects
Tool Use Behavior; Bacteria/genetics; DNA/genetics; Machine Learning; Promoter Regions, Genetic
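The abstract above mentions one-hot encoding of nucleotide sequences and evaluation with F1 score and MCC. The following is a minimal sketch of those two steps only, assuming binary labels (1 = promoter, 0 = non-promoter). The helper names `one_hot`, `confusion`, and `f1_and_mcc` are hypothetical and are not taken from MLDSPP.

```python
import math

# Hypothetical helpers for illustration; not part of MLDSPP.
ALPHABET = "ACGT"


def one_hot(seq):
    """Flatten a DNA sequence into four indicator values per base (A, C, G, T order)."""
    vec = []
    for base in seq.upper():
        # Symbols outside ACGT (for example N) become an all-zero group.
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


def confusion(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = promoter, 0 = non-promoter)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_and_mcc(y_true, y_pred):
    """Return (F1 score, Matthews correlation coefficient); 0.0 when undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc
```

The flattened vector could be concatenated with structural features before being passed to a classifier; how the published tool combines them is not specified in the abstract.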
2.
Front Cell Infect Microbiol ; 13: 1147544, 2023.
Article in English | MEDLINE | ID: mdl-37396305

ABSTRACT

Mycobacterium tuberculosis, the causative agent of tuberculosis, has evolved over time into a multidrug-resistant strain that poses a serious global pandemic health threat. The ability to survive and remain dormant within the host macrophage relies on multiple transcription factors (TFs) contributing to virulence. To date, very limited structural insights from crystallographic and NMR studies are available for TFs and TF-DNA binding events. Understanding the role of DNA structure in TF binding is critical to deciphering M. tuberculosis (MTB) pathogenicity and has yet to be resolved at the genome scale. In this work, we analyzed the compositional and conformational preferences of 21 mycobacterial TFs, evident at their DNA binding sites, at local and global scales. Results suggest that most TFs prefer binding to genomic regions characterized by unique DNA structural signatures, namely, high electrostatic potential, narrow minor grooves, high propeller twist, helical twist, intrinsic curvature, and DNA rigidity compared to the flanking sequences. Additionally, preferences for specific trinucleotide motifs, with clear periodic signals of tetranucleotide motifs, are observed in the vicinity of the TF-DNA interactions. Altogether, our study reports nuanced DNA shape and structural preferences of 21 TFs.


Subjects
DNA; Transcription Factors; Transcription Factors/metabolism; DNA/genetics; Binding Sites; Nucleotide Motifs; Protein Binding
3.
Biochimie ; 214(Pt A): 101-111, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37311475

ABSTRACT

The promoter regions of gene regulation are under evolutionary constraints and earlier studies uncovered that they are characterized by enrichment of functional non-B DNA structural signatures like curved DNA, cruciform DNA, G-quadruplex, triple-helical DNA, slipped DNA structures, and Z-DNA. However, these studies are restricted to a few model organisms, single non-B DNA motif types, or whole genomic sequences, and their comparative accumulation in promoter regions of different domains of life has not been reported comprehensively. In this study, for the first time, we investigated the preponderance of non-B DNA-prone motifs in promoter regions in 1180 genomes belonging to 28 taxonomic groups using the non-B DNA Motif Search Tool (nBMST). The trends suggest that they are predominant in promoters compared to the upstream and downstream regions of all three domains of life and variably linked to taxonomic groups. Cruciform DNA motif is the most abundant form of non-B DNA, spanning from archaea to lower eukaryotes. Curved DNA motifs are prominent in host-associated bacteria, and suppressed in mammals. Triplex-DNA and slipped DNA structure repeats are discretely dispersed in all lineages. G-quadruplex motifs are significantly enriched in mammals. We also observed that the unique enrichment of non-B DNA in promoters is strongly linked to genome GC, size, evolutionary time divergence, and ecological adaptations. Overall, our work systematically reports the unique non-B DNA structural landscape of cellular organisms from the perspective of the cis-regulatory code of genomes.


Subjects
DNA, Cruciform; G-Quadruplexes; Animals; Nucleotide Motifs; DNA/genetics; DNA/chemistry; Promoter Regions, Genetic/genetics; Mammals
4.
Article in English | MEDLINE | ID: mdl-35353704

ABSTRACT

Computational promoter identification in eukaryotes is a classical biological problem that should be revisited given the avalanche of experimental data now available and emerging deep learning technologies. Current knowledge indicates that eukaryotic core promoters display multifarious signals such as the TATA-box, Inr element, TCT, and Pause-button, and structural motifs such as G-quadruplexes. In the present study, we combined the power of deep learning with a plethora of promoter motifs to delineate promoters and non-promoters gleaned from the statistical properties of DNA sequence arrangement. To this end, we implemented a convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for five model systems, with [-100 to +50] segments relative to the transcription start site being the core promoter. Unlike previous state-of-the-art tools, which furnish a binary decision of promoter or non-promoter, we classify a 151-mer sequence as a promoter, along with its consensus signal type, or as a non-promoter. The combined CNN-LSTM model, which we call "DeePromClass", achieved testing accuracies of 90.6%, 93.6%, 91.8%, 86.5%, and 84.0% for S. cerevisiae, C. elegans, D. melanogaster, Mus musculus, and Homo sapiens, respectively. In total, our tool provides an insightful update on next-generation promoter prediction tools for promoter biologists.


Subjects
Drosophila melanogaster; Saccharomyces cerevisiae; Animals; Humans; Mice; Drosophila melanogaster/genetics; Saccharomyces cerevisiae/genetics; Caenorhabditis elegans/genetics; Promoter Regions, Genetic/genetics; Neural Networks, Computer
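The [-100 to +50] core promoter segment described in the abstract above spans 151 positions when the transcription start site (TSS) itself is counted. A minimal sketch of extracting such a window follows, assuming a 0-based TSS index into a plain sequence string and reverse-complementing for minus-strand genes. The function names and the coordinate convention are assumptions made for illustration, not details taken from DeePromClass.

```python
# Hypothetical helpers; the coordinate convention (0-based TSS index) is an assumption.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def core_promoter_window(genome, tss, strand="+", upstream=100, downstream=50):
    """Return the (upstream + 1 + downstream)-mer around the TSS, read 5'->3' on the gene strand.

    Returns None when the window would run past either end of the sequence.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream + 1
    elif strand == "-":
        # On the minus strand, "upstream" lies at higher genomic coordinates.
        start, end = tss - downstream, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 0 or end > len(genome):
        return None
    window = genome[start:end]
    return window if strand == "+" else reverse_complement(window)
```

With the default arguments, the TSS base always lands at index 100 of the returned 151-mer, regardless of strand.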
5.
Int J Biol Macromol ; 220: 920-933, 2022 Nov 01.
Article in English | MEDLINE | ID: mdl-35987365

ABSTRACT

Non-healing wounds have long been the subject of scientific and clinical investigations. Despite breakthroughs in understanding the biology of delayed wound healing, only limited advances have been made in properly treating wounds. Recently, research into nucleic acids (NAs) such as small-interfering RNA (siRNA), microRNA (miRNA), plasmid DNA (pDNA), aptamers, and antisense oligonucleotides (ASOs) has resulted in the development of a new therapeutic strategy for wound healing. In this regard, dendrimers, scaffolds, lipid nanoparticles, polymeric nanoparticles, hydrogels, and metal nanoparticles have all been explored as NA delivery techniques. However, the clinical translation of NAs remains a substantial challenge. As a result, different NAs must be identified, and their delivery methods must be optimized. This review explores the role of NA-based therapeutics in various stages of wound healing and provides an update on the most recent findings in the development of NA-based nanomedicine and biomaterials, which may offer the potential for the invention of novel therapies for this long-term condition. Further, the challenges and potential for miRNA-based techniques to be translated into clinical applications are also highlighted.


Subjects
Dendrimers; MicroRNAs; Nucleic Acids; Biocompatible Materials; DNA; Dendrimers/therapeutic use; Hydrogels; Liposomes; MicroRNAs/genetics; MicroRNAs/therapeutic use; Nanoparticles; Nucleic Acids/therapeutic use; Oligonucleotides, Antisense/genetics; Oligonucleotides, Antisense/therapeutic use; RNA, Small Interfering/genetics; RNA, Small Interfering/therapeutic use; Wound Healing
6.
ACS Omega ; 7(7): 5657-5669, 2022 Feb 22.
Article in English | MEDLINE | ID: mdl-35224327

ABSTRACT

Eukaryotic transcription is orchestrated from a DNA region known as the core promoter. Multifarious and punctilious core promoter signals, viz., TATA-box, Inr, BREs, and Pause Button, are associated with a subset of genes and regulate their spatiotemporal expression. However, the core promoter architecture linked with these signals has not been investigated exhaustively for several species. In this study, we attempted to envisage the adaptive binding landscape of the transcription initiation machinery as a function of DNA structure. To this end, we deployed a set of k-mer based DNA structural estimates and regular expression models derived from experiments, molecular dynamics simulations, and theoretical frameworks, and high-throughput promoter data sets retrieved from the eukaryotic promoter database. We categorized protein-coding gene core promoters based on characteristic motifs at precise locations and analyzed the B-DNA structural properties and non-B-DNA structural motifs for 15 different eukaryotic genomes. We observed that Inr, BREd, and no-motif classes display common patterns of DNA sequence and structural environment. TATA-containing, BREu, and Pause Button classes show deviant behavior, with the TATA class displaying varied axial and twisting flexibility while BREu and Pause Button leaned toward G-quadruplex motif enrichment. Intriguingly, DNA meltability and shape signals are conserved irrespective of the presence or absence of distinct core promoter motifs in the majority of species. Altogether, here we delineated the conserved DNA structural signals associated with several promoter classes that may contribute to the chromatin configuration, orchestration of transcription machinery, and DNA duplex melting during the transcription process.
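The abstract above refers to regular expression models for non-B-DNA motifs without giving the patterns. As a hedged illustration, the sketch below uses the widely cited putative G-quadruplex pattern (four runs of at least three guanines separated by loops of 1 to 7 bases) and its C-rich mirror for the opposite strand. The pattern choice and the function name `find_g4_motifs` are assumptions; the study's own expressions may differ.

```python
import re

# Assumed patterns: the commonly used G3+ N1-7 consensus, not necessarily the study's own.
G4_PLUS = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")
G4_MINUS = re.compile(r"(?:C{3,}[ACGT]{1,7}){3}C{3,}")


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping putative G4 motifs.

    Coordinates are 0-based and end-exclusive. A "-" strand hit is a C-rich match
    on the given sequence, implying a G-rich motif on the complementary strand.
    """
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in G4_MINUS.finditer(seq)]
    return sorted(hits)
```

Because `finditer` reports non-overlapping matches, densely G-rich regions are counted once per scan; an overlapping search would need a lookahead-based pattern instead.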

7.
Gene ; 803: 145892, 2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34375633

ABSTRACT

The p53 tumor suppressor protein maintains genome fidelity and integrity by modulating several cellular activities. It regulates these events by interacting with a heterogeneous set of response elements (REs) of regulatory genes in the background of chromatin configuration. At the p53-RE interface, both the base readout and torsional flexibility of DNA account for high-affinity binding. However, DNA structure is an entanglement of a multitude of physicochemical features, and both local and global structure should be considered when dealing with DNA-protein interactions. The goal of the current work is to conceptualize and abstract basic principles of p53-RE binding affinity as a function of structural alterations in DNA such as bending, twisting, and stretching flexibility and shape. For this purpose, we have exploited high-throughput in vitro relative affinity information of response elements and genome binding events of p53 from HT-SELEX and ChIP-Seq experiments, respectively. Our results confirm the role of torsional flexibility in p53 binding, and further, we reveal that DNA axial bending, stretching stiffness, propeller twist, and wedge angles are intimately linked to p53 binding affinity when compared to homeodomain, bZIP, and bHLH proteins. Besides, a similar DNA structural environment is observed in the distal sequences encompassing the actual binding sites of p53 cistrome genes. Additionally, we revealed that p53 cistrome target genes have unique promoter architecture, and the DNA flexibility of genomic sequences around REs in cancer and normal cell types displays major differences. Altogether, our work provides a keynote on DNA structural features of REs that shape the in vitro and in vivo high-affinity binding of the p53 transcription factor.


Subjects
DNA/metabolism; Sequence Analysis, DNA/methods; Tumor Suppressor Protein p53/metabolism; Binding Sites; Chromosomes, Human/genetics; DNA/chemistry; Gene Expression Regulation; Humans; Promoter Regions, Genetic; Response Elements
8.
FEBS Lett ; 595(19): 2504-2521, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34387867

ABSTRACT

Nucleoid-associated proteins (NAPs) maintain bacterial nucleoid configuration through their architectural properties of DNA bending, wrapping, and bridging. However, the contribution of DNA structural alterations to DNA-NAP recognition at the genomic scale remains unresolved. The present work dissects the DNA sequence, shape, and altered structural preferences at a genomic scale for six NAPs in Mycobacterium tuberculosis. Results suggest that narrower minor groove width (MGW) and higher DNA rigidity mark the binding sites of EspR and Lsr2, while mIHF, MtHU, and NapM have heterogeneous DNA structural predilections. In contrast, WhiB4-DNA-binding sites were characterized by wider MGW and by highly deformable, less curved DNA. This work provides systematic insight into NAP-mediated genome organization as a function of DNA structural features.


Subjects
Bacterial Proteins/metabolism; DNA, Bacterial/chemistry; DNA, Bacterial/genetics; Genomics; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/metabolism; Binding Sites; DNA, Bacterial/metabolism; Gene Expression Regulation, Bacterial
9.
Biochimie ; 184: 40-51, 2021 May.
Article in English | MEDLINE | ID: mdl-33548392

ABSTRACT

The role of G-quadruplexes in the cellular physiology of human pathogenesis is an intriguing area of research. Nonetheless, their functional roles and evolutionary conservation have not been compared comprehensively in pathogenic forms of various bacterial genera and species. In the current in silico study, we addressed the role of G-quadruplex-forming sequences (G4 motifs) in the context of cis-regulation, expression variation, regulatory networks, gene orthology and ontology. Genome-wide screening across seven pathogenic genomes using the G4Hunter tool revealed a significant prevalence of G4 motifs in cis-regulatory regions compared to the intragenic regions. Significant conservation of G4 motifs was observed in the regulatory regions of 300 orthologous genes. Further analysis of published ChIP-Seq data (Minch et al., 2015) of 91 DNA-binding proteins of the M. tuberculosis genome revealed significant links between G4 motifs and target sites of transcriptional regulators. Interestingly, the transcription factors entangled with virulence, specifically CsoR, Rv0081, DevR/DosR, and the TetR family, are found to have G4 motifs in their target regulatory regions. Overall, the current study applies positional-functional relationship computation to delve into the cis-regulation of G-quadruplex structures in the context of gene orthology in pathogenic bacteria.


Subjects
Bacteria/genetics; Computer Simulation; G-Quadruplexes; Genome, Bacterial; Regulatory Sequences, Nucleic Acid; Bacteria/pathogenicity
10.
ACS Omega ; 5(23): 13601-13611, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32566825

ABSTRACT

DNA replication in eukaryotes is an intricate process, which is precisely synchronized by a set of regulatory proteins, and the replication fork emanates from discrete sites on chromatin called origins of replication (Oris). These spots are considered as the gateway to chromosomal replication and are stereotyped by sequence motifs. The cognate sequences are noticeable in a small group of entire origin regions or totally absent across different metazoans. Alternatively, the use of DNA secondary structural features can provide additional information compared to the primary sequence. In this article, we report the trends in DNA sequence-based structural properties of origin sequences in nine eukaryotic systems representing different families of life. Biologically relevant DNA secondary structural properties, namely, stability, propeller twist, flexibility, and minor groove shape were studied in the sequences flanking replication start sites. Results indicate that Oris in yeasts show lower stability, more rigidity, and narrow minor groove preferences compared to genomic sequences surrounding them. Yeast Oris also show preference for A-tracts and the promoter element TATA box in the vicinity of replication start sites. On the contrary, Drosophila melanogaster, humans, and Arabidopsis thaliana do not have such features in their Oris, and instead, they show high preponderance of G-rich sequence motifs such as putative G-quadruplexes or i-motifs and CpG islands. Our extensive study applies the DNA structural feature computation to delve into origins of replication across organisms ranging from yeasts to mammals and including a plant. Insights from this study would be significant in understanding origin architecture and help in designing new algorithms for predicting DNA trans-acting factor recognition events.

11.
Nucleic Acids Res ; 46(22): 11883-11897, 2018 Dec 14.
Article in English | MEDLINE | ID: mdl-30395339

ABSTRACT

Spatial and temporal expression of genes is essential for maintaining phenotype integrity. Transcription factors (TFs) modulate expression patterns by binding to specific DNA sequences in the genome. Along with the core binding motif, the flanking sequence context can play a role in DNA-TF recognition. Here, we employ high-throughput in vitro and in silico analyses to understand the influence of sequences flanking the cognate sites in binding of three most prevalent eukaryotic TF families (zinc finger, homeodomain and bZIP). In vitro binding preferences of each TF toward the entire DNA sequence space were correlated with a wide range of DNA structural parameters, including DNA flexibility. Results demonstrate that conformational plasticity of flanking regions modulates binding affinity of certain TF families. DNA duplex stability and minor groove width also play an important role in DNA-TF recognition but differ in how exactly they influence the binding in each specific case. Our analyses further reveal that the structural features of preferred flanking sequences are not universal, as similar DNA-binding folds can employ distinct DNA recognition modes.


Subjects
Basic-Leucine Zipper Transcription Factors/chemistry; DNA/chemistry; Homeodomain Proteins/chemistry; Transcription, Genetic; Zinc Fingers/genetics; Animals; Base Sequence; Basic-Leucine Zipper Transcription Factors/genetics; Basic-Leucine Zipper Transcription Factors/metabolism; Binding Sites; Cell-Free System/chemistry; Cell-Free System/metabolism; DNA/genetics; DNA/metabolism; Homeodomain Proteins/genetics; Homeodomain Proteins/metabolism; Humans; Nucleic Acid Conformation; Nucleotide Motifs; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs
12.
Sci Rep ; 8(1): 4520, 2018 Mar 14.
Article in English | MEDLINE | ID: mdl-29540741

ABSTRACT

Transcription is an intricate mechanism and is orchestrated at the promoter region. The cognate motifs in the promoters are observed in only a subset of total genes across different domains of life. Hence, sequence-motif based promoter prediction may not be a holistic approach for whole genomes. Conversely, the DNA structural property of duplex stability is a characteristic of promoters and can be used to delineate them from other genomic sequences. In this study, we have used a DNA duplex stability based algorithm, 'PromPredict', for promoter prediction in a broad range of eukaryotes, representing various species of yeast, worm, fly, fish, and mammal. The efficiency of the software has been tested in promoter regions of 48 eukaryotic systems. PromPredict achieves recall values that range from 68 to 92% in various eukaryotes. PromPredict performs well in mammals, although their core promoter regions are GC rich. PromPredict has also been tested for its ability to predict promoter regions for various transcript classes (coding and non-coding), TATA-containing and TATA-less promoters, as well as on promoter sequences belonging to different gene expression variability categories. The results support the idea that differential DNA duplex stability is a potential predictor of promoter regions in various genomes.


Subjects
Computational Biology/methods; Eukaryota/genetics; Genome; Genomics/methods; Promoter Regions, Genetic; Animals; Eukaryotic Cells; Humans; Reproducibility of Results; Transcription Initiation Site
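Duplex stability, as used in the abstract above, is commonly estimated by averaging nearest-neighbor (dinucleotide) free energies over a sliding window, with less negative averages indicating a less stable duplex. The sketch below illustrates that general idea only. The energy table follows the commonly quoted unified nearest-neighbor values in kcal/mol and should be treated as illustrative and checked against a primary source. The 15-base window and the function name `stability_profile` are assumptions and do not describe PromPredict's actual parameters.

```python
# Illustrative nearest-neighbor free energies (kcal/mol); verify against a primary table before use.
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def stability_profile(seq, window=15):
    """Average dinucleotide free energy for every window of `window` bases.

    Returns one value per window start; an empty list if the sequence is
    shorter than the window. Raises KeyError on symbols outside ACGT.
    """
    seq = seq.upper()
    steps = [NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1)]
    per_window = window - 1  # a window of n bases contains n - 1 dinucleotide steps
    return [
        sum(steps[i:i + per_window]) / per_window
        for i in range(len(steps) - per_window + 1)
    ]
```

A promoter caller built on this idea would compare the profile upstream and downstream of a candidate position against a threshold; those thresholds are tool-specific and are not given in the abstract.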
13.
FEBS Open Bio ; 7(3): 324-334, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28286728

ABSTRACT

Eukaryotic genes can be broadly classified as TATA-containing and TATA-less based on the presence of TATA box in their promoters. Experiments on both classes of genes have revealed a disparity in the regulation of gene expression and cellular functions between the two classes. In this study, we report characteristic differences in promoter sequences and associated structural properties of the two categories of genes in six different eukaryotes. We have analyzed three structural features, DNA duplex stability, bendability, and curvature along with the distribution of A-tracts, G-quadruplex motifs, and CpG islands. The structural feature analyses reveal that while the two classes of gene promoters are distinctly different from each other, the properties are also distinguishable across the six organisms.

14.
Curr Opin Struct Biol ; 25: 77-85, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24503515

ABSTRACT

Regulatory information for transcription initiation is present in a stretch of genomic DNA, called the promoter region that is located upstream of the transcription start site (TSS) of the gene. The promoter region interacts with different transcription factors and RNA polymerase to initiate transcription and contains short stretches of transcription factor binding sites (TFBSs), as well as structurally unique elements. Recent experimental and computational analyses of promoter sequences show that they often have non-B-DNA structural motifs, as well as some conserved structural properties, such as stability, bendability, nucleosome positioning preference and curvature, across a class of organisms. Here, we briefly describe these structural features, the differences observed in various organisms and their possible role in regulation of gene expression.


Subjects
Computational Biology/methods; DNA/genetics; Gene Expression Regulation/genetics; Promoter Regions, Genetic/genetics; Transcription Initiation Site; Animals; Base Sequence; Humans; Nucleotide Motifs
15.
J Bioinform Comput Biol ; 11(6): 1343001, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24372030

ABSTRACT

Gene expression is the most fundamental biological process, which is essential for phenotypic variation. It is regulated by various external (environment and evolution) and internal (genetic) factors. The level of gene expression depends on promoter architecture, along with other external factors. The presence of sequence motifs, such as transcription factor binding sites (TFBSs) and the TATA-box, or DNA methylation in vertebrates, has been implicated in the regulation of expression of some genes in eukaryotes, but a large number of genes lack these sequences. On the other hand, several experimental and computational studies have shown that promoter sequences possess some special structural properties, such as low stability, less bendability, low nucleosome occupancy, and more curvature, which are prevalent across all organisms. These structural features may play a role in transcription initiation and regulation of gene expression. We have studied the relationship between the structural features of promoter DNA, promoter directionality, and gene expression variability in S. cerevisiae. This relationship has been analyzed for seven different measures of gene expression variability, along with two different regulatory effect measures. We find that a few of the variability measures of gene expression are linked to DNA structural properties, nucleosome occupancy, TATA-box presence, and bidirectionality of promoter regions. Interestingly, gene responsiveness is most intimately correlated with DNA structural features and promoter architecture.


Subjects
DNA, Fungal/chemistry; Gene Expression Regulation, Fungal; Promoter Regions, Genetic; Saccharomyces cerevisiae/genetics; Nucleosomes/metabolism; TATA Box; Transcription, Genetic