Search | Biblioteca Virtual em Saúde (Virtual Health Library)

Analysis of the nucleotide content of Escherichia coli promoter sequences related to the alternative sigma factors.

Dall'Alba, Gabriel; Casa, Pedro Lenz; Notari, Daniel Luis; Adami, Andre Gustavo; Echeverrigaray, Sergio; de Avila E Silva, Scheila.

J Mol Recognit ; 32(5): e2770, 2019 05.

Article in English | MEDLINE | ID: mdl-30458580

ABSTRACT

Promoters are DNA sequences located upstream of the transcription start site of genes. In bacteria, the RNA polymerase enzyme requires additional subunits, called sigma (σ) factors, to begin transcription of specific genes under distinct environmental conditions. Promoter prediction still poses many challenges because of the characteristics of these sequences. In this paper, the nucleotide content of Escherichia coli promoter sequences related to five alternative σ factors was analyzed with a machine learning technique in order to provide profiles according to the σ factor that recognizes them. Clustering was applied for this purpose, since it is a viable method for finding hidden patterns in a data set. As a result, 20 groups of sequences were formed and, aided by the WebLogo tool, it was possible to determine sequence profiles. These patterns should be considered when implementing computational prediction tools. In addition, evidence was found of an overlap between the functions of the genes regulated by different σ factors, suggesting that DNA structural properties are also essential parameters for further studies.

Subjects

Escherichia coli/enzymology , Escherichia coli/genetics , Genetic Promoter Regions , Sigma Factor/genetics , Algorithms , Base Sequence , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Nucleotides/analysis , Sigma Factor/metabolism , Genetic Transcription
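
The abstract above describes clustering promoter sequences by nucleotide content. The sketch below illustrates that general idea only: it computes A/C/G/T fractions per sequence and groups them with a plain k-means. The choice of k-means, the function names, and the toy sequences are assumptions made for illustration; the record does not state which clustering algorithm or features the authors used.

```python
from collections import Counter


def nucleotide_content(seq):
    """Return the fractions of A, C, G and T in a DNA sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return [counts[base] / total for base in "ACGT"]


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy sequences, not data from the paper.
toy_sequences = ["AAAATTTTAA", "GGGCCCGGCC", "ATATATATAT", "GCGCGCGCGG"]
features = [nucleotide_content(s) for s in toy_sequences]
labels, centroids = kmeans(features, k=2)
```

With these toy inputs the AT-rich and GC-rich sequences fall into separate groups. A real analysis would need many more sequences, a principled choice of the number of clusters, and probably position-specific features rather than whole-sequence fractions.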

A Survey of Biological Data in a Big Data Perspective.

Dall'Alba, Gabriel; Casa, Pedro Lenz; Abreu, Fernanda Pessi de; Notari, Daniel Luis; de Avila E Silva, Scheila.

Big Data ; 10(4): 279-297, 2022 08.

Article in English | MEDLINE | ID: mdl-35394342

ABSTRACT

The amount of available data is continuously growing. This phenomenon has given rise to a new concept, named big data. The most prominent technologies related to big data are cloud computing (infrastructure) and Not Only SQL (NoSQL; data storage). In addition, for data analysis, machine learning algorithms such as decision trees, support vector machines, artificial neural networks, and clustering techniques present promising results. In a biological context, big data has many applications due to the large number of biological databases available. Some limitations of biological big data are related to the inherent features of these data, such as high degrees of complexity and heterogeneity, since biological systems provide information from the atomic level up to interactions between organisms or with their environment. Such characteristics make most bioinformatics-based applications difficult to build, configure, and maintain. Although the rise of big data is relatively recent, it has contributed to a better understanding of the underlying mechanisms of life. The main goal of this article is to provide a concise and reliable survey of the application of big data-related technologies in biology. As such, some fundamental concepts of information technology, including storage resources, analysis, and data sharing, are described along with their relation to biological data.

Subjects

Big Data , Data Mining , Cloud Computing , Data Mining/methods , Machine Learning , Computer Neural Networks

Toward Algorithms for Automation of Postgenomic Data Analyses: Bacillus subtilis Promoter Prediction with Artificial Neural Network.

Coelho, Rafael Vieira; Dall'Alba, Gabriel; de Avila E Silva, Scheila; Echeverrigaray, Sergio; Delamare, Ana Paula Longaray.

OMICS ; 24(5): 300-309, 2020 05.

Article in English | MEDLINE | ID: mdl-31573385

ABSTRACT

In the present postgenomic era, the capacity to generate big data has far exceeded the capacity to analyze, contextualize, and make sense of the data in clinical, biological, and ecological applications. There is a great unmet need for automation and algorithms to aid in analyses of big data, in biology in particular. In this context, it is noteworthy that computational methods used to analyze the regulation of bacterial gene expression have in the past focused mainly on Escherichia coli promoters due to the large amount of data available. The challenges and prospects of automating the prediction and recognition of bacterial sequences as promoters have not been properly addressed, owing to the promoter size and degenerate pattern. We report here an original neural network approach for recognition and prediction of Bacillus subtilis promoters. The artificial neural network used 767 B. subtilis promoter sequences as input, and the study also aimed to identify the architecture that provides the best prediction. Two multilayer perceptron neural network architectures offered the highest accuracy: one with five, and another with seven neurons in the hidden layer. These architectures achieved accuracies of 98.57% and 97.69%, respectively. The results collectively indicate the promise of the application of neural network approaches to the B. subtilis promoter recognition problem, while also suggesting the broader potential of algorithms for automation of data analyses in the postgenomic era.

Subjects

Automation/methods , Bacillus subtilis/genetics , Computational Biology/methods , Automated Pattern Recognition/methods , Genetic Promoter Regions/genetics , DNA Sequence Analysis/methods , Algorithms , Escherichia coli/genetics , Gene Expression/genetics , Bacterial Genes/genetics , Bacterial Genome/genetics , Computer Neural Networks
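
The abstract above mentions a multilayer perceptron with five neurons in the hidden layer. As a rough illustration of what such an architecture looks like, the sketch below one-hot encodes a DNA sequence and runs a forward pass through an untrained network with five hidden units. The one-hot encoding, the sigmoid activations, the random weights, and all names are assumptions; the record does not describe the authors' input encoding or training procedure.

```python
import math
import random

# One-hot code per nucleotide (an assumed input encoding).
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def encode(seq):
    """Flatten a DNA sequence into a one-hot vector of length 4 * len(seq)."""
    return [bit for base in seq.upper() for bit in ONE_HOT[base]]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_mlp(n_inputs, n_hidden, seed=0):
    """Create random, untrained weights; index 0 of each row is the bias."""
    rng = random.Random(seed)
    hidden = [
        [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
        for _ in range(n_hidden)
    ]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(mlp, x):
    """Return a score in (0, 1) from one hidden layer and one output unit."""
    hidden, output = mlp
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in hidden]
    return sigmoid(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))


# Hypothetical example sequence, not taken from the paper's data set.
x = encode("TTGACATATAAT")
mlp = init_mlp(n_inputs=len(x), n_hidden=5)
score = forward(mlp, x)
```

Because the weights are random, the score carries no biological meaning. Training on labeled promoter and non-promoter sequences would be required before the output could be read as a prediction.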
