Search | VHL Regional Portal

Temporal progress of gene expression analysis with RNA-Seq data: A review on the relationship between computational methods.

Costa-Silva, Juliana; Domingues, Douglas S; Menotti, David; Hungria, Mariangela; Lopes, Fabrício Martins.

Comput Struct Biotechnol J ; 21: 86-98, 2023.

Article in English | MEDLINE | ID: mdl-36514333

ABSTRACT

Analysis of differential gene expression from RNA-seq data has become a standard in several research areas. The computational analysis involves many data types and file formats, and a wide variety of computational tools that can be applied alone or combined into pipelines. This paper reviews the differential expression analysis pipeline, addressing its steps and their respective objectives, the principal methods available in each step, and their properties, thereby providing an organized overview of the field. The review mainly addresses the computational methods involved in differentially expressed gene (DEG) analysis from RNA sequencing (RNA-seq) data. In addition, a timeline of the computational methods for DEG analysis is shown and discussed, and the relationships between the most important computational tools are presented as an interaction network. The challenges and gaps in DEG analysis are also discussed. This paper will serve as a tutorial for new entrants into the field and help established users update their analysis pipelines.

Analysis of co-authorship networks among Brazilian graduate programs in computer science.

Nunes da Silva, Alex; Breve, Matheus Montanini; Mena-Chalco, Jesús Pascual; Lopes, Fabrício Martins.

PLoS One ; 17(1): e0261200, 2022.

Article in English | MEDLINE | ID: mdl-35041687

ABSTRACT

The growth and popularization of platforms for scientific production have been the subject of several studies, producing relevant analyses of co-authorship behavior among groups of researchers. Researchers and their scientific productions can be analysed as co-authorship social networks, in which researchers are linked through common publications. In this context, co-authorship networks can be analysed to find patterns that describe or characterize them. This work presents the analysis and characterization of co-authorship networks of Brazilian graduate programs in computer science. Data from Brazilian researchers were collected and modeled as co-authorship networks among the graduate programs in which the researchers take part. Each network topology was analysed with complex network measurements and three proposed qualitative indices that evaluate publication quality. In addition, the co-authorship networks of the computer science graduate programs were characterized in relation to the assessment received from CAPES, which attributes a qualitative grade to the graduate programs in Brazil. The results show the most relevant topological measurements for characterizing the programs and the evaluations they received at different qualitative grades, relating the main topological patterns of the co-authorship networks to the CAPES grades of the Brazilian graduate programs in computer science.
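As a rough illustration of the modeling step described in this abstract, the sketch below builds a weighted co-authorship network among programs and computes two basic topological measurements (node degree and network density). It is not the authors' code: the input layout (one set of program names per publication), the function names, and the choice of measurements are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations


def coauthorship_network(publications):
    """Build weighted edges between programs that share a publication.

    `publications` is a hypothetical input: an iterable in which each item
    is the collection of program names linked to one paper.
    """
    edges = Counter()
    for programs in publications:
        # Every pair of distinct programs on the same paper gets one more
        # unit of edge weight.
        for a, b in combinations(sorted(set(programs)), 2):
            edges[(a, b)] += 1
    return edges


def degree_and_density(edges):
    """Return (degree per node, density) for an undirected network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    # Density: fraction of possible node pairs that are actually connected.
    density = 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0
    return dict(degree), density


if __name__ == "__main__":
    pubs = [{"P1", "P2"}, {"P1", "P2", "P3"}, {"P3"}]
    net = coauthorship_network(pubs)
    print(net)
    print(degree_and_density(net))
```

Programs that never co-publish with another program do not appear as nodes in this sketch; a fuller analysis would probably track them separately.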

Subjects

Bibliometrics , Brazil

Computational Analysis of Transposable Elements and CircRNAs in Plants.

Oliveira, Liliane Santana; Patera, Andressa Caroline; Domingues, Douglas Silva; Sanches, Danilo Sipoli; Lopes, Fabricio Martins; Bugatti, Pedro Henrique; Saito, Priscila Tiemi Maeda; Maracaja-Coutinho, Vinicius; Durham, Alan Mitchell; Paschoal, Alexandre Rossi.

Methods Mol Biol ; 2362: 147-172, 2021.

Article in English | MEDLINE | ID: mdl-34195962

ABSTRACT

This chapter provides two main contributions: (1) a description of computational tools and databases used to identify and analyze transposable elements (TEs) and circRNAs in plants; and (2) data analysis on public TE and circRNA data. Our goal is to highlight the primary information available in the literature on circular noncoding RNAs and transposable elements in plants. The exploratory analysis performed on publicly available circRNA and TE data helps to discuss four sequence features. Finally, we investigate the association between circRNAs and TEs in plants, using the model organism Arabidopsis thaliana.

Subjects

Arabidopsis , DNA Transposable Elements , Arabidopsis/genetics , Computational Biology , DNA Transposable Elements/genetics , Plants/genetics , Circular RNA

Brazilian-adapted soybean Bradyrhizobium strains uncover IS elements with potential impact on biological nitrogen fixation.

Barros-Carvalho, Gesiele Almeida; Hungria, Mariangela; Lopes, Fabrício Martins; Van Sluys, Marie-Anne.

FEMS Microbiol Lett ; 366(11), 2019 06 01.

Article in English | MEDLINE | ID: mdl-30860585

ABSTRACT

Bradyrhizobium diazoefficiens CPAC 7 and Bradyrhizobium japonicum CPAC 15 are broadly used in commercial inoculants in Brazil, contributing most of the nitrogen required by the soybean crop. These strains differ in their symbiotic properties: CPAC 7 is more efficient in fixing nitrogen, whereas CPAC 15 is more competitive. Comparative genomics revealed many transposases close to genes associated with symbiosis in the symbiotic island of these strains. Given the importance of insertion sequence (IS) elements for bacterial genomes, we focused on identifying the local impact of these elements in the genomes of these and other related Bradyrhizobium strains to further understand their phenotypic differences. Analyses were performed using bioinformatics approaches. We found IS elements that disrupt genes involved in symbiosis and IS elements inserted in the regulatory regions of such genes. Further comparative analyses with 21 Bradyrhizobium genomes revealed insertional polymorphism with distinguishing patterns between the B. diazoefficiens and B. japonicum lineages. Finally, 13 of these potentially impacted genes are differentially expressed under symbiotic conditions in B. diazoefficiens USDA 110. Thus, IS elements are associated with the diversity of Bradyrhizobium, possibly by providing mechanisms for natural variation of symbiotic effectiveness.

Subjects

Bradyrhizobium/genetics , Bradyrhizobium/metabolism , DNA Transposable Elements/genetics , Glycine max/microbiology , Computational Biology , Genomic Islands/genetics , Nitrogen Fixation/genetics , Nitrogen Fixation/physiology

BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification.

Ito, Eric Augusto; Katahira, Isaque; Vicente, Fábio Fernandes da Rocha; Pereira, Luiz Filipe Protasio; Lopes, Fabrício Martins.

Nucleic Acids Res ; 46(16): e96, 2018 09 19.

Article in English | MEDLINE | ID: mdl-29873784

ABSTRACT

With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, was rapidly produced at relatively low cost. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on feature extraction from complex network measurements. The method first transforms the sequences and represents them as complex networks. It then extracts topological measures and constructs a feature vector that is used to classify the sequences. The method was evaluated in the classification of coding and non-coding RNAs of 13 species and compared to the CNCI, PLEK and CPC2 methods. BASiNET outperformed all compared methods in all adopted organisms and datasets. BASiNET classified sequences in all organisms with high accuracy and low standard deviation, showing that the method is robust and not biased by the organism. The proposed methodology is implemented as open source in the R language and is freely available for download at https://cran.r-project.org/package=BASiNET.
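The two steps named in the abstract (sequence to network, then topological measures to feature vector) can be sketched roughly as follows. This is an illustrative approximation in Python, not the BASiNET R package or its API: the word size, the step, the edge-weight threshold, and the particular measures in the feature vector are all assumptions chosen for the example.

```python
from collections import Counter


def sequence_to_network(seq, word=3, step=1):
    """Map a sequence to a weighted undirected network of adjacent words.

    Nodes are words of length `word`; an edge links two different words that
    occur next to each other in the scan, weighted by how often that happens.
    """
    words = [seq[i:i + word] for i in range(0, len(seq) - word + 1, step)]
    edges = Counter()
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops in this sketch
            edges[frozenset((a, b))] += 1
    return edges


def topological_features(edges, threshold=0):
    """Feature vector from edges whose weight exceeds `threshold`.

    Returns [nodes, edges, mean degree, max degree, min degree].
    """
    kept = [e for e, w in edges.items() if w > threshold]
    degree = Counter()
    for e in kept:
        for node in e:
            degree[node] += 1
    n = len(degree)
    if n == 0:
        return [0, 0, 0.0, 0, 0]
    degs = list(degree.values())
    return [n, len(kept), sum(degs) / n, max(degs), min(degs)]


if __name__ == "__main__":
    net = sequence_to_network("ACGTACGT")
    print(topological_features(net))
    print(topological_features(net, threshold=1))
```

Feature vectors computed at several thresholds could be concatenated and passed to any standard classifier; which classifier and which measures work best is outside what this sketch can claim.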

Subjects

Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Long Noncoding RNA/genetics , Messenger RNA/genetics , RNA Sequence Analysis/methods , Algorithms , Internet , Reproducibility of Results , Software

RNA-Seq differential expression analysis: An extended review and a software tool.

Costa-Silva, Juliana; Domingues, Douglas; Lopes, Fabricio Martins.

PLoS One ; 12(12): e0190152, 2017.

Article in English | MEDLINE | ID: mdl-29267363

ABSTRACT

The correct identification of differentially expressed genes (DEGs) between specific conditions is key to understanding phenotypic variation. High-throughput transcriptome sequencing (RNA-Seq) has become the main option for these studies. Thus, the number of methods and software tools for differential expression analysis from RNA-Seq data has also increased rapidly. However, there is no consensus about the most appropriate pipeline or protocol for identifying differentially expressed genes from RNA-Seq data. This work presents an extended review on the topic that includes the evaluation of six methods of mapping reads, including pseudo-alignment and quasi-mapping, and nine methods of differential expression analysis from RNA-Seq data. The adopted methods were evaluated on real RNA-Seq data, using qRT-PCR data as reference (gold standard). As part of the results, we developed software that performs all the analyses presented in this work, which is freely available at https://github.com/costasilvati/consexpression. The results indicated that mapping methods have minimal impact on the final DEG analysis, given that the adopted data have an annotated reference genome. For the adopted experimental model, the DEG identification methods with the most consistent results were limma+voom, NOISeq and DESeq2. Additionally, the consensus among five DEG identification methods yields a list of DEGs with high accuracy, indicating that combining different methods can produce more suitable results. The consensus option is also included in the available software.
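The consensus idea in this abstract can be illustrated with a simple vote count: a gene is kept when at least a chosen number of methods call it differentially expressed. The sketch below is a hedged Python illustration, not the consexpression software; the function name, the `min_methods` parameter, and the gene and method labels in the usage example are hypothetical.

```python
from collections import Counter


def consensus_degs(calls_by_method, min_methods):
    """Return genes called as DEGs by at least `min_methods` methods.

    `calls_by_method` maps a method label to the genes that method reported
    as differentially expressed (hypothetical input layout).
    """
    votes = Counter()
    for genes in calls_by_method.values():
        # Count each gene at most once per method.
        votes.update(set(genes))
    return sorted(g for g, v in votes.items() if v >= min_methods)


if __name__ == "__main__":
    calls = {
        "method_a": ["gene1", "gene2"],
        "method_b": ["gene2", "gene3"],
        "method_c": ["gene2", "gene1"],
    }
    print(consensus_degs(calls, min_methods=2))
    print(consensus_degs(calls, min_methods=3))
```

Raising `min_methods` trades sensitivity for agreement across methods; the threshold suited to a given dataset is an empirical question.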

Subjects

RNA Sequence Analysis/methods , Software , Gene Expression , Humans

An Efficient Approach to Explore and Discriminate Anomalous Regions in Bacterial Genomes Based on Maximum Entropy.

Barros-Carvalho, Gesiele Almeida; Van Sluys, Marie-Anne; Lopes, Fabricio Martins.

J Comput Biol ; 24(11): 1125-1133, 2017 Nov.

Article in English | MEDLINE | ID: mdl-28570142

ABSTRACT

Recently, there has been an increase in the number of whole bacterial genomes sequenced, mainly due to advances in next-generation sequencing technologies. In view of this, there is a need for new analytical alternatives that can keep pace with this advance. Given our current knowledge about the genomic plasticity of bacteria, and given that plastic genomic regions can uncover important features of these microorganisms, our goal was to develop a fast methodology based on maximum entropy (ME) to guide the researcher to regions that could be prioritized during the analysis. This methodology was compared with other available methods. In addition, ME was applied to eight different bacterial genera. The methodology consists of two main steps: processing the nucleotide sequence and ME calculation. We applied ME to Xanthomonas axonopodis pv. citri 306 (XAC) and Xanthomonas campestris pv. campestris ATCC 33913 (XCC), both of which have their anomalous regions well documented. We then compared our results against those from Alien Hunter, HGT-DB, Islander, IslandPath, and SIGI-HMM. ME was shown to be superior in terms of efficiency and analysis duration. Moreover, ME only needs the genome sequence in FASTA format as input. The proposed strategy based on ME is able to help in bacterial genome exploration. It is a simple and fast strategy for individual genomes in comparison with other available methods, and it does not rely on previous annotation or alignments. This methodology can also be a new option in the early stages of analysis of newly sequenced bacterial genomes.
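The abstract does not spell out the ME calculation, so the sketch below substitutes a simpler, clearly different technique to show the general shape of an entropy scan: the Shannon entropy of nucleotide composition in sliding windows along a sequence. The window size, step, and function names are assumptions for illustration; this should not be read as the authors' ME method.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (in bits) of the symbol composition of `window`."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def window_entropies(seq, size, step):
    """Return (start, entropy) for each window of `size` taken every `step`."""
    return [
        (start, shannon_entropy(seq[start:start + size]))
        for start in range(0, len(seq) - size + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a low-complexity stretch followed by a balanced one.
    for start, h in window_entropies("AAAAACGT", size=4, step=4):
        print(start, round(h, 3))
```

Windows whose entropy departs strongly from the genome-wide distribution would be candidates for closer inspection; how to set that cutoff is left open here.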

Subjects

Bacterial DNA/genetics , Entropy , Bacterial Genome , Genomics/methods , Xanthomonas/genetics , Xanthomonas/classification

Inference of gene regulatory networks from time series by Tsallis entropy.

Lopes, Fabrício Martins; de Oliveira, Evaldo A; Cesar, Roberto M.

BMC Syst Biol ; 5: 61, 2011 May 05.

Article in English | MEDLINE | ID: mdl-21545720

ABSTRACT

BACKGROUND: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems in Systems Biology today. Many techniques and models have been proposed for this task. However, it is generally not possible to recover the original topology with great accuracy, mainly because the time series data are short relative to the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRN inference methods based on entropy (mutual information), a new criterion function is proposed here. RESULTS: In this paper we introduce the use of the generalized entropy proposed by Tsallis for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach, and the conditional entropy is applied as the criterion function. To assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks, and its gene transfer function is obtained by random drawing from the set of possible Boolean functions, thus creating its dynamics. The DREAM time series data, on the other hand, vary in network size, and their topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. CONCLUSIONS: A remarkable improvement in accuracy was observed in the experimental results: the non-Shannon entropy reduced the number of false connections in the inferred topology. The best value of the free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRN inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/.
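For readers unfamiliar with the generalized entropy mentioned above, the standard Tsallis form for a discrete distribution is S_q = (1 - Σ p_i^q) / (q - 1), which recovers the Shannon entropy (natural logarithm) in the limit q → 1. The sketch below computes it in Python; it is a minimal illustration of the formula only, not the DimReduction implementation or its conditional-entropy criterion function, and the function name is an assumption.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q of a discrete probability distribution.

    Uses S_q = (1 - sum(p_i ** q)) / (q - 1); for q == 1 it falls back to
    the Shannon entropy with natural logarithm, which is the q -> 1 limit.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


if __name__ == "__main__":
    uniform = [0.5, 0.5]
    # q values spanning the range reported in the abstract, plus q = 1.
    for q in (1.0, 2.5, 3.0, 3.5):
        print(q, tsallis_entropy(uniform, q))
```

For q > 1 the entropy of independent subsystems adds up to less than the sum of the parts, which is why the abstract describes the 2.5 ≤ q ≤ 3.5 range as subextensive.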

Subjects

Computational Biology/methods , Entropy , Gene Regulatory Networks , Genetic Models , Time Factors

Feature selection environment for genomic applications.

Lopes, Fabrício Martins; Martins, David Corrêa; Cesar, Roberto M.

BMC Bioinformatics ; 9: 451, 2008 Oct 22.

Article in English | MEDLINE | ID: mdl-18945362

ABSTRACT

BACKGROUND: Feature selection is a pattern recognition approach for choosing important variables according to some criterion in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). Many genomic and proteomic applications rely on feature selection to answer questions such as selecting signature genes that are informative about some biological state, e.g., normal tissues and several types of cancer, or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples for adequately estimating the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point to the best solution for each application. RESULTS: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. CONCLUSION: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. In addition, the environment can be used in other pattern recognition applications, although its main focus is bioinformatics tasks.

Subjects

Computational Biology/methods , Genomics/methods , Automated Pattern Recognition/methods , Software , Algorithms , Bayes Theorem , Statistical Data Interpretation , Internet , Markov Chains , Genetic Models , Reproducibility of Results , User-Computer Interface
