Search | BVS Infectious and Parasitic Diseases

1.

ggmsa: a visual exploration tool for multiple sequence alignment and associated data.

Zhou, Lang; Feng, Tingze; Xu, Shuangbin; Gao, Fangluan; Lam, Tommy T; Wang, Qianwen; Wu, Tianzhi; Huang, Huina; Zhan, Li; Li, Lin; Guan, Yi; Dai, Zehan; Yu, Guangchuang.

Brief Bioinform ; 23(4), 2022 07 18.

Article in English | MEDLINE | ID: mdl-35671504

ABSTRACT

Identifying the conserved and variable regions in a multiple sequence alignment (MSA) is critical to accelerating our understanding of gene function. MSA visualizations transform sequence features into understandable visual representations. As the sequence-structure-function relationship gains increasing attention in molecular biology, a simple display of nucleotide or protein sequence alignments is no longer sufficient. A more scalable visualization is required to broaden the scope of sequence investigation. Here we present ggmsa, an R package for mining comprehensive sequence features and integrating the associated data of an MSA through a variety of display methods. To uncover sequence conservation patterns, variations and recombination at the site level, sequence bundles, sequence logos, stacked sequence alignments and comparative plots are implemented. ggmsa supports integrating the correlation between MSA sequences and their phenotypes, as well as other traits such as ancestral sequences, molecular structures, molecular functions and expression levels. We also design a new visualization method for genome alignments in multiple alignment format to explore patterns of within- and between-species variation. By combining these visual representations with prior knowledge, ggmsa assists researchers in exploring MSAs and making decisions. The ggmsa package is open-source software released under the Artistic-2.0 license, and it is freely available on Bioconductor (https://bioconductor.org/packages/ggmsa) and GitHub (https://github.com/YuLab-SMU/ggmsa).

Subjects

Genome , Software , Amino Acid Sequence , Position-Specific Scoring Matrices , Sequence Alignment
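
The ggmsa abstract above centers on finding conserved and variable alignment columns. As a rough illustration of that idea only, the following Python sketch scores each column of a toy alignment by Shannon entropy (0 bits means fully conserved). It is not ggmsa's API, which is an R package; the function names `column_entropy` and `conservation_profile` and the toy sequences are hypothetical, and the choice to ignore gap characters is an assumption made for simplicity.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps ('-')."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment):
    """Return one entropy value per column of an aligned set of sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment: columns 2 and 3 vary, the rest are conserved.
toy_alignment = ["ACGT-A", "ACGTTA", "ACCTTA", "ATGTTA"]
profile = conservation_profile(toy_alignment)
variable_columns = [i + 1 for i, h in enumerate(profile) if h > 0]
print(variable_columns)
```

Low-entropy columns correspond to the tall, single-letter stacks in a sequence logo; a real analysis would typically also weight sequences and handle gaps more carefully.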

2.

ggtreeExtra: Compact Visualization of Richly Annotated Phylogenetic Data.

Xu, Shuangbin; Dai, Zehan; Guo, Pingfan; Fu, Xiaocong; Liu, Shanshan; Zhou, Lang; Tang, Wenli; Feng, Tingze; Chen, Meijun; Zhan, Li; Wu, Tianzhi; Hu, Erqiang; Jiang, Yong; Bo, Xiaochen; Yu, Guangchuang.

Mol Biol Evol ; 38(9): 4039-4042, 2021 08 23.

Article in English | MEDLINE | ID: mdl-34097064

ABSTRACT

We present the ggtreeExtra package for visualizing heterogeneous data with a phylogenetic tree in a circular or rectangular layout (https://www.bioconductor.org/packages/ggtreeExtra). The package supports more data types and visualization methods than other tools. It supports using the grammar of graphics syntax to present data on a tree with richly annotated layers and allows evolutionary statistics inferred by commonly used software to be integrated and visualized with external data. ggtreeExtra is a universal tool for tree data visualization. It extends the applications of the phylogenetic tree in different disciplines by making more domain-specific data available for visualization and interpretation in an evolutionary context.

Subjects

Phylogeny , Software

3.

Treeio: An R Package for Phylogenetic Tree Input and Output with Richly Annotated and Associated Data.

Wang, Li-Gen; Lam, Tommy Tsan-Yuk; Xu, Shuangbin; Dai, Zehan; Zhou, Lang; Feng, Tingze; Guo, Pingfan; Dunn, Casey W; Jones, Bradley R; Bradley, Tyler; Zhu, Huachen; Guan, Yi; Jiang, Yong; Yu, Guangchuang.

Mol Biol Evol ; 37(2): 599-603, 2020 02 01.

Article in English | MEDLINE | ID: mdl-31633786

ABSTRACT

Phylogenetic trees and data are often stored in incompatible and inconsistent formats. The outputs of software tools that contain trees with analysis findings are often not compatible with each other, making it hard to integrate the results of different analyses in a comparative study. The treeio package is designed to connect phylogenetic tree input and output. It supports extracting phylogenetic trees as well as the outputs of commonly used analytical software. It can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context. Treeio also supports export of a phylogenetic tree with heterogeneous-associated data to a single tree file, including BEAST compatible NEXUS and jtree formats; these facilitate data sharing as well as file format conversion for downstream analysis. The treeio package is designed to work with the tidytree and ggtree packages. Tree data can be processed using the tidy interface with tidytree and visualized by ggtree. The treeio package is released within the Bioconductor and rOpenSci projects. It is available at https://www.bioconductor.org/packages/treeio/.

Subjects

Computational Biology/methods , Data Mining/methods , Internet , Phylogeny , Software
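
The treeio abstract above describes linking external data to a phylogeny. The Python sketch below illustrates that general idea under strong simplifying assumptions: it reads tip labels from a simple, unquoted Newick string with a regular expression and joins them to an external table keyed by label. It is not treeio's interface (treeio is an R package); `tip_labels`, `link_tip_data`, the example tree and the host table are all hypothetical.

```python
import re

# A tip label directly follows "(" or "," in Newick; internal node labels follow ")".
# Assumption: labels are unquoted and contain no Newick punctuation.
TIP_PATTERN = re.compile(r"(?<=[(,])\s*([^():,;\s]+)")


def tip_labels(newick):
    """Return tip labels in the order they appear in a simple Newick string."""
    return TIP_PATTERN.findall(newick)


def link_tip_data(newick, external):
    """Map each tip label to its external record, or None when no record exists."""
    return {tip: external.get(tip) for tip in tip_labels(newick)}


# Hypothetical tree and external annotation table.
tree = "((A:0.1,B:0.2)n1:0.3,C:0.4);"
host_by_tip = {"A": "human", "B": "swine"}

linked = link_tip_data(tree, host_by_tip)
print(linked)
```

Tips without a matching record (here `C`) are kept with a `None` value, so missing annotations stay visible instead of silently dropping taxa from the tree.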

4.

MicrobiotaProcess: A comprehensive R package for deep mining microbiome.

Xu, Shuangbin; Zhan, Li; Tang, Wenli; Wang, Qianwen; Dai, Zehan; Zhou, Lang; Feng, Tingze; Chen, Meijun; Wu, Tianzhi; Hu, Erqiang; Yu, Guangchuang.

Innovation (Camb) ; 4(2): 100388, 2023 Mar 13.

Article in English | MEDLINE | ID: mdl-36895758

ABSTRACT

The data output from microbiome research is growing at an accelerating rate, yet mining the data quickly and efficiently remains difficult. There is still a lack of an effective data structure to represent and manage data, as well as of flexible and composable analysis methods. In response to these two issues, we designed and developed the MicrobiotaProcess package. It provides a comprehensive data structure, MPSE, to better integrate the primary and intermediate data, which improves the integration and exploration of the downstream data. Around this data structure, the downstream analysis tasks are decomposed and a set of functions is designed under a tidy framework. These functions independently perform simple tasks and can be combined to perform complex tasks. This gives users the ability to explore data, conduct personalized analyses, and develop analysis workflows. Moreover, MicrobiotaProcess can interoperate with other packages in the R community, which further expands its analytical capabilities. This article demonstrates the use of MicrobiotaProcess for analyzing microbiome data as well as other ecological data through several examples. It connects upstream data, provides flexible downstream analysis components, and provides visualization methods to assist in presenting and interpreting results.

5.

Stemness Analysis Uncovers That The Peroxisome Proliferator-Activated Receptor Signaling Pathway Can Mediate Fatty Acid Homeostasis In Sorafenib-Resistant Hepatocellular Carcinoma Cells.

Feng, Tingze; Wu, Tianzhi; Zhang, Yanxia; Zhou, Lang; Liu, Shanshan; Li, Lin; Li, Ming; Hu, Erqiang; Wang, Qianwen; Fu, Xiaocong; Zhan, Li; Xie, Zijing; Xie, Wenqin; Huang, Xianying; Shang, Xuan; Yu, Guangchuang.

Front Oncol ; 12: 912694, 2022.

Article in English | MEDLINE | ID: mdl-35957896

ABSTRACT

Hepatocellular carcinoma (HCC) stem cells are regarded as an important factor in individualized HCC treatment and sorafenib resistance. However, a systematic assessment of stem-like indices and their association with the response to sorafenib in HCC is lacking. Our study thus aimed to evaluate the status of tumor dedifferentiation in HCC and further identify the regulatory mechanisms under the condition of resistance to sorafenib. Datasets of HCC, including messenger RNA (mRNA) expression, somatic mutation, and clinical information, were collected. The mRNA expression-based stemness index (mRNAsi), which can represent the degree of dedifferentiation of HCC samples, was calculated to predict the response to sorafenib therapy and prognosis. Next, unsupervised cluster analysis was conducted to distinguish mRNAsi-based subgroups, and gene/gene set functional enrichment analysis was employed to identify key sorafenib resistance-related pathways. In addition, we analyzed and confirmed the regulation of key genes discovered in this study by combining other omics data. Finally, luciferase reporter assays were performed to validate their regulation. Our study demonstrated that the stemness index obtained from transcriptomic data is a promising biomarker for predicting the response to sorafenib therapy and the prognosis in HCC. We revealed that the peroxisome proliferator-activated receptor signaling pathway (the PPAR signaling pathway), which is related to fatty acid biosynthesis, is a potential sorafenib resistance pathway that had not been reported before. By analyzing the core regulatory genes of the PPAR signaling pathway, we identified four candidate target genes, retinoid X receptor beta (RXRB), nuclear receptor subfamily 1 group H member 3 (NR1H3), cytochrome P450 family 8 subfamily B member 1 (CYP8B1) and stearoyl-CoA desaturase (SCD), as a signature to distinguish the response to sorafenib. We proposed and validated that RXRB and NR1H3 could directly regulate NR1H3 and SCD, respectively. Our results suggest that the combined use of SCD inhibitors and sorafenib may be a promising therapeutic approach.

6.

Use ggbreak to Effectively Utilize Plotting Space to Deal With Large Datasets and Outliers.

Xu, Shuangbin; Chen, Meijun; Feng, Tingze; Zhan, Li; Zhou, Lang; Yu, Guangchuang.

Front Genet ; 12: 774846, 2021.

Article in English | MEDLINE | ID: mdl-34795698

ABSTRACT

With the rapid increase of large-scale datasets, biomedical data visualization is facing challenges. The data may be large, span different orders of magnitude, contain extreme values, and have an unclear distribution. Here we present an R package, ggbreak, that allows users to create broken axes using ggplot2 syntax. It can effectively use the plotting area to deal with large datasets (especially long sequential data), data with different magnitudes, and data that contain outliers. The ggbreak package increases the available visual space for a better presentation of the data and detailed annotation, thus improving our ability to interpret the data. The ggbreak package is fully compatible with ggplot2, and it is easy to superpose additional layers and apply scales and themes to adjust the plot using the ggplot2 syntax. The ggbreak package is open-source software released under the Artistic-2.0 license, and it is freely available on CRAN (https://CRAN.R-project.org/package=ggbreak) and GitHub (https://github.com/YuLab-SMU/ggbreak).

7.

clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.

Wu, Tianzhi; Hu, Erqiang; Xu, Shuangbin; Chen, Meijun; Guo, Pingfan; Dai, Zehan; Feng, Tingze; Zhou, Lang; Tang, Wenli; Zhan, Li; Fu, Xiaocong; Liu, Shanshan; Bo, Xiaochen; Yu, Guangchuang.

Innovation (Camb) ; 2(3): 100141, 2021 Aug 28.

Article in English | MEDLINE | ID: mdl-34557778

ABSTRACT

Functional enrichment analysis is pivotal for interpreting high-throughput omics data in life science. It is crucial for this type of tool to use the latest annotation databases for as many organisms as possible. To meet these requirements, we present here an updated version of our popular Bioconductor package, clusterProfiler 4.0. This package has been enhanced considerably compared with its original version published 9 years ago. The new version provides a universal interface for functional enrichment analysis in thousands of organisms based on internally supported ontologies and pathways as well as annotation data provided by users or derived from online databases. It also extends the dplyr and ggplot2 packages to offer tidy interfaces for data operation and visualization. Other new features include gene set enrichment analysis and comparison of enrichment results from multiple gene lists. We anticipate that clusterProfiler 4.0 will be applied to a wide range of scenarios across diverse organisms.
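
Functional enrichment analysis of the kind described in the clusterProfiler abstract above is commonly based on an over-representation test. As a hedged illustration, the Python sketch below computes a one-sided hypergeometric p-value for the overlap between a query gene list and a gene set. This is a generic statistical sketch, not clusterProfiler's R interface; the function name `overrepresentation_pvalue`, its parameter names and the example counts are hypothetical, and it requires Python 3.8 or later for `math.comb`.

```python
from math import comb


def overrepresentation_pvalue(background, in_set, query, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of genes in the background universe
    in_set:     number of background genes annotated to the gene set
    query:      number of genes in the query list
    overlap:    number of query genes annotated to the gene set
    """
    if not 0 <= overlap <= min(in_set, query) or max(in_set, query) > background:
        raise ValueError("inconsistent counts")
    # Count the ways to draw `query` genes with at least `overlap` hits in the set.
    favorable = sum(
        comb(in_set, i) * comb(background - in_set, query - i)
        for i in range(overlap, min(in_set, query) + 1)
    )
    return favorable / comb(background, query)


# Hypothetical counts: 20 background genes, 5 in the set, 4 queried, 2 overlapping.
p_value = overrepresentation_pvalue(background=20, in_set=5, query=4, overlap=2)
print(round(p_value, 4))
```

In practice this test is run once per gene set, and the resulting p-values are then adjusted for multiple testing before interpretation.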
