Results 1 - 15 of 15
1.
Nat Protoc ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39019974

ABSTRACT

With the advent of multiomics, software capable of multidimensional enrichment analysis has become increasingly crucial for uncovering gene set variations in biological processes and disease pathways. This is essential for elucidating disease mechanisms and identifying potential therapeutic targets. clusterProfiler stands out for its comprehensive utilization of databases and advanced visualization features. Importantly, clusterProfiler supports various sources of biological knowledge, including Gene Ontology and Kyoto Encyclopedia of Genes and Genomes, by performing over-representation and gene set enrichment analyses. A key feature is that clusterProfiler allows users to choose from various graphical outputs to visualize results, enhancing interpretability. This protocol describes innovative ways in which clusterProfiler has been used for integrating metabolomics and metagenomics analyses, identifying and characterizing transcription factors under stress conditions, and annotating cells in single-cell studies. In all cases, the computational steps can be completed within ~2 min. clusterProfiler is released through the Bioconductor project and can be accessed via https://bioconductor.org/packages/clusterProfiler/.
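For readers unfamiliar with the over-representation analysis mentioned in this abstract, the statistic behind it is commonly a one-sided hypergeometric test. The following is a minimal Python sketch of that test under this assumption. It is not clusterProfiler's R implementation, and the function name `ora_p_value` and its parameters are hypothetical.

```python
from math import comb


def ora_p_value(universe_size, set_size, list_size, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe_size: number of background genes
    set_size:      background genes annotated to the gene set
    list_size:     genes in the query list
    overlap:       query genes that belong to the gene set
    (All names are illustrative, not taken from any package.)
    """
    if not (0 <= set_size <= universe_size and 0 <= list_size <= universe_size):
        raise ValueError("set_size and list_size must lie within the universe")
    max_overlap = min(set_size, list_size)
    start = max(overlap, 0)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, list_size - i)
        for i in range(start, max_overlap + 1)
    )
    return tail / comb(universe_size, list_size)
```

A call such as `ora_p_value(20000, 100, 200, 5)` would give the probability of seeing at least 5 annotated genes in a 200-gene list drawn from 20,000 background genes, 100 of which carry the annotation. In practice the resulting p-values would still need multiple-testing correction across gene sets.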

2.
Innovation (Camb) ; 4(2): 100388, 2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36895758

ABSTRACT

The data output from microbiome research is growing at an accelerating rate, yet mining the data quickly and efficiently remains difficult. There is still a lack of an effective data structure to represent and manage data, as well as of flexible and composable analysis methods. In response to these two issues, we designed and developed the MicrobiotaProcess package. It provides a comprehensive data structure, MPSE, to better integrate the primary and intermediate data, which improves the integration and exploration of the downstream data. Around this data structure, the downstream analysis tasks are decomposed and a set of functions is designed under a tidy framework. These functions independently perform simple tasks and can be combined to perform complex tasks. This gives users the ability to explore data, conduct personalized analyses, and develop analysis workflows. Moreover, MicrobiotaProcess can interoperate with other packages in the R community, which further expands its analytical capabilities. This article demonstrates the use of MicrobiotaProcess for analyzing microbiome data as well as other ecological data through several examples. It connects upstream data, provides flexible downstream analysis components, and provides visualization methods to assist in presenting and interpreting results.

3.
Curr Protoc ; 2(10): e585, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36286622

ABSTRACT

In many aspects of life, epigenetics, or the altering of phenotype without changes in sequences, plays an essential role in biological function. A vast number of epigenomic datasets are emerging as a result of the advent of next-generation sequencing. Annotation, comparison, visualization, and interpretation of epigenomic datasets remain key aspects of computational biology. ChIPseeker is a Bioconductor package for performing these analyses among various epigenomic datasets. The fundamental functions of ChIPseeker, including data preparation, annotation, comparison, and visualization, are explained in this article. ChIPseeker is a freely available open-source package that may be found at https://www.bioconductor.org/packages/ChIPseeker. © 2022 Wiley Periodicals LLC.
Basic Protocol 1: ChIPseeker and epigenomic dataset preparation
Basic Protocol 2: Annotation of epigenomic datasets
Basic Protocol 3: Comparison of epigenomic datasets
Basic Protocol 4: Visualization of annotated results
Basic Protocol 5: Functional analysis of epigenomic datasets
Basic Protocol 6: Genome-wide and locus-specific distribution of epigenomic datasets
Basic Protocol 7: Heatmaps and metaplots of epigenomic datasets


Subjects
Epigenomics , Software , Epigenomics/methods , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Genome
4.
Front Oncol ; 12: 912694, 2022.
Article in English | MEDLINE | ID: mdl-35957896

ABSTRACT

Hepatocellular carcinoma (HCC) stem cells are regarded as an important part of individualized HCC treatment and sorafenib resistance. However, a systematic assessment of stem-like indices and their association with sorafenib response in HCC is lacking. Our study thus aimed to evaluate the status of tumor dedifferentiation in HCC and to identify the regulatory mechanisms underlying sorafenib resistance. Datasets of HCC, including messenger RNA (mRNA) expression, somatic mutation, and clinical information, were collected. The mRNA expression-based stemness index (mRNAsi), which represents the degree of dedifferentiation of HCC samples, was calculated to predict prognosis and response to sorafenib therapy. Next, unsupervised cluster analysis was conducted to distinguish mRNAsi-based subgroups, and gene/gene set functional enrichment analysis was employed to identify key sorafenib resistance-related pathways. In addition, we analyzed and confirmed the regulation of key genes discovered in this study by combining other omics data. Finally, luciferase reporter assays were performed to validate their regulation. Our study demonstrated that the stemness index obtained from transcriptomic data is a promising biomarker for predicting the response to sorafenib therapy and the prognosis in HCC. We revealed that the peroxisome proliferator-activated receptor (PPAR) signaling pathway, which is related to fatty acid biosynthesis, is a potential sorafenib resistance pathway that had not been reported before. By analyzing the core regulatory genes of the PPAR signaling pathway, we identified four candidate target genes, retinoid X receptor beta (RXRB), nuclear receptor subfamily 1 group H member 3 (NR1H3), cytochrome P450 family 8 subfamily B member 1 (CYP8B1) and stearoyl-CoA desaturase (SCD), as a signature to distinguish the response to sorafenib. We proposed and validated that RXRB and NR1H3 could directly regulate NR1H3 and SCD, respectively. Our results suggest that the combined use of SCD inhibitors and sorafenib may be a promising therapeutic approach.

5.
Innovation (Camb) ; 2(3): 100141, 2021 Aug 28.
Article in English | MEDLINE | ID: mdl-34557778

ABSTRACT

Functional enrichment analysis is pivotal for interpreting high-throughput omics data in life science. It is crucial for this type of tool to use the latest annotation databases for as many organisms as possible. To meet these requirements, we present here an updated version of our popular Bioconductor package, clusterProfiler 4.0. This package has been enhanced considerably compared with its original version published 9 years ago. The new version provides a universal interface for functional enrichment analysis in thousands of organisms based on internally supported ontologies and pathways as well as annotation data provided by users or derived from online databases. It also extends the dplyr and ggplot2 packages to offer tidy interfaces for data operation and visualization. Other new features include gene set enrichment analysis and comparison of enrichment results from multiple gene lists. We anticipate that clusterProfiler 4.0 will be applied to a wide range of scenarios across diverse organisms.
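As a rough illustration of the gene set enrichment analysis named in this abstract, the sketch below computes a running-sum enrichment score over a ranked gene list, in the spirit of the widely used GSEA statistic. It is an assumption-laden Python sketch, not the package's R code: the function name, the `weight` default, and the input layout are all hypothetical, and no permutation-based significance testing is included.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Signed maximum deviation of a GSEA-style running sum.

    ranked_genes: gene identifiers sorted by decreasing ranking metric
    scores:       ranking metric for each gene, in the same order
    gene_set:     collection of genes to test
    weight:       exponent applied to |score| for hits (illustrative default)
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up on a hit (weighted by the metric), step down on a miss.
        running += abs(s) ** weight / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best
```

With this convention, a gene set concentrated at the top of the ranking gives a score near +1 and one concentrated at the bottom gives a score near -1.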

6.
Mol Biol Evol ; 38(9): 4039-4042, 2021 08 23.
Article in English | MEDLINE | ID: mdl-34097064

ABSTRACT

We present the ggtreeExtra package for visualizing heterogeneous data with a phylogenetic tree in a circular or rectangular layout (https://www.bioconductor.org/packages/ggtreeExtra). The package supports more data types and visualization methods than other tools. It supports using the grammar of graphics syntax to present data on a tree with richly annotated layers and allows evolutionary statistics inferred by commonly used software to be integrated and visualized with external data. ggtreeExtra is a universal tool for tree data visualization. It extends the applications of the phylogenetic tree in different disciplines by making more domain-specific data available for visualization and interpretation in an evolutionary context.


Subjects
Phylogeny , Software
7.
PeerJ ; 9: e11421, 2021.
Article in English | MEDLINE | ID: mdl-34178436

ABSTRACT

BACKGROUND: The global spread of the COVID-19 coronavirus is still a serious public health challenge. Although a large number of public resources provide statistical data, tools for retrospective access to historical data and for convenient visualization are still valuable. To provide convenient access to data and visualization on the pandemic, we developed an R package, nCov2019 (https://github.com/YuLab-SMU/nCov2019). METHODS: We collect stable and reliable data on COVID-19 cases from multiple authoritative and up-to-date sources, and aggregate the most recent and historical data for each country or even province. Medical progress information, including global vaccine development and therapeutic candidates, was also collected and can be directly accessed in our package. The nCov2019 package provides R language interfaces and functions for data operation and presentation, a set of interfaces to fetch data subsets intuitively, visualization methods, and a dashboard that requires no extra coding for data exploration and interactive analysis. RESULTS: As of January 14, 2021, the global health crisis is still serious. The number of confirmed cases worldwide has reached 91,268,983. Following the USA, India has reached 10 million confirmed cases. Multiple peaks are observed in many countries. Under the efforts of researchers, 51 vaccines and 54 drugs are under development and 14 of these vaccines are already in the pre-clinical phase. DISCUSSION: The nCov2019 package provides detailed statistical data, visualization functions and a Shiny web application, which allow researchers to keep abreast of the latest spread of the epidemic.

8.
Front Genet ; 12: 625236, 2021.
Article in English | MEDLINE | ID: mdl-33643387

ABSTRACT

A growing body of evidence has suggested the clinical importance of stromal and immune cells in the liver cancer microenvironment. However, reliable prognostic signatures based on assessments of stromal and immune components have not been well established. This study aimed to identify stromal-immune score-based potential prognostic biomarkers for hepatocellular carcinoma. Stromal and immune scores were estimated from transcriptomic profiles of a liver cancer cohort from The Cancer Genome Atlas using the ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumors using Expression data) algorithm. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to select prognostic genes. Favorable overall survival and progression-free interval were found in patients with high stromal and immune scores, and 828 differentially expressed genes were identified. Functional enrichment analysis and protein-protein interaction networks further showed that these genes mainly participated in immune response, extracellular matrix, and cell adhesion. MMP9 (matrix metallopeptidase 9) was identified as a prognostic tumor microenvironment-associated gene by using the LASSO and TIMER (Tumor IMmune Estimation Resource) algorithms and was found to be positively correlated with immunosuppressive molecules and drug response.

9.
Biosci Rep ; 40(1)2020 01 31.
Article in English | MEDLINE | ID: mdl-31919492

ABSTRACT

Ischemic cardiomyopathy (ICM) is a common human heart disease that causes death. No effective biomarkers for ICM could be found in existing databases, which is detrimental to the in-depth study of this disease. In the present study, ICM susceptibility biomarkers were identified using a proposed strategy based on RNA-Seq and miRNA-Seq data of ICM and normal samples. Significantly differentially expressed competing endogenous RNA (ceRNA) triplets were constructed using permutation tests and differentially expressed mRNAs, miRNAs and lncRNAs. Candidate ICM susceptible genes were screened out as differentially expressed genes in significantly differentially expressed ceRNA triplets enriched in ICM-related functional classes. Finally, eight ICM susceptibility genes and their significantly correlated lncRNAs with high classification accuracy were identified as ICM susceptibility biomarkers. These biomarkers would contribute to the diagnosis and treatment of ICM. The proposed strategy could be extended to other complex diseases without disease biomarkers in public databases.


Subjects
Biomarkers/metabolism , Cardiomyopathies/diagnosis , Cardiomyopathies/genetics , RNA/genetics , Gene Regulatory Networks/genetics , Humans
10.
J Cell Physiol ; 235(11): 7960-7969, 2020 11.
Article in English | MEDLINE | ID: mdl-31943201

ABSTRACT

Breast cancer is the most common cancer causing death among females worldwide. A network-based integration method was proposed to identify potential breast cancer genes. First, genes were prioritized using a gene prioritization algorithm based on the transfer of disease risk between genes in a network with weighted vertices and edges. Our prioritization algorithm was effective and robust with respect to the number of top-ranked seed genes and achieved higher area under the curve values than ToppGene and ToppNet. Then, 20 potential breast cancer genes were identified as the genes shared among the top 50 candidate genes, chosen for their robustness across multiple prioritizations. These genes could accurately classify tumor and normal samples in the full and paired sample sets and in three independent datasets. Of the potential breast cancer genes, 18 were verified by the literature and 2 were novel genes that need further study. This study may contribute to the understanding of the genetic architecture of breast cancer for its diagnosis and treatment.
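The area under the curve values used in this abstract to compare prioritization methods can be computed without plotting a ROC curve, using the rank-based (Mann-Whitney) formulation. The Python sketch below shows that general technique only. It is not the authors' code, and the function name `roc_auc` and the input layout are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: prediction score per item (higher means more likely positive)
    labels: truthy for positives (e.g. known disease genes), falsy otherwise
    Returns the fraction of positive/negative pairs ranked correctly,
    counting ties as half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative item")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of items, which is acceptable for a sketch. A sort-based rank sum would be the usual choice for large gene lists.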


Subjects
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Neoplasm Proteins/genetics , Algorithms , Area Under Curve , Breast Neoplasms/diagnosis , Breast Neoplasms/pathology , Female , Gene Expression Regulation, Neoplastic/genetics , Gene Regulatory Networks/genetics , Humans , ROC Curve
11.
Biomed Res Int ; 2020: 3854196, 2020.
Article in English | MEDLINE | ID: mdl-33457407

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a complex disease caused by the disturbance of genetic and environmental factors. Single-nucleotide polymorphisms (SNPs) play a vital role in the genetic dissection of complex diseases. In-depth analysis of SNP-related information could recognize disease-associated biomarkers and further uncover the genetic mechanism of complex diseases. Risk-related variants might act on the disease by affecting gene expression and gene function. Through integrating SNP disease association study and expression quantitative trait loci (eQTL) analysis, as well as functional enrichment of containing known causal genes, four risk SNPs and four corresponding susceptibility genes were identified utilizing next-generation sequencing (NGS) data of COPD. Of the four risk SNPs, one could be found in the SNPedia database that stored disease-related SNPs and has been linked to a disease in the literature. Four genes showed significant differences from the perspective of normal/disease or variant/nonvariant samples, as well as the high performance of sample classification. It is speculated that the four susceptibility genes could be used as biomarkers of COPD. Furthermore, three of our susceptibility genes have been confirmed in the literature to be associated with COPD. Among them, two genes had an impact on the significance of expression correlation of known causal genes they interact with, respectively. Overall, this research may present novel insights into the diagnosis and pathogenesis of COPD and susceptibility gene identification of other complex diseases.


Subjects
Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Pulmonary Disease, Chronic Obstructive/genetics , Quantitative Trait Loci , Algorithms , Biomarkers/metabolism , Cluster Analysis , Computational Biology , Gene Expression , Genome-Wide Association Study , Genotype , Humans , Pulmonary Disease, Chronic Obstructive/diagnosis , RNA-Seq , ROC Curve , Risk , Sensitivity and Specificity
12.
Aging (Albany NY) ; 11(24): 12131-12146, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31860871

ABSTRACT

Breast cancer is one of the most common malignant cancers among females worldwide. This complex disease is not caused by a single gene, but results from multi-gene interactions, which can be represented by biological networks. Network modules are composed of genes with significant similarities in terms of expression, function and disease association. Therefore, the identification of disease risk modules could contribute to understanding the molecular mechanisms underlying breast cancer. In this paper, an integrated disease risk module identification strategy was proposed according to a multi-objective programming model for two similarity criteria as well as significance of permutation tests in Markov random field module score, function consistency score and Pearson correlation coefficient difference score. Three breast cancer risk modules were identified from a breast cancer-related interaction network. Genes in these risk modules were confirmed to play critical roles in breast cancer by literature review. These risk modules were enriched in breast cancer-related pathways or functions and could distinguish between breast tumor and normal samples with high accuracy, not only for the microarray dataset used for breast cancer risk module identification but also for two other independent datasets. Our integrated strategy could be extended to other complex diseases to identify their risk modules and reveal their pathogenesis.


Subjects
Breast Neoplasms/genetics , Genetic Predisposition to Disease , Models, Genetic , Software , Algorithms , Female , Gene Expression Regulation, Neoplastic , Humans , Risk Factors
13.
BMC Med Genet ; 20(1): 177, 2019 11 12.
Article in English | MEDLINE | ID: mdl-31718573

ABSTRACT

BACKGROUND: Lung cancer, especially non-small cell lung cancer (NSCLC), is a leading cause of death from cancer worldwide. Markers of progression in lung adenocarcinoma, the main type of NSCLC, have rarely been studied. Programmed death 1 (PD-1) is an effective drug target for the treatment of NSCLC. Our study aimed to examine the role of PD-1 in the disease process. The study of the effect of polymorphisms on the progression of lung adenocarcinoma in the Han population of Northeast China may provide a valuable reference for the research and application of these drugs. METHODS: The chi-square test, the Wilcoxon rank sum test, and classification efficiency assessment were used to test SNPs of PD-1 in 287 patients, in combination with clinical information. RESULTS: We successfully identified biomarkers (rs2227981, rs2227982, and rs3608432) that could distinguish between early-stage and late-stage lung adenocarcinoma patients. Multiple clinical indicators showed significant differences among different SNPs and cancer stages. Furthermore, this gene was confirmed to effectively distinguish the stages of lung adenocarcinoma with RNA-seq data in TCGA. CONCLUSIONS: Our study indicated that the PD-1 gene and the SNPs on it could be used as markers for distinguishing lung adenocarcinoma staging in the Northeast Han population. Our investigation into the link between PD-1 polymorphisms and lung adenocarcinoma would help to provide guidance for the treatment and prognosis of lung adenocarcinoma.


Subjects
Adenocarcinoma/genetics , Ethnicity/genetics , Genetic Predisposition to Disease , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Programmed Cell Death 1 Receptor/genetics , Adult , Aged , Aged, 80 and over , Case-Control Studies , China , Cross-Sectional Studies , Disease Progression , Female , Humans , Male , Middle Aged , Prognosis , Risk Factors
14.
Oncol Lett ; 13(5): 3935-3941, 2017 May.
Article in English | MEDLINE | ID: mdl-28529601

ABSTRACT

Breast cancer is one of the leading causes of mortality in females. A number of prognostic markers have been identified, including single genes, multi-gene signatures and network modules; however, the robustness of these prognostic markers is insufficient. Thus, the present study proposed a more robust method to identify breast cancer prognostic modules based on weighted protein-protein interaction networks, by integrating four sets of disease-associated expression profiles. Three identified prognostic modules were closely associated with prognosis-associated functions and survival time, as determined by Cox regression and Kaplan-Meier survival analyses. The robustness of these modules was verified with an independent profile from another platform. Genes from these modules may be useful as breast cancer prognostic markers. The prognostic modules could be used to determine the prognoses of patients with breast cancer and characterize patient recovery.

15.
Oncotarget ; 7(50): 82063-82073, 2016 Dec 13.
Article in English | MEDLINE | ID: mdl-27852050

ABSTRACT

Ischemic cardiomyopathy (ICM) is an important cause of heart failure, yet no ICM disease genes are stored in any public database. Mutations of genes provided by RNA-Seq data could set a foundation for a variety of biological processes. This also made it possible to elucidate the mechanism and identify potential genes for ICM. In this paper, an integrated co-expression network was constructed using univariate and bivariate canonical correlation analysis for RNA-Seq data of human ICM samples. Three ICM-related modules were recognized after comparing the Pearson correlation coefficients of ICM samples with those of normal controls. Furthermore, 32 potential ICM genes were identified from the ICM-related modules by considering protein-protein interactions. Most of these genes were verified, using OMIM and the literature, to be involved in ICM and the diseases caused by it. Our study could provide a novel perspective on potential gene identification and on the pathogenesis of ICM and other complex diseases.
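The comparison of Pearson correlation coefficients between disease samples and normal controls described in this abstract can be illustrated for a single gene pair. The Python sketch below assumes expression values are supplied as plain lists. It shows only the generic calculation, not the authors' pipeline, and the function names `pearson` and `correlation_shift` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlation_shift(a_case, b_case, a_ctrl, b_ctrl):
    """Co-expression of genes A and B in cases minus that in controls."""
    return pearson(a_case, b_case) - pearson(a_ctrl, b_ctrl)
```

A large absolute shift marks a gene pair whose co-expression differs between the two groups. Judging whether a shift is significant would need an additional step such as a permutation test, which this sketch omits.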


Subjects
Cardiomyopathies/genetics , Myocardial Ischemia/genetics , RNA/genetics , Sequence Analysis, RNA , Cardiomyopathies/diagnosis , Cardiomyopathies/etiology , Case-Control Studies , Computational Biology , Databases, Genetic , Gene Expression Regulation , Gene Regulatory Networks , Genetic Association Studies , Genetic Markers , Genetic Predisposition to Disease , Humans , Myocardial Ischemia/complications , Myocardial Ischemia/diagnosis , Phenotype , Protein Interaction Maps , Systems Integration