Search | VHL Regional Portal (Portal Regional da BVS)

1.

Characterization of circulating miRNAs in the treatment of primary liver tumors.

Umezu, Tomohiro; Tanaka, Shogo; Kubo, Shoji; Enomoto, Masaru; Tamori, Akihiro; Ochiya, Takahiro; Taguchi, Y-H; Kuroda, Masahiko; Murakami, Yoshiki.

Cancer Rep (Hoboken) ; 7(2): e1964, 2024 02.

Article in English | MEDLINE | ID: mdl-38146079

ABSTRACT

BACKGROUND AND AIM: Circulating microRNAs (miRNAs) indicate clinical pathologies such as inflammation and carcinogenesis. In this study, we aimed to investigate whether miRNA expression patterns could be used to diagnose hepatocellular carcinoma (HCC) and biliary tract cancer (BTC), and to examine the relationship between miRNA expression patterns and cancer etiology. METHODS: Patients with HCC and BTC with indications for surgery were selected for the study. Total RNA was extracted from the extracellular vesicle (EV)-rich fraction of the serum and analyzed using a Toray miRNA microarray. Samples were divided into two cohorts in order of collection: the first 85 HCC cases were analyzed using a microarray based on miRBase ver. 20 (hereafter v20 cohort), and the subsequent 177 HCC and 43 BTC cases were analyzed using a microarray based on miRBase ver. 21 (hereafter v21 cohort). RESULTS: Using miRNA expression patterns, we found that HCC and BTC could be distinguished with an area under the curve (AUC) of 0.754 (v21 cohort). Patients with anti-hepatitis C virus (HCV) treatment (SVR-HCC) and without antiviral treatment (HCV-HCC) could be distinguished with an AUC of 0.811 (v20 cohort) and 0.798 (v21 cohort). CONCLUSIONS: In this study, we could diagnose primary hepatic malignant tumors using miRNA expression patterns. Moreover, the difference in miRNA expression between SVR-HCC and HCV-HCC can be important information not only for identifying cases that are prone to carcinogenesis after being cured with antiviral agents, but also for uncovering the mechanism by which some carcinogenic potential remains even after persistent virus infection has disappeared.

Subjects

Hepatocellular Carcinoma , Hepatitis C , Liver Neoplasms , MicroRNAs , Humans , Liver Neoplasms/diagnosis , Liver Neoplasms/genetics , Liver Neoplasms/therapy , Hepatocellular Carcinoma/diagnosis , Hepatocellular Carcinoma/genetics , Hepatocellular Carcinoma/therapy , MicroRNAs/genetics , Hepacivirus/genetics , Carcinogenesis
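
Editorial illustration: the AUC figures quoted in the abstract above summarize how well a score separates two groups. A minimal sketch of the rank-based (Mann-Whitney) definition of AUC follows. The function name `auc` and the score values are hypothetical, chosen only for illustration; they are not data or code from the study, and the abstract does not say how its AUC values were computed.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case scores higher than a randomly chosen negative case.
    Ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for two groups (not study data).
group_a_scores = [0.9, 0.8, 0.6, 0.4]
group_b_scores = [0.7, 0.5, 0.3, 0.2]

print(auc(group_a_scores, group_b_scores))
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which values such as 0.754 or 0.811 are read.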

2.

Investigating Neuron Degeneration in Huntington's Disease Using RNA-Seq Based Transcriptome Study.

Sneha, Nela Pragathi; Dharshini, S Akila Parvathy; Taguchi, Y-H; Gromiha, M Michael.

Genes (Basel) ; 14(9)2023 09 14.

Article in English | MEDLINE | ID: mdl-37761940

ABSTRACT

Huntington's disease (HD) is a progressive neurodegenerative disorder caused by a CAG repeat expansion in the huntingtin (HTT) gene. The primary symptoms of HD include motor dysfunction such as chorea, dystonia, and involuntary movements. The primary motor cortex (BA4) is the key brain region responsible for executing motor/movement activities. Investigating patient and control samples from the BA4 region will provide a deeper understanding of the genes responsible for neuron degeneration and help to identify potential markers. Previous studies have focused on overall differential gene expression and associated biological functions. In this study, we illustrate the relationship between variants and differentially expressed genes/transcripts. We identified variants and their associated genes along with the quantification of genes and transcripts. We also predicted the effect of variants on various regulatory activities and found that many variants regulate gene expression. Variants affecting miRNA and its targets are also highlighted in our study. Co-expression network studies revealed the role of novel genes. Functional interaction network analysis unveiled the importance of genes involved in vesicle-mediated transport. From this unified approach, we propose that genes expressed in immune cells are crucial for reducing neuron death in HD.

Subjects

Chorea , Huntington Disease , Humans , Huntington Disease/genetics , RNA-Seq , Transcriptome/genetics , Nerve Degeneration

3.

Application note: TDbasedUFE and TDbasedUFEadv: bioconductor packages to perform tensor decomposition based unsupervised feature extraction.

Taguchi, Y-H; Turki, Turki.

Front Artif Intell ; 6: 1237542, 2023.

Article in English | MEDLINE | ID: mdl-37719083

ABSTRACT

Motivation: Tensor decomposition (TD)-based unsupervised feature extraction (FE) has proven effective for a wide range of bioinformatics applications ranging from biomarker identification to the identification of disease-causing genes and drug repositioning. However, TD-based unsupervised FE failed to gain widespread acceptance due to the lack of user-friendly tools for non-experts. Results: We developed two Bioconductor packages, TDbasedUFE and TDbasedUFEadv, that enable researchers unfamiliar with TD to utilize TD-based unsupervised FE. The packages facilitate the identification of differentially expressed genes and multiomics analysis. TDbasedUFE was found to outperform two state-of-the-art methods, DESeq2 and DIABLO. Availability and implementation: TDbasedUFE and TDbasedUFEadv are freely available as R/Bioconductor packages, which can be accessed at https://bioconductor.org/packages/TDbasedUFE and https://bioconductor.org/packages/TDbasedUFEadv, respectively.

4.

Integrated analysis of human DNA methylation, gene expression, and genomic variation in iMETHYL database using kernel tensor decomposition-based unsupervised feature extraction.

Taguchi, Y-H; Komaki, Shohei; Sutoh, Yoichi; Ohmomo, Hideki; Otsuka-Yamasaki, Yayoi; Shimizu, Atsushi.

PLoS One ; 18(8): e0289029, 2023.

Article in English | MEDLINE | ID: mdl-37556429

ABSTRACT

Integrating gene expression, DNA methylation, and genomic variants simultaneously without location coincidence (i.e., irrespective of distance from each other) or pairwise coincidence (i.e., direct identification of triplets of gene expression, DNA methylation, and genomic variants, and not integration of pairwise coincidences) is difficult. In this study, we integrated gene expression, DNA methylation, and genome variants from the iMETHYL database using the recently proposed kernel tensor decomposition-based unsupervised feature extraction method with limited computational resources (i.e., short CPU time and small memory requirements). Our methods do not require prior knowledge of the subjects because they are fully unsupervised in that unsupervised tensor decomposition is used. The selected genes and genomic variants were significantly targeted by transcription factors that were biologically enriched in KEGG pathway terms as well as in the intra-related regulatory network. The proposed method is promising for integrated analyses of gene expression, methylation, and genomic variants with limited computational resources.

Subjects

DNA Methylation , Transcription Factors , Humans , Factual Databases , Genomics , Gene Expression

5.

Tensor decomposition discriminates tissues using scATAC-seq.

Taguchi, Y-H; Turki, Turki.

Biochim Biophys Acta Gen Subj ; 1867(6): 130360, 2023 06.

Article in English | MEDLINE | ID: mdl-37003566

ABSTRACT

ATAC-seq is a powerful tool for measuring the landscape structure of a chromosome. scATAC-seq is a recently updated version of ATAC-seq performed in a single cell. The problem with scATAC-seq is data sparsity: most of the genomic sites are inaccessible. Here, tensor decomposition (TD) was used to fill in missing values. In this study, TD was applied to massive scATAC-seq datasets binned into intervals of approximately 200 bp; the number of intervals can reach 13,627,618. Currently, no other methods can deal with such large sparse matrices. The proposed method could not only provide UMAP embedding that coincides with tissue specificity, but also select genes associated with various biological enrichment terms and transcription factor targeting. This suggests that TD is a useful tool to process a large sparse matrix generated from scATAC-seq.

Subjects

Chromatin , Genome , Gene Expression Regulation , Transcription Factors/metabolism
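
Editorial illustration: the abstract above describes scATAC-seq signal organized into intervals of roughly 200 bp, most of which are empty. A minimal sketch of that kind of binning is shown below, storing only non-empty intervals so that the sparsity is not materialized. The function name `bin_reads`, the 0-based coordinate convention, and the read positions are assumptions made for illustration; the abstract does not describe the authors' actual preprocessing.

```python
from collections import Counter

BIN_SIZE = 200  # interval width in bp, the approximate value given in the abstract


def bin_reads(positions, bin_size=BIN_SIZE):
    """Count reads per fixed-width genomic interval.

    `positions` is an iterable of (chromosome, 0-based position) pairs.
    Only intervals that contain at least one read are stored, so empty
    (inaccessible) intervals cost no memory.
    """
    counts = Counter()
    for chrom, pos in positions:
        counts[(chrom, pos // bin_size)] += 1
    return counts


# Hypothetical read positions (not study data).
reads = [("chr1", 10), ("chr1", 150), ("chr1", 399), ("chr2", 400)]
sparse_counts = bin_reads(reads)
print(dict(sparse_counts))
```

Keeping the counts in a sparse mapping like this is one common way to hold such data before passing it to a decomposition routine that accepts sparse input.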

6.

Principal component analysis- and tensor decomposition-based unsupervised feature extraction to select more suitable differentially methylated cytosines: Optimization of standard deviation versus state-of-the-art methods.

Taguchi, Y-H; Turki, Turki.

Genomics ; 115(2): 110577, 2023 03.

Article in English | MEDLINE | ID: mdl-36804268

ABSTRACT

In contrast to RNA-seq analysis, which has various standard methods, no standard methods for identifying differentially methylated cytosines (DMCs) exist. To identify DMCs, we tested principal component analysis and tensor decomposition-based unsupervised feature extraction with optimized standard deviation, which has been shown to be effective for differentially expressed gene (DEG) identification. The proposed method outperformed certain conventional methods, including those that assume beta-binomial distribution for methylation as the proposed method does not require this, especially when applied to methylation profiles measured using high throughput sequencing. DMCs identified by the proposed method also significantly overlapped with various functional sites, including known differentially methylated regions, enhancers, and DNase I hypersensitive sites. The proposed method was applied to data sets retrieved from The Cancer Genome Atlas to identify DMCs using American Joint Committee on Cancer staging system edition labels. This suggests that the proposed method is a promising standard method for identifying DMCs.

Subjects

DNA Methylation , Genome , CpG Islands , Principal Component Analysis

7.

Features extracted using tensor decomposition reflect the biological features of the temporal patterns of human blood multimodal metabolome.

Fujita, Suguru; Karasawa, Yasuaki; Hironaka, Ken-Ichi; Taguchi, Y-H; Kuroda, Shinya.

PLoS One ; 18(2): e0281594, 2023.

Article in English | MEDLINE | ID: mdl-36791130

ABSTRACT

High-throughput omics technologies have enabled the profiling of entire biological systems. For the biological interpretation of such omics data, two kinds of analysis have been used: hypothesis-driven analysis and data-driven analysis, the latter including tensor decomposition. Both have their own advantages and disadvantages and are mutually complementary; however, a direct comparison of the two for omics data has rarely been made. We applied tensor decomposition (TD) to a dataset representing changes in the concentrations of 562 blood molecules at 14 time points in 20 healthy human subjects after ingestion of 75 g oral glucose. We characterized each molecule by individual dependence (constant or variable) and time dependence (later peak or early peak). Three of the four features extracted by TD were characterized by our previous hypothesis-driven study, indicating that TD can extract some of the same features obtained by hypothesis-driven analysis in a non-biased manner. In contrast to the years taken for our previous hypothesis-driven analysis, the data-driven analysis in this study took days, indicating that TD can extract biological features in a non-biased manner without the time-consuming process of hypothesis generation.

Subjects

Blood , Metabolome , Humans , Blood Chemical Analysis

8.

A new machine learning based computational framework identifies therapeutic targets and unveils influential genes in pancreatic islet cells.

Turki, Turki; Taguchi, Y-H.

Gene ; 853: 147038, 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36503891

ABSTRACT

Pancreatic islets comprise a group of cells that produce hormones regulating blood glucose levels. Particularly, the alpha and beta islet cells produce glucagon and insulin to stabilize blood glucose. When beta islet cells are dysfunctional, insulin is not secreted, inducing a glucose metabolic disorder. Identifying effective therapeutic targets against the disease is a complicated task and is not yet conclusive. To close the wide gap between understanding the molecular mechanism of pancreatic islet cells and providing effective therapeutic targets, we present a computational framework to identify potential therapeutic targets against pancreatic disorders. First, we downloaded three transcriptome expression profiling datasets pertaining to pancreatic islet cells (GSE87375, GSE79457, GSE110154) from the Gene Expression Omnibus database. For each dataset, we extracted expression profiles for two cell types. We then provided these expression profiles along with the cell types to our proposed constrained optimization problem of a support vector machine and to other existing methods, selecting important genes from the expression profiles. Finally, we performed (1) an evaluation from a classification perspective which showed the superiority of our methods against the baseline; and (2) an enrichment analysis which indicated that our methods achieved better outcomes. Results for the three datasets included 44 unique genes and 10 unique transcription factors (SP1, HDAC1, EGR1, E2F1, AR, STAT6, RELA, SP3, NFKB1, and ESR1) which are reportedly related to pancreatic islet functions, diseases, and therapeutic targets.

Subjects

Insulin-Secreting Cells , Islets of Langerhans , Blood Glucose/metabolism , Islets of Langerhans/metabolism , Insulin/genetics , Insulin/metabolism , Glucagon , Gene Expression Profiling , Insulin-Secreting Cells/metabolism

9.

Bioinformatic tools for epitranscriptomics.

Taguchi, Y-H.

Am J Physiol Cell Physiol ; 324(2): C447-C457, 2023 02 01.

Article in English | MEDLINE | ID: mdl-36468841

ABSTRACT

The epitranscriptome, defined as RNA modifications that do not involve alterations in the nucleotide sequence, is a popular topic in the genomic sciences. Because we need massive computational techniques to identify epitranscriptomes within individual transcripts, many tools have been developed to infer epitranscriptomic sites as well as to process datasets using high-throughput sequencing. In this review, we summarize recent developments in epitranscriptome spatial detection and data analysis and discuss their progression.

Subjects

Post-Transcriptional RNA Processing , Transcriptome , Transcriptome/genetics , Computational Biology/methods , Genomics , High-Throughput Nucleotide Sequencing

10.

Integrative Meta-Analysis of Huntington's Disease Transcriptome Landscape.

Sneha, Nela Pragathi; Dharshini, S Akila Parvathy; Taguchi, Y-H; Gromiha, M Michael.

Genes (Basel) ; 13(12)2022 12 16.

Article in English | MEDLINE | ID: mdl-36553652

ABSTRACT

Huntington's disease (HD) is a neurodegenerative disorder with autosomal dominant inheritance caused by glutamine expansion in the Huntingtin gene (HTT). Striatal projection neurons (SPNs) in HD are more vulnerable to cell death. The executive striatal population is directly connected with the Brodmann Area (BA9), which is mainly involved in motor functions. Analyzing the disease samples from BA9 from the SRA database provides insights related to neuron degeneration, which helps to identify a promising therapeutic strategy. Most gene expression studies examine the changes in expression and associated biological functions. In this study, we elucidate the relationship between variants and their effect on gene/downstream transcript expression. We computed gene and transcript abundance and identified variants from RNA-seq data using various pipelines. We predicted the effect of genome-wide association studies (GWAS)/novel variants on regulatory functions. We found that many variants affect the histone acetylation pattern in HD, thereby perturbing the transcription factor networks. Interestingly, some variants affect miRNA binding as well as their downstream gene expression. Tissue-specific network analysis showed that mitochondrial, neuroinflammation, vasculature, and angiogenesis-related genes are disrupted in HD. From this integrative omics analysis, we propose that abnormal neuroinflammation acts as a two-edged sword that indirectly affects the vasculature and associated energy metabolism. Rehabilitation of blood-brain barrier functionality and energy metabolism may secure the neuron from cell death.

Subjects

Huntington Disease , Transcriptome , Humans , Transcriptome/genetics , Huntington Disease/genetics , Genome-Wide Association Study , Neuroinflammatory Diseases , Gene Expression Regulation

11.

A tensor decomposition-based integrated analysis applicable to multiple gene expression profiles without sample matching.

Taguchi, Y-H; Turki, Turki.

Sci Rep ; 12(1): 21242, 2022 12 08.

Article in English | MEDLINE | ID: mdl-36481877

ABSTRACT

The integrated analysis of multiple gene expression profiles previously measured in distinct studies is problematic since missing both sample matches and common labels prevent their integration in fully data-driven, unsupervised training. In this study, we propose a strategy to enable the integration of multiple gene expression profiles among multiple independent studies with neither labeling nor sample matching using tensor decomposition unsupervised feature extraction. We apply this strategy to Alzheimer's disease (AD)-related gene expression profiles that lack precise correspondence among samples, including AD single-cell RNA sequence (scRNA-seq) data. We were able to select biologically reasonable genes using the integrated analysis. Overall, integrated gene expression profiles can function analogously to prior- and/or transfer-learning strategies in other machine-learning applications. For scRNA-seq, the proposed approach significantly reduces the required computational memory.

Subjects

Transcriptome

12.

microRNA Bioinformatics.

Taguchi, Y-H.

Cells ; 11(22)2022 11 18.

Article in English | MEDLINE | ID: mdl-36429104

ABSTRACT

Firstly, I apologize for the delayed publication of this Special Issue in the form of a book title [...].

Subjects

Computational Biology , MicroRNAs , MicroRNAs/genetics

13.

Estimation of Metabolic Effects upon Cadmium Exposure during Pregnancy Using Tensor Decomposition.

Amakura, Yuki; Taguchi, Y-H.

Genes (Basel) ; 13(10)2022 Sep 22.

Article in English | MEDLINE | ID: mdl-36292583

ABSTRACT

A simple tensor decomposition model was applied to the liver transcriptome analysis data to elucidate the cause of cadmium-induced gene overexpression. In addition, we estimated the mechanism by which prenatal Cd exposure disrupts insulin metabolism in offspring. Numerous studies have reported on the toxicity of Cd. A liver transcriptome analysis revealed that Cd toxicity induces intracellular oxidative stress and mitochondrial dysfunction via changes in gene expression, which in turn induces endoplasmic reticulum-associated degradation via abnormal protein folding. However, the specific mechanisms underlying these effects remain unknown. In this study, we found that Cd-induced endoplasmic reticulum stress may promote increased expression of tumor necrosis factor-α (TNF-α). Based on the high expression of genes involved in the production of sphingolipids, it was also found that the accumulation of ceramide may induce intracellular oxidative stress through the overproduction of reactive oxygen species. In addition, the high expression of a set of genes involved in the electron transfer system may contribute to oxidative stress. These findings allowed us to identify the mechanisms by which intracellular oxidative stress leads to the phosphorylation of insulin receptor substrate 1, which plays a significant role in the insulin signaling pathway.

Subjects

Cadmium , Tumor Necrosis Factor-alpha , Pregnancy , Humans , Female , Cadmium/toxicity , Cadmium/metabolism , Reactive Oxygen Species/metabolism , Insulin Receptor Substrate Proteins , Tumor Necrosis Factor-alpha/metabolism , Endoplasmic Reticulum-Associated Degradation , Insulin/genetics , Insulin/metabolism , Ceramides , Sphingolipids

14.

Adapted tensor decomposition and PCA based unsupervised feature extraction select more biologically reasonable differentially expressed genes than conventional methods.

Taguchi, Y-H; Turki, Turki.

Sci Rep ; 12(1): 17438, 2022 10 19.

Article in English | MEDLINE | ID: mdl-36261574

ABSTRACT

Tensor decomposition- and principal component analysis-based unsupervised feature extraction were proposed almost 5 and 10 years ago, respectively. Although these methods have been successfully applied to a wide range of genome analyses, including drug repositioning, biomarker identification, and identification of disease-causing genes, some fundamental problems have been identified: the number of genes identified was too small to assume that there were no false negatives, and the histogram of P values derived was not fully coincident with the null hypothesis that principal component and singular value vectors follow the Gaussian distribution. Optimizing the standard deviation such that the histogram of P values coincides as closely as possible with the null hypothesis increases the number and biological reliability of the selected genes. Our contribution is that we improved these methods so that they can select biologically more reasonable differentially expressed genes than state-of-the-art methods. Those methods must empirically assume negative binomial distributions and a dispersion relation in order to select more highly expressed genes in preference to less expressed ones, whereas the proposed methods achieve this without such assumptions.

Subjects

Algorithms , Reproducibility of Results , Principal Component Analysis , Normal Distribution , Biomarkers
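
Editorial illustration: the abstract above describes attributing P values to genes under the null hypothesis that the components of a principal component or singular value vector follow a Gaussian distribution, then selecting genes from those P values. A minimal sketch of that selection step follows. It assumes a two-sided Gaussian tail test (equivalent to a chi-squared test with one degree of freedom on the squared, scaled component), Benjamini-Hochberg adjustment, and a 0.01 cutoff; the abstract itself specifies none of these details. The standard deviation `sigma` is taken as given, so the optimization of the standard deviation that the abstract describes is not implemented here. All function names and input values are hypothetical.

```python
import math


def gaussian_tail_pvalues(u, sigma):
    """Two-sided P value for each component u_i under N(0, sigma^2).

    Equals P[chi-squared(1 df) > (u_i / sigma)^2] = erfc(|u_i| / (sigma * sqrt(2))).
    """
    return [math.erfc(abs(x) / (sigma * math.sqrt(2.0))) for x in u]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_features(u, sigma, threshold=0.01):
    """Indices whose adjusted P value falls below the threshold."""
    adjusted = benjamini_hochberg(gaussian_tail_pvalues(u, sigma))
    return [i for i, p in enumerate(adjusted) if p < threshold]


# Hypothetical components of one singular value vector (not study data).
u = [0.1, -0.2, 5.0, 0.05, -4.5, 0.3]
selected = select_features(u, sigma=1.0)
print(selected)
```

In this sketch a smaller `sigma` makes every component look more extreme and so selects more features, which is why the choice of standard deviation matters for the number of genes selected.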

15.

Exploring Plausible Therapeutic Targets for Alzheimer's Disease using Multi-omics Approach, Machine Learning and Docking.

Parvathy Dharshini, S Akila; Sneha, Nela Pragathi; Yesudhas, Dhanusha; Kulandaisamy, A; Rangaswamy, Uday; Shanmugam, Anusuya; Taguchi, Y-H; Gromiha, M Michael.

Curr Top Med Chem ; 22(22): 1868-1879, 2022.

Article in English | MEDLINE | ID: mdl-36056872

ABSTRACT

The progressive deterioration of neurons leads to Alzheimer's disease (AD), and developing a drug for this disorder is challenging. Substantial gene/transcriptome variability from multiple cell types leads to downstream pathophysiologic consequences that represent the heterogeneity of this disease. Identifying potential biomarkers for promising therapeutics is strenuous due to the fact that the transcriptome, epigenetic, or proteome changes detected in patients are not clear whether they are the cause or consequence of the disease, which eventually makes the drug discovery efforts intricate. The advancement in scRNA-sequencing technologies helps to identify cell type-specific biomarkers that may guide the selection of the pathways and related targets specific to different stages of the disease progression. This review is focussed on the analysis of multi-omics data from various perspectives (genomic and transcriptomic variants, and single-cell expression), which provide insights to identify plausible molecular targets to combat this complex disease. Further, we briefly outlined the developments in machine learning techniques to prioritize the risk-associated genes, predict probable mutations and identify promising drug candidates from natural products.

Subjects

Alzheimer Disease , Humans , Alzheimer Disease/drug therapy , Alzheimer Disease/metabolism , Genomics/methods , Proteome , Machine Learning , Biomarkers

16.

Suppression of intrahepatic cholangiocarcinoma cell growth by SKI via upregulation of the CDK inhibitor p21.

Kawamura, Etsushi; Matsubara, Tsutomu; Daikoku, Atsuko; Deguchi, Sanae; Kinoshita, Masahiko; Yuasa, Hideto; Urushima, Hayato; Odagiri, Naoshi; Motoyama, Hiroyuki; Kotani, Kohei; Kozuka, Ritsuzo; Hagihara, Atsushi; Fujii, Hideki; Uchida-Kobayashi, Sawako; Tanaka, Shogo; Takemura, Shigekazu; Iwaisako, Keiko; Enomoto, Masaru; Taguchi, Y H; Tamori, Akihiro; Kubo, Shoji; Ikeda, Kazuo; Kawada, Norifumi.

FEBS Open Bio ; 12(12): 2122-2135, 2022 12.

Article in English | MEDLINE | ID: mdl-36114826

ABSTRACT

Cholangiocarcinoma (CC) has a poor prognosis and different driver genes depending on the site of onset. Intrahepatic CC is the second-most common liver cancer after hepatocellular carcinoma, and novel therapeutic targets are urgently needed. The present study was conducted to identify novel therapeutic targets by exploring differentially regulated genes in human CC. MicroRNA (miRNA) and mRNA microarrays were performed using tissue and serum samples obtained from 24 surgically resected hepatobiliary tumor cases, including 10 CC cases. We conducted principal component analysis to identify differentially expressed miRNA, leading to the identification of miRNA-3648 as a differentially expressed miRNA. We used an in silico screening approach to identify its target mRNA, the tumor suppressor Sloan Kettering Institute (SKI). SKI protein expression was decreased in human CC cells overexpressing miRNA-3648, endogenous SKI protein expression was decreased in human CC tumor tissues, and endogenous SKI mRNA expression was suppressed in human CC cells characterized by rapid growth. SKI-overexpressing OZ cells (human intrahepatic CC cells) showed upregulation of cyclin-dependent kinase inhibitor p21 mRNA and protein expression and suppressed cell proliferation. Nuclear expression of CDT1 (chromatin licensing and DNA replication factor 1), which is required for the G1/S transition, was suppressed in SKI-overexpressing OZ cells. SKI knockdown resulted in the opposite effects. Transgenic p21-luciferase was activated in SKI-overexpressing OZ cells. These data indicate SKI involvement in p21 transcription and that SKI-p21 signaling causes cell cycle arrest in G1, suppressing intrahepatic CC cell growth. Therefore, SKI may be a potential therapeutic target for intrahepatic CC.

Subjects

Bile Duct Neoplasms , Cholangiocarcinoma , MicroRNAs , Humans , Cyclin-Dependent Kinase Inhibitor p21/genetics , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Up-Regulation/genetics , Cholangiocarcinoma/genetics , Cholangiocarcinoma/metabolism , Cholangiocarcinoma/pathology , Cell Proliferation/genetics , Cell Cycle Proteins/metabolism , Intrahepatic Bile Ducts/metabolism , Intrahepatic Bile Ducts/pathology , Bile Duct Neoplasms/genetics , Bile Duct Neoplasms/metabolism , Bile Duct Neoplasms/pathology , Messenger RNA

17.

Projection in genomic analysis: A theoretical basis to rationalize tensor decomposition and principal component analysis as feature selection tools.

Taguchi, Y-H; Turki, Turki.

PLoS One ; 17(9): e0275472, 2022.

Article in English | MEDLINE | ID: mdl-36173994

ABSTRACT

Identifying differentially expressed genes is difficult because of the small number of available samples compared with the large number of genes. Conventional gene selection methods employing statistical tests have the critical problem of heavy dependence of P-values on sample size. Although the recently proposed principal component analysis (PCA) and tensor decomposition (TD)-based unsupervised feature extraction (FE) has often outperformed these statistical test-based methods, the reason why it worked so well is unclear. In this study, we aim to understand this reason in the context of projection pursuit (PP), which was proposed a long time ago to solve the problem of dimensions; we can relate the space spanned by singular value vectors to that spanned by the optimal cluster centroids obtained from K-means. Thus, the success of PCA- and TD-based unsupervised FE can be understood through this equivalence. In addition, the empirical threshold of adjusted P-values of 0.01, under the null hypothesis that singular value vectors attributed to genes obey the Gaussian distribution, corresponds to a threshold of adjusted P-values of 0.1 when the null distribution is generated by gene order shuffling. For this purpose, we newly applied PP to the three data sets to which PCA- and TD-based unsupervised FE were previously applied; these data sets cover two topics, biomarker identification for kidney cancers (the first two) and drug discovery for COVID-19 (the third one). We found that PP coincides well with PCA- or TD-based unsupervised FE. The shuffling procedures described above were also successfully applied to these three data sets. These findings thus rationalize the success of PCA- and TD-based unsupervised FE for the first time.

Subjects

COVID-19 , Gene Order , Genomics , Humans , Principal Component Analysis , Projection

18.

Integrated Analysis of Tissue-Specific Gene Expression in Diabetes by Tensor Decomposition Can Identify Possible Associated Diseases.

Taguchi, Y-H; Turki, Turki.

Genes (Basel) ; 13(6)2022 06 20.

Article in English | MEDLINE | ID: mdl-35741859

ABSTRACT

In the field of gene expression analysis, methods of integrating multiple gene expression profiles are still being developed and the existing methods have scope for improvement. The previously proposed tensor decomposition-based unsupervised feature extraction method was improved by introducing standard deviation optimization. The improved method was applied to perform an integrated analysis of three tissue-specific gene expression profiles (namely, adipose, muscle, and liver) for diabetes mellitus, and the results showed that it can detect diseases that are associated with diabetes (e.g., neurodegenerative diseases) but that cannot be predicted by individual tissue expression analyses using state-of-the-art methods. Although the selected genes differed from those identified by the individual tissue analyses, the selected genes are known to be expressed in all three tissues. Thus, compared with individual tissue analyses, an integrated analysis can provide more in-depth data and identify additional factors, namely, the association with other diseases.

Subjects

Diabetes Mellitus , Diabetes Mellitus/genetics , Humans , Liver , Transcriptome/genetics

19.

Tumor Heterogeneity and Molecular Characteristics of Glioblastoma Revealed by Single-Cell RNA-Seq Data Analysis.

Yesudhas, Dhanusha; Dharshini, S Akila Parvathy; Taguchi, Y-H; Gromiha, M Michael.

Genes (Basel) ; 13(3)2022 02 25.

Article in English | MEDLINE | ID: mdl-35327982

ABSTRACT

Glioblastoma multiforme (GBM) is the most common infiltrating lethal tumor of the brain. Tumor heterogeneity and the precise characterization of GBM remain challenging, and disease-specific and effective biomarkers are not available at present. To understand GBM heterogeneity and the disease prognosis mechanism, we carried out a single-cell transcriptome data analysis of 3389 cells from four primary IDH-WT (isocitrate dehydrogenase wild type) glioblastoma patients and compared the characteristic features of the tumor and periphery cells. We observed that the marker gene expression profiles of different cell types and the copy number variations (CNVs) are heterogeneous in the GBM samples. Further, we identified 94 differentially expressed genes (DEGs) between tumor and periphery cells. We constructed a tissue-specific co-expression network and protein-protein interaction network for the DEGs and identified several hub genes, including CX3CR1, GAPDH, FN1, PDGFRA, HTRA1, ANXA2, THBS1, GFAP, PTN, TNC, and VIM. The DEGs were significantly enriched with proliferation and migration pathways related to glioblastoma. Additionally, we were able to identify the differentiation state of microglia and changes in the transcriptome in the presence of glioblastoma that might support tumor growth. This study provides insights into GBM heterogeneity and suggests novel potential disease-specific biomarkers which could help to identify the therapeutic targets in GBM.

Subjects

Brain Neoplasms , Glioblastoma , Biomarkers , Brain Neoplasms/metabolism , DNA Copy Number Variations , Data Analysis , Neoplastic Gene Expression Regulation , Glioblastoma/genetics , Glioblastoma/pathology , High-Temperature Requirement A Serine Peptidase 1/genetics , Humans , RNA-Seq

20.

Novel feature selection method via kernel tensor decomposition for improved multi-omics data analysis.

Taguchi, Y-H; Turki, Turki.

BMC Med Genomics ; 15(1): 37, 2022 02 24.

Article in English | MEDLINE | ID: mdl-35209912

ABSTRACT

BACKGROUND: Feature selection of multi-omics data analysis remains challenging owing to the size of omics datasets, comprising approximately [Formula: see text]-[Formula: see text] features. In particular, appropriate methods to weight individual omics datasets are unclear, and the approach adopted has substantial consequences for feature selection. In this study, we extended a recently proposed kernel tensor decomposition (KTD)-based unsupervised feature extraction (FE) method to integrate multi-omics datasets obtained from common samples in a weight-free manner. METHOD: KTD-based unsupervised FE was reformatted as the collection of kernelized tensors sharing common samples, which was applied to synthetic and real datasets. RESULTS: The proposed advanced KTD-based unsupervised FE method showed performance comparable to that of the previously proposed KTD method, as well as tensor decomposition-based unsupervised FE, but required reduced memory and central processing unit time. Moreover, this advanced KTD method, specifically designed for multi-omics analysis, attributes P values to features, which is rare for existing multi-omics-oriented methods. CONCLUSIONS: The sample R code is available at https://github.com/tagtag/MultiR/ .

Subjects

Data Analysis , Genomics , Proteomics
