Search | Secretaria de Estado da Saúde

1.

Principal component analysis- and tensor decomposition-based unsupervised feature extraction to select more suitable differentially methylated cytosines: Optimization of standard deviation versus state-of-the-art methods.

Taguchi, Y-H; Turki, Turki.

Genomics ; 115(2): 110577, 2023 03.

Article in English | MEDLINE | ID: mdl-36804268

ABSTRACT

In contrast to RNA-seq analysis, which has various standard methods, no standard methods for identifying differentially methylated cytosines (DMCs) exist. To identify DMCs, we tested principal component analysis and tensor decomposition-based unsupervised feature extraction with optimized standard deviation, which has been shown to be effective for differentially expressed gene (DEG) identification. The proposed method outperformed certain conventional methods, including those that assume a beta-binomial distribution for methylation (an assumption the proposed method does not require), especially when applied to methylation profiles measured using high-throughput sequencing. DMCs identified by the proposed method also significantly overlapped with various functional sites, including known differentially methylated regions, enhancers, and DNase I hypersensitive sites. The proposed method was applied to data sets retrieved from The Cancer Genome Atlas to identify DMCs using American Joint Committee on Cancer staging system edition labels. This suggests that the proposed method is a promising standard method for identifying DMCs.

Subjects

DNA Methylation, Genome, CpG Islands, Principal Component Analysis

2.

Bioinformatic tools for epitranscriptomics.

Taguchi, Y-H.

Am J Physiol Cell Physiol ; 324(2): C447-C457, 2023 02 01.

Article in English | MEDLINE | ID: mdl-36468841

ABSTRACT

The epitranscriptome, defined as RNA modifications that do not involve alterations in the nucleotide sequence, is a popular topic in the genomic sciences. Because we need massive computational techniques to identify epitranscriptomes within individual transcripts, many tools have been developed to infer epitranscriptomic sites as well as to process datasets using high-throughput sequencing. In this review, we summarize recent developments in epitranscriptome spatial detection and data analysis and discuss their progression.

Subjects

Post-Transcriptional RNA Processing, Transcriptome, Transcriptome/genetics, Computational Biology/methods, Genomics, High-Throughput Nucleotide Sequencing

3.

Identifying suitable tools for variant detection and differential gene expression using RNA-seq data.

Dharshini, S Akila Parvathy; Taguchi, Y-H; Gromiha, M Michael.

Genomics ; 112(3): 2166-2172, 2020 05.

Article in English | MEDLINE | ID: mdl-31862361

ABSTRACT

Neurodegenerative diseases are the most predominant brain disorders around the globe and the affected populations are rapidly increasing. Recently, these diseases have been addressed using the data obtained from RNA-sequencing technology to reveal the changes in gene/transcript expression, effect of variants, and pathways involved in disease mechanisms. However, the observations mainly depend on the aligners/tools, and the performance of existing RNA-seq tools on the hg38 genome assembly has not yet been documented. In this study, we performed a systematic analysis of various spliced aligners, transcript assembly and variant calling tools based on both genomic assemblies (hg19/hg38) from hippocampus brain tissue. This helps to identify the best possible combination of tools for hg38 annotation. In order to evaluate the identified variants from various pipelines, we compared them with expression quantitative trait loci (eQTL) and genome-wide association study (GWAS) data. In addition, the identified differentially expressed genes (DG) were compared with microarray studies. From our analysis of variant calling, the combination of GATK (Genome Analysis Tool-kit) and STAR (Spliced Transcripts Alignment to a Reference) protocol yields a larger number of GWAS/eQTL variants compared to SAMtools (Sequence Alignment Map). We also identified a higher number of non-coding variants in hg38 compared to hg19 due to enhanced annotation. In the case of various DG pipelines, we found that the Salmon-based hg38 transcriptomic quantification yields a higher number of reported DG compared to other genome-based quantification methods. This study revealed that a higher number of reads map to multiple locations in the genome with hg38 compared to hg19, and these spurious multi-mapped reads may affect the gene quantification techniques. We suggest that it is necessary to develop efficient algorithms that can handle multi-mapped reads and improve the performance of genome-based alignment quantification.

Subjects

Genetic Variation, RNA-Seq/methods, Alzheimer Disease/genetics, Alzheimer Disease/metabolism, Genome-Wide Association Study, Genomics/methods, Hippocampus/metabolism, Humans, Sequence Alignment

4.

Exploring the selective vulnerability in Alzheimer disease using tissue specific variant analysis.

Akila Parvathy Dharshini, S; Taguchi, Y-H; Michael Gromiha, M.

Genomics ; 111(4): 936-949, 2019 07.

Article in English | MEDLINE | ID: mdl-29879491

ABSTRACT

The selective vulnerability of distinct regions of the brain is a critical factor in neurodegenerative disorders. In Alzheimer's disease (AD), neurons in the hippocampus, situated in the medial temporal lobe, are severely damaged. Identifying tissue-specific variants is essential in order to understand the selective vulnerability in AD. In the current work, we aligned mRNA-seq data with the HG19/HG38 genomic assemblies and identified specific variations present in temporal, frontal and other lobes of the AD brain using sequence alignment map tools. We compared the results with the genome-wide association and gene expression quantitative trait loci studies of various neurological disorders. We also distinguished variants and epitranscriptomic modifications through the RNA-modification database and evaluated the variant effect in the coding/UTR regions. In addition, we developed genetic and functional interaction networks to understand the relationship between predicted vulnerable variations and differentially expressed genes. We found that genes involved in gliogenesis and intermediate filament organization are altered in the temporal lobe. Oxidative phosphorylation and calcium ion homeostasis are modified in the frontal lobe, and protein degradation and apoptotic signaling are altered in other lobes. From this study, we propose that disruption of glial cell structural integrity, defective gliogenesis, and failure in glia-neuron communication are the primary factors for selective vulnerability.

Subjects

Alzheimer Disease/genetics, Brain/metabolism, Single Nucleotide Polymorphism, Transcriptome, 3' Untranslated Regions, Alzheimer Disease/pathology, Brain/pathology, Humans, Metabolic Networks and Pathways/genetics, Organ Specificity, Quantitative Trait Loci

5.

Drug candidate identification based on gene expression of treated cells using tensor decomposition-based unsupervised feature extraction for large-scale data.

Taguchi, Y-H.

BMC Bioinformatics ; 19(Suppl 13): 388, 2019 Feb 04.

Article in English | MEDLINE | ID: mdl-30717646

ABSTRACT

BACKGROUND: Although in silico drug discovery is necessary for drug development, the two major strategies, the structure-based and ligand-based approaches, have not been completely successful. Currently, the third approach, inference of drug candidates from gene expression profiles obtained from the cells treated with the compounds under study, requires the use of a training dataset. Here, the purpose was to develop a new approach that does not require any pre-existing knowledge about the drug-protein interactions; instead, these interactions are inferred by means of an integrated approach using gene expression profiles obtained from the cells treated with the analysed compounds and the existing data describing gene-gene interactions. RESULTS: In the present study, using tensor decomposition-based unsupervised feature extraction, which represents an extension of the recently proposed principal-component analysis-based feature extraction, gene sets and compounds with a significant dose-dependent activity were screened without any training datasets. Next, after these results were combined with the data showing perturbations in single-gene expression profiles, genes targeted by the analysed compounds were inferred. The set of target genes thus identified was shown to significantly overlap with known target genes of the compounds under study. CONCLUSIONS: The method is specifically designed for large-scale datasets (including hundreds of treatments with compounds), not for conventional small-scale datasets. The obtained results indicate that two compounds that have not been extensively studied, WZ-3105 and CGP-60474, represent promising drug candidates targeting multiple cancers, including melanoma, adenocarcinoma, liver carcinoma, and breast, colon, and prostate cancers, which were analysed in this in silico study.

Subjects

Algorithms, Drug Discovery, Transcriptome, Cell Line, Computer Simulation, Humans, Principal Component Analysis, Protein Interaction Maps, Transcription Factors/metabolism

6.

Tensor decomposition-based and principal-component-analysis-based unsupervised feature extraction applied to the gene expression and methylation profiles in the brains of social insects with multiple castes.

Taguchi, Y-H.

BMC Bioinformatics ; 19(Suppl 4): 99, 2018 05 08.

Article in English | MEDLINE | ID: mdl-29745827

ABSTRACT

BACKGROUND: Even though coexistence of multiple phenotypes sharing the same genomic background is interesting, it remains incompletely understood. Epigenomic profiles may represent key factors, with unknown contributions to the development of multiple phenotypes, and social-insect castes are a good model for elucidation of the underlying mechanisms. Nonetheless, previous studies have failed to identify genes associated with aberrant gene expression and methylation profiles because of the lack of suitable methodology that can address this problem properly. METHODS: Recently proposed principal component analysis (PCA)-based and tensor decomposition (TD)-based unsupervised feature extraction (FE) methods can solve this problem because these two approaches can deal with gene expression and methylation profiles even when only a small number of samples is available. RESULTS: PCA-based and TD-based unsupervised FE methods were applied to the analysis of gene expression and methylation profiles in the brains of two social insects, Polistes canadensis and Dinoponera quadriceps. Genes associated with differential expression and methylation between castes were identified, and analysis of enrichment of Gene Ontology terms confirmed the reliability of the obtained sets of genes from the biological standpoint. CONCLUSIONS: Biologically relevant genes, shown to be associated with significant differential gene expression and methylation between castes, were identified here for the first time. The identification of these genes may help understand the mechanisms underlying epigenetic control of development of multiple phenotypes under the same genomic conditions.

Subjects

Algorithms, Brain/metabolism, DNA Methylation/genetics, Gene Expression Regulation, Social Hierarchy, Principal Component Analysis, Wasps/genetics, Animals, Down-Regulation/genetics, Finite Element Analysis, Gene Expression Profiling, Gene Ontology, Reproducibility of Results, Social Behavior

7.

Collaborative environmental DNA sampling from petal surfaces of flowering cherry Cerasus × yedoensis 'Somei-yoshino' across the Japanese archipelago.

Ohta, Tazro; Kawashima, Takeshi; Shinozaki, Natsuko O; Dobashi, Akito; Hiraoka, Satoshi; Hoshino, Tatsuhiko; Kanno, Keiichi; Kataoka, Takafumi; Kawashima, Shuichi; Matsui, Motomu; Nemoto, Wataru; Nishijima, Suguru; Suganuma, Natsuki; Suzuki, Haruo; Taguchi, Y-H; Takenaka, Yoichi; Tanigawa, Yosuke; Tsuneyoshi, Momoka; Yoshitake, Kazutoshi; Sato, Yukuto; Yamashita, Riu; Arakawa, Kazuharu; Iwasaki, Wataru.

J Plant Res ; 131(4): 709-717, 2018 Jul.

Article in English | MEDLINE | ID: mdl-29460198

ABSTRACT

Recent studies have shown that environmental DNA is found almost everywhere. Flower petal surfaces are an attractive tissue to use for investigation of the dispersal of environmental DNA in nature as they are isolated from the external environment until the bud opens and only then can the petal surface accumulate environmental DNA. Here, we performed a crowdsourced experiment, the "Ohanami Project", to obtain environmental DNA samples from petal surfaces of Cerasus × yedoensis 'Somei-yoshino' across the Japanese archipelago during spring 2015. C. × yedoensis is the most popular garden cherry species in Japan and clones of this cultivar bloom simultaneously every spring. Data collection spanned almost every prefecture and totaled 577 DNA samples from 149 collaborators. Preliminary amplicon-sequencing analysis showed the rapid attachment of environmental DNA onto the petal surfaces. Notably, we found DNA of other common plant species in samples obtained from a wide distribution; this DNA likely originated from the pollen of the Japanese cedar. Our analysis supports our belief that petal surfaces after blossoming are a promising target to reveal the dynamics of environmental DNA in nature. The success of our experiment also shows that crowdsourced environmental DNA analyses have considerable value in ecological studies.

Subjects

Plant DNA/genetics, DNA/genetics, Environment, Flowers/genetics, Prunus/genetics, Chloroplasts/genetics, Cyanobacteria/genetics, Flowers/microbiology, Japan, Proteobacteria/genetics, Prunus/microbiology, Sequence Alignment, DNA Sequence Analysis

8.

Exploring microRNA Biomarker for Amyotrophic Lateral Sclerosis.

Taguchi, Y-H; Wang, Hsiuying.

Int J Mol Sci ; 19(5), 2018 Apr 28.

Article in English | MEDLINE | ID: mdl-29710810

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is among the severe neurodegenerative diseases that lack widely available effective treatments. As the disease progresses, patients lose control of voluntary muscles. Although neuronal degeneration is the cause of this disease, the failure mechanism is still unknown. In order to seek genetic mechanisms that initiate and drive the progression of ALS, the association of microRNA (miRNA) expression with this disease was considered. Serum miRNAs from healthy controls, sporadic ALS (sALS), familial ALS (fALS) and ALS mutation carriers were investigated. Principal component analysis (PCA)-based unsupervised feature extraction (FE) was applied to these serum miRNA profiles. As a result, we predict miRNAs that can discriminate patients from healthy controls with high accuracy. Thus, these miRNAs can be potential prognostic miRNA biomarkers for ALS.

Subjects

Amyotrophic Lateral Sclerosis/blood, MicroRNAs/blood, Biomarkers/blood, Case-Control Studies, Humans

9.

Expression of Serum Exosomal and Esophageal MicroRNA in Rat Reflux Esophagitis.

Uemura, Risa; Murakami, Yoshiki; Hashimoto, Atsushi; Sawada, Akinari; Otani, Koji; Taira, Koichi; Hosomi, Shuhei; Nagami, Yasuaki; Tanaka, Fumio; Kamata, Noriko; Yamagami, Hirokazu; Tanigawa, Tetsuya; Watanabe, Toshio; Taguchi, Y-H; Fujiwara, Yasuhiro.

Int J Mol Sci ; 18(8), 2017 Jul 25.

Article in English | MEDLINE | ID: mdl-28757556

ABSTRACT

Gastroesophageal reflux disease (GERD) is a common upper gastrointestinal disease. However, the role of exosomal microRNAs (miRNAs) and esophageal miRNAs in GERD has not been studied. A rat model of acid reflux esophagitis was used to establish a novel diagnostic marker for GERD and examine the dynamics of miRNA expression in GERD. Rats were sacrificed 3 (acute phase), 7 (sub-acute phase) and 21 days (chronic phase) after induction of esophagitis. Exosomes were extracted from serum, and the expression patterns of serum miRNAs were analyzed. Four upregulated miRNAs (miR-29a-3p, 128-3p, 223-3p and 3473) were identified by microarray analysis. The expression levels of exosomal miR-29a-3p were significantly higher in the chronic phase of reflux esophagitis compared with controls, and increased expression of miR-29a-3p was specific to chronic reflux esophagitis. Esophageal miR-223-3p expression was higher compared with controls, and gradually decreased from the acute to the chronic phase of esophagitis. In conclusion, exosomal miR-29a-3p and esophageal miR-223-3p might play roles in GERD.

Subjects

Peptic Esophagitis/genetics, Esophagus/chemistry, Exosomes/genetics, MicroRNAs/genetics, Animals, Biomarkers/blood, Animal Disease Models, E2F1 Transcription Factor/genetics, Peptic Esophagitis/diagnosis, Female, Gene Expression Profiling, Gene Expression Regulation, Humans, Male, MicroRNAs/blood, Oligonucleotide Array Sequence Analysis, Rats, STAT3 Transcription Factor/genetics

10.

Microarray analysis of circulating microRNAs in familial Mediterranean fever.

Wada, Taizo; Toma, Tomoko; Matsuda, Yusuke; Yachie, Akihiro; Itami, Saori; Taguchi, Y-H; Murakami, Yoshiki.

Mod Rheumatol ; 27(6): 1040-1046, 2017 Nov.

Article in English | MEDLINE | ID: mdl-28165838

ABSTRACT

OBJECTIVES: Familial Mediterranean fever (FMF) is an autoinflammatory disease caused by mutations in MEFV. Mutations in exon 10 are associated with typical FMF phenotypes, whereas the pathogenic role of variants in exons 2 and 3 remains uncertain. Recent evidence suggests that circulating microRNAs (miRNAs) are potentially useful biomarkers in several diseases. Therefore, their expression was assessed in FMF. METHODS: The subjects were 24 patients with FMF who were between attacks: eight with exon 10 mutations (group A), eight with exon 3 mutations (group B), and eight without exon 3 or 10 mutations (group C). We also investigated eight cases of PFAPA as disease controls. Exosome-rich fractionated RNA was subjected to miRNA profiling by microarray. RESULTS: Using the expression patterns of 26 miRNAs, we classified FMF (groups A, B, and C) and PFAPA with 78.1% accuracy. In FMF patients, groups A and B, A and C, and B and C were distinguished with 93.8, 87.5, and 100% accuracy using 24, 30, and 25 miRNA expression patterns, respectively. CONCLUSIONS: These findings suggest that expression patterns of circulating miRNAs differ among FMF subgroups based on MEFV mutations between FMF episodes. These patterns may serve as a useful biomarker for detecting subgroups of FMF.
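The reported accuracies can be sanity-checked against the group sizes given in METHODS (8 subjects per group, so 32 for the four-way comparison and 16 for each pairwise one). The sketch below is only an arithmetic consistency check: the abstract does not state the number of correct calls or the validation scheme, and the helper names `implied_correct` and `is_consistent` are hypothetical.

```python
def implied_correct(percent, total):
    """Nearest whole number of correct calls implied by a reported accuracy."""
    return round(percent / 100 * total)


def is_consistent(percent, total, tol=0.05):
    """True if an integer count out of `total` reproduces `percent` to one decimal."""
    correct = implied_correct(percent, total)
    return abs(100 * correct / total - percent) <= tol + 1e-9


# (reported accuracy in %, number of subjects compared), taken from the abstract.
reported = [(78.1, 32), (93.8, 16), (87.5, 16), (100.0, 16)]
for percent, total in reported:
    print(percent, total, implied_correct(percent, total), is_consistent(percent, total))
```

Under these assumptions the percentages correspond to 25/32, 15/16, 14/16 and 16/16 correct classifications, so the figures are at least internally consistent with the stated cohort sizes.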

Subjects

Circulating MicroRNA/genetics, Familial Mediterranean Fever/genetics, Adult, Biomarkers/blood, Circulating MicroRNA/blood, Exons, Familial Mediterranean Fever/blood, Female, Humans, Male, Middle Aged, Mutation

11.

MicroRNA expression in hepatocellular carcinoma after the eradication of chronic hepatitis virus C infection using interferon therapy.

Tamori, Akihiro; Murakami, Yoshiki; Kubo, Shoji; Itami, Saori; Uchida-Kobayashi, Sawako; Morikawa, Hiroyasu; Enomoto, Masaru; Takemura, Shigekazu; Tanahashi, Toshihito; Taguchi, Y-H; Kawada, Norifumi.

Hepatol Res ; 46(3): E26-35, 2016 Mar.

Article in English | MEDLINE | ID: mdl-25788219

ABSTRACT

AIM: Hepatocellular carcinoma (HCC) develops in up to 5% of patients after the successful treatment of chronic hepatitis C virus (HCV) infection using interferon therapy. The aim of this study was to characterize miRNA expression in liver tissues from patients who achieved a sustained viral response (SVR). METHODS: Seventy-one patients with resected HCC were enrolled into the present study: 61 HCC from patients with continuously infected HCV (HCV-HCC) and 10 from patients who had achieved SVR (SVR-HCC). We also included non-tumor tissues (SVR-NT) from four patients with SVR-HCC, and liver tissue (SVR-CH) from four SVR patients without HCC. Total RNA was extracted from liver samples. The miRNA expression patterns were analyzed using microarrays. In addition, target gene expression was quantified after miRNA overexpression in HEK293 cells. RESULTS: We could discriminate between SVR-HCC and HCV-HCC with 75.36% accuracy using the expression pattern of six specific miRNA. The expression levels of 37 miRNA were significantly lower in HCV-HCC than in SVR-HCC, whereas the expression of 25 miRNA was significantly higher in HCV-HCC than SVR-HCC (P < 1.0E-05). The expression of thrombospondin 1 was regulated in an opposing manner by miR-30a-3p in SVR-HCC and HCV-HCC. In non-tumor tissues, the expression pattern of seven miRNA could distinguish between SVR-CH and SVR-NT with 87.50% accuracy. CONCLUSION: Comprehensive miRNA expression analyses could not only differentiate between SVR-HCC and HCV-HCC but also forecast hepatocarcinogenesis after achieving SVR.

12.

Identification of More Feasible MicroRNA-mRNA Interactions within Multiple Cancers Using Principal Component Analysis Based Unsupervised Feature Extraction.

Taguchi, Y-H.

Int J Mol Sci ; 17(5), 2016 May 10.

Article in English | MEDLINE | ID: mdl-27171078

ABSTRACT

MicroRNA(miRNA)-mRNA interactions are important for understanding many biological processes, including development, differentiation and disease progression, but their identification is highly context-dependent. When computationally derived from sequence information alone, the identification should be verified by integrated analyses of mRNA and miRNA expression. The drawback of this strategy is the vast number of identified interactions, which prevents an experimental or detailed investigation of each pair. In this paper, we overcome this difficulty by the recently proposed principal component analysis (PCA)-based unsupervised feature extraction (FE), which reduces the number of identified miRNA-mRNA interactions that properly discriminate between patients and healthy controls without losing biological feasibility. The approach is applied to six cancers: hepatocellular carcinoma, non-small cell lung cancer, esophageal squamous cell carcinoma, prostate cancer, colorectal/colon cancer and breast cancer. In PCA-based unsupervised FE, the significance does not depend on the number of samples (as in the standard case) but on the number of features, which approximates the number of miRNAs/mRNAs. To our knowledge, we have newly identified miRNA-mRNA interactions in multiple cancers based on a single common (universal) criterion. Moreover, the number of identified interactions was sufficiently small to be sequentially curated by literature searches.
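One way to read the statement that significance depends on the number of features is as multiple-testing correction across features: each feature's principal component score is tested against a null distribution, and the P-values are adjusted over all features. The sketch below illustrates that idea under stated assumptions; the Gaussian null, the Benjamini-Hochberg adjustment, the 0.01 threshold and the function name `select_features` are illustrative choices, not details given in the abstract.

```python
import math


def select_features(scores, alpha=0.01):
    """Return indices of features whose PC score is an outlier under a Gaussian null.

    `scores` holds one principal-component score per feature (assumed input;
    the PCA step itself is not shown). Assumes the scores are not all equal.
    """
    n = len(scores)
    mean = sum(scores) / n
    sigma = math.sqrt(sum((u - mean) ** 2 for u in scores) / n)

    # Two-sided Gaussian tail probability of each standardized score; this
    # equals the chi-squared (1 degree of freedom) tail of the squared score.
    pvals = [math.erfc(abs(u - mean) / (sigma * math.sqrt(2))) for u in scores]

    # Benjamini-Hochberg adjustment: the correction scales with the number of
    # features n, not with the number of samples.
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min

    return [i for i in range(n) if adjusted[i] < alpha]
```

For example, calling `select_features` on a long list of small scores with a handful of very large ones appended returns only the indices of the large ones, which matches the abstract's point that the resulting list is short enough to curate by hand.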

Subjects

Breast Neoplasms/genetics, Carcinoma/genetics, Gastrointestinal Neoplasms/genetics, Lung Neoplasms/genetics, MicroRNAs/genetics, Prostatic Neoplasms/genetics, Messenger RNA/genetics, Breast Neoplasms/metabolism, Carcinoma/metabolism, Case-Control Studies, Female, Gastrointestinal Neoplasms/metabolism, Humans, Lung Neoplasms/metabolism, Male, Principal Component Analysis, Prostatic Neoplasms/metabolism

13.

Identification of aberrant gene expression associated with aberrant promoter methylation in primordial germ cells between E13 and E16 rat F3 generation vinclozolin lineage.

Taguchi, Y-h.

BMC Bioinformatics ; 16 Suppl 18: S16, 2015.

Article in English | MEDLINE | ID: mdl-26677731

ABSTRACT

BACKGROUND: Transgenerational epigenetics (TGE) are currently considered important in disease, but the mechanisms involved are not yet fully understood. TGE abnormalities expected to cause disease are likely to be initiated during development and to be mediated by aberrant gene expression associated with aberrant promoter methylation that is heritable between generations. However, because methylation is removed and then re-established during development, it is not easy to identify promoter methylation abnormalities by comparing normal lineages with those expected to exhibit TGE abnormalities. METHODS: This study applied the recently proposed principal component analysis (PCA)-based unsupervised feature extraction to previously reported and publicly available gene expression/promoter methylation profiles of rat primordial germ cells, between E13 and E16 of the F3 generation vinclozolin lineage that are expected to exhibit TGE abnormalities, to identify multiple genes that exhibited aberrant gene expression/promoter methylation during development. RESULTS: The biological feasibility of the identified genes was tested via enrichment analyses of various biological concepts including pathway analysis, gene ontology terms and protein-protein interactions. All validations suggested superiority of the proposed method over three conventional and popular supervised methods that employed t test, limma and significance analysis of microarrays, respectively. The identified genes were globally related to tumors, the prostate, kidney, testis and the immune system and were previously reported to be related to various diseases caused by TGE. CONCLUSIONS: Among the genes reported by PCA-based unsupervised feature extraction, we propose that chemokine signaling pathways and leucine rich repeat proteins are key factors that initiate transgenerational epigenetic-mediated diseases, because multiple genes included in these two categories were identified in this study.

Subjects

Epigenomics, Germ Cells/metabolism, Oxazoles/metabolism, Algorithms, Animals, DNA Methylation, Mammalian Embryo/metabolism, Female, Oligonucleotide Array Sequence Analysis, Principal Component Analysis, Genetic Promoter Regions, Rats, Signal Transduction/genetics, Transcriptome

14.

Principal component analysis-based unsupervised feature extraction applied to in silico drug discovery for posttraumatic stress disorder-mediated heart disease.

Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki.

BMC Bioinformatics ; 16: 139, 2015 Apr 30.

Article in English | MEDLINE | ID: mdl-25925353

ABSTRACT

BACKGROUND: Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than in the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. RESULTS: Two principal component analysis (PCA)-based FE methods were tested as sample classification independent unsupervised FE methods: variational Bayes PCA (VBPCA), which was extended to perform unsupervised FE, and conventional PCA (CPCA)-based unsupervised FE. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data and to a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs was identified that shows aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. CONCLUSIONS: Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.

Subjects

Algorithms, Biomarkers, Data Mining/methods, Drug Discovery, Gene Expression Regulation, Heart Diseases/genetics, Principal Component Analysis/methods, Post-Traumatic Stress Disorders/genetics, Animals, Bayes Theorem, Computational Biology, Computer Simulation, Data Mining/statistics & numerical data, Gene Expression Profiling, Heart Diseases/drug therapy, Mice, MicroRNAs/genetics, Messenger RNA/genetics, Post-Traumatic Stress Disorders/drug therapy

15.

TINAGL1 and B3GALNT1 are potential therapy target genes to suppress metastasis in non-small cell lung cancer.

Umeyama, Hideaki; Iwadate, Mitsuo; Taguchi, Y-h.

BMC Genomics ; 15 Suppl 9: S2, 2014.

Article in English | MEDLINE | ID: mdl-25521548

ABSTRACT

BACKGROUND: Non-small cell lung cancer (NSCLC) remains lethal despite the development of numerous drug therapy technologies. About 85% to 90% of lung cancers are NSCLC and the 5-year survival rate is at best still below 50%. Thus, it is important to find druggable target genes for NSCLC to develop an effective therapy. RESULTS: Integrated analysis of publicly available gene expression and promoter methylation patterns of two highly aggressive NSCLC cell lines generated by in vivo selection was performed. We selected eleven critical genes that may mediate metastasis using recently proposed principal component analysis based unsupervised feature extraction. The eleven selected genes were significantly related to cancer diagnosis. The tertiary protein structure of the selected genes was inferred by Full Automatic Modeling System, a profile-based protein structure inference software, to determine protein functions and to specify genes that could be potential drug targets. CONCLUSIONS: We identified eleven potentially critical genes that may mediate NSCLC metastasis using bioinformatic analysis of publicly available data sets. These genes are potential target genes for the therapy of NSCLC. Among the eleven genes, TINAGL1 and B3GALNT1 are possible candidates for drug compounds that inhibit their gene expression.

Subjects

Non-Small-Cell Lung Carcinoma/genetics, Non-Small-Cell Lung Carcinoma/pathology, Extracellular Matrix Proteins/genetics, Lipocalins/genetics, Lung Neoplasms/genetics, Lung Neoplasms/pathology, Molecular Targeted Therapy, N-Acetylgalactosaminyltransferases/genetics, Non-Small-Cell Lung Carcinoma/drug therapy, Tumor Cell Line, DNA Methylation/drug effects, DNA Methylation/genetics, Neoplastic Gene Expression Regulation/drug effects, Humans, Lung Neoplasms/drug therapy, Neoplasm Metastasis, Principal Component Analysis, Genetic Promoter Regions/drug effects, Genetic Promoter Regions/genetics, Survival Analysis

16.

Characterization of circulating miRNAs in the treatment of primary liver tumors.

Umezu, Tomohiro; Tanaka, Shogo; Kubo, Shoji; Enomoto, Masaru; Tamori, Akihiro; Ochiya, Takahiro; Taguchi, Y-H; Kuroda, Masahiko; Murakami, Yoshiki.

Cancer Rep (Hoboken) ; 7(2): e1964, 2024 02.

Article in English | MEDLINE | ID: mdl-38146079

ABSTRACT

BACKGROUND AND AIM: Circulating microRNAs (miRNAs) indicate clinical pathologies such as inflammation and carcinogenesis. In this study, we aimed to investigate whether miRNA expression level patterns could be used to diagnose hepatocellular carcinoma (HCC) and biliary tract cancer (BTC), and the relationship between miRNA expression patterns and cancer etiology. METHODS: Patients with HCC and BTC with indications for surgery were selected for the study. Total RNA was extracted from the extracellular vesicle (EV)-rich fraction of the serum and analyzed using the Toray miRNA microarray. Samples were divided into two cohorts in order of collection: the first 85 HCC were analyzed using a microarray based on miRBase ver.20 (hereafter v20 cohort), and the second 177 HCC and 43 BTC were analyzed using a microarray based on miRBase ver.21 (hereafter v21 cohort). RESULTS: Using miRNA expression patterns, we found that HCC and BTC could be identified with an area under the curve (AUC) of 0.754 (v21 cohort). Patients with anti-hepatitis C virus (HCV) treatment (SVR-HCC) and without antiviral treatment (HCV-HCC) could be distinguished with an AUC of 0.811 (v20 cohort) and an AUC of 0.798 (v21 cohort), respectively. CONCLUSIONS: In this study, we could diagnose primary hepatic malignant tumors using miRNA expression patterns. Moreover, the difference in miRNA expression between SVR-HCC and HCV-HCC can be important information not only for identifying cases that are prone to carcinogenesis after being cured with antiviral agents, but also for uncovering the mechanism by which some carcinogenic potential remains even after persistent virus infection has disappeared.
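For readers unfamiliar with the AUC figures quoted above, the following minimal sketch shows one standard way the statistic is defined: the fraction of (positive, negative) pairs in which the positive case receives the higher classifier score, with ties counted as half. The function name `auc` and the example scores are hypothetical; the abstract does not describe how the classifier scores were produced or validated.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for two groups (not data from the study).
group_a = [0.9, 0.8, 0.4]
group_b = [0.5, 0.3, 0.2]
print(auc(group_a, group_b))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values such as 0.754 or 0.811 indicate moderate discrimination.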

Assuntos

Carcinoma Hepatocelular , Hepatite C , Neoplasias Hepáticas , MicroRNAs , Humanos , Neoplasias Hepáticas/diagnóstico , Neoplasias Hepáticas/genética , Neoplasias Hepáticas/terapia , Carcinoma Hepatocelular/diagnóstico , Carcinoma Hepatocelular/genética , Carcinoma Hepatocelular/terapia , MicroRNAs/genética , Hepacivirus/genética , Carcinogênese
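
The abstract above reports classification performance as AUC values. As a reminder of what that number measures, here is a minimal sketch that computes an AUC from classifier scores using the Mann-Whitney formulation (the fraction of positive/negative pairs ranked correctly). The function name `auc_from_scores` and the toy scores are illustrative assumptions; they are not the study's classifier or data.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for the positive class, 0 for the negative class.
    A tie between a positive and a negative score counts as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six samples (not data from the study).
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)  # 8 of 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.754 to 0.811.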

17.

Application note: TDbasedUFE and TDbasedUFEadv: bioconductor packages to perform tensor decomposition based unsupervised feature extraction.

Taguchi, Y-H; Turki, Turki.

Front Artif Intell ; 6: 1237542, 2023.

Article in English | MEDLINE | ID: mdl-37719083

ABSTRACT

Motivation: Tensor decomposition (TD)-based unsupervised feature extraction (FE) has proven effective for a wide range of bioinformatics applications, from biomarker identification to the identification of disease-causing genes and drug repositioning. However, TD-based unsupervised FE has failed to gain widespread acceptance owing to the lack of user-friendly tools for non-experts. Results: We developed two Bioconductor packages, TDbasedUFE and TDbasedUFEadv, that enable researchers unfamiliar with TD to use TD-based unsupervised FE. The packages facilitate the identification of differentially expressed genes and multiomics analysis. TDbasedUFE was found to outperform two state-of-the-art methods, DESeq2 and DIABLO. Availability and implementation: TDbasedUFE and TDbasedUFEadv are freely available as R/Bioconductor packages, which can be accessed at https://bioconductor.org/packages/TDbasedUFE and https://bioconductor.org/packages/TDbasedUFEadv, respectively.

18.

Tensor decomposition discriminates tissues using scATAC-seq.

Taguchi, Y-H; Turki, Turki.

Biochim Biophys Acta Gen Subj ; 1867(6): 130360, 2023 06.

Article in English | MEDLINE | ID: mdl-37003566

ABSTRACT

ATAC-seq is a powerful tool for measuring the landscape structure of a chromosome. scATAC-seq is a recent extension of ATAC-seq performed at the single-cell level. The problem with scATAC-seq is data sparsity: most of the genomic sites are inaccessible. Here, tensor decomposition (TD) was used to fill in missing values. In this study, TD was applied to massive scATAC-seq datasets binned into intervals of approximately 200 bp, and the number of intervals can reach 13,627,618. Currently, no other methods can deal with such large sparse matrices. The proposed method could not only provide UMAP embedding that coincides with tissue specificity, but also select genes associated with various biological enrichment terms and transcription factor targeting. This suggests that TD is a useful tool to process a large sparse matrix generated from scATAC-seq.

Subjects

Chromatin , Genome , Gene Expression Regulation , Transcription Factors/metabolism
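
To make the sparsity issue in the abstract above concrete, the following sketch bins read positions into fixed 200 bp intervals and stores only the nonzero counts. It illustrates the data layout only, not the tensor decomposition step; the function names, the tuple layout of `fragments`, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict

BIN_SIZE = 200  # bp; the abstract mentions intervals of approximately 200 bp


def bin_fragments(fragments, bin_size=BIN_SIZE):
    """Count reads per (chromosome, bin, cell), keeping only nonzero entries.

    fragments: iterable of (chromosome, position, cell_id) tuples.
    Returns a dict mapping (chromosome, bin_index, cell_id) -> read count.
    """
    counts = defaultdict(int)
    for chrom, pos, cell in fragments:
        counts[(chrom, pos // bin_size, cell)] += 1
    return dict(counts)


def density(counts, n_bins, n_cells):
    """Fraction of the full bins-by-cells matrix that is nonzero."""
    return len(counts) / (n_bins * n_cells)


# Hypothetical reads from two cells on one chromosome.
toy_fragments = [
    ("chr1", 150, "cellA"),
    ("chr1", 199, "cellA"),
    ("chr1", 200, "cellB"),
    ("chr1", 1050, "cellA"),
]
toy_counts = bin_fragments(toy_fragments)
toy_density = density(toy_counts, n_bins=6, n_cells=2)
```

With millions of intervals and most entries equal to zero, a dictionary (or any sparse matrix format) keeps memory proportional to the number of observed reads rather than to the full matrix size.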

19.

A new machine learning based computational framework identifies therapeutic targets and unveils influential genes in pancreatic islet cells.

Turki, Turki; Taguchi, Y-H.

Gene ; 853: 147038, 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36503891

ABSTRACT

Pancreatic islets comprise a group of cells that produce hormones regulating blood glucose levels. In particular, the alpha and beta islet cells produce glucagon and insulin to stabilize blood glucose. When beta islet cells are dysfunctional, insulin is not secreted, inducing a glucose metabolic disorder. Identifying effective therapeutic targets against the disease is a complicated task and is not yet conclusive. To close the wide gap between understanding the molecular mechanism of pancreatic islet cells and providing effective therapeutic targets, we present a computational framework to identify potential therapeutic targets against pancreatic disorders. First, we downloaded three transcriptome expression profiling datasets pertaining to pancreatic islet cells (GSE87375, GSE79457, GSE110154) from the Gene Expression Omnibus database. For each dataset, we extracted expression profiles for two cell types. We then provided these expression profiles along with the cell types to our proposed constrained optimization problem of a support vector machine and to other existing methods, selecting important genes from the expression profiles. Finally, we performed (1) an evaluation from a classification perspective, which showed the superiority of our methods over the baseline; and (2) an enrichment analysis, which indicated that our methods achieved better outcomes. Results for the three datasets included 44 unique genes and 10 unique transcription factors (SP1, HDAC1, EGR1, E2F1, AR, STAT6, RELA, SP3, NFKB1, and ESR1) which are reportedly related to pancreatic islet functions, diseases, and therapeutic targets.

Subjects

Insulin-Secreting Cells , Islets of Langerhans , Blood Glucose/metabolism , Islets of Langerhans/metabolism , Insulin/genetics , Insulin/metabolism , Glucagon , Gene Expression Profiling , Insulin-Secreting Cells/metabolism
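
The abstract above describes selecting important genes by feeding expression profiles and cell-type labels to a support vector machine. The authors' constrained formulation is not given in the abstract, so the sketch below substitutes a plain linear SVM (L2-regularised hinge loss, trained by full-batch subgradient descent) and ranks genes by the absolute value of their weights. All names, hyperparameters, and the toy expression values are assumptions for illustration only.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    X: list of feature vectors (one per cell); y: labels in {-1, +1}.
    Returns (weights, bias).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(w):
    """Feature indices sorted by decreasing absolute weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Hypothetical expression of three genes in six cells; only gene 0
# differs between the two cell types.
toy_X = [
    [2.0, 0.1, -0.1], [1.8, -0.2, 0.1], [2.2, 0.0, 0.0],     # cell type +1
    [-2.0, 0.1, 0.1], [-1.9, -0.1, -0.1], [-2.1, 0.2, 0.0],  # cell type -1
]
toy_y = [1, 1, 1, -1, -1, -1]
weights, bias = train_linear_svm(toy_X, toy_y)
ranking = rank_features(weights)  # gene indices, most influential first
```

Ranking by weight magnitude is only one of several ways to derive gene importance from a linear classifier; it assumes the features are on comparable scales.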

20.

Features extracted using tensor decomposition reflect the biological features of the temporal patterns of human blood multimodal metabolome.

Fujita, Suguru; Karasawa, Yasuaki; Hironaka, Ken-Ichi; Taguchi, Y-H; Kuroda, Shinya.

PLoS One ; 18(2): e0281594, 2023.

Article in English | MEDLINE | ID: mdl-36791130

ABSTRACT

High-throughput omics technologies have enabled the profiling of entire biological systems. For the biological interpretation of such omics data, two types of analysis have been used: hypothesis-driven analysis and data-driven analysis, the latter including tensor decomposition. Both have their own advantages and disadvantages and are mutually complementary; however, a direct comparison of the two for omics data has rarely been made. We applied tensor decomposition (TD) to a dataset representing changes in the concentrations of 562 blood molecules at 14 time points in 20 healthy human subjects after ingestion of 75 g oral glucose. We characterized each molecule by individual dependence (constant or variable) and time dependence (later peak or early peak). Three of the four features extracted by TD were characterized by our previous hypothesis-driven study, indicating that TD can extract some of the same features obtained by hypothesis-driven analysis in a non-biased manner. In contrast to the years taken for our previous hypothesis-driven analysis, the data-driven analysis in this study took days, indicating that TD can extract biological features in a non-biased manner without the time-consuming process of hypothesis generation.

Subjects

Blood , Metabolome , Humans , Blood Chemical Analysis
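
The dataset in the abstract above is naturally a three-way tensor (562 molecules by 14 time points by 20 subjects). The abstract does not say which TD variant was used, so the sketch below assumes a higher-order SVD and computes only its leading component: for each mode, the leading left singular vector of that mode's unfolding, found by power iteration. The function names and the tiny rank-1 toy tensor are illustrative assumptions.

```python
import math


def unfold(T, mode):
    """Rows of the mode-`mode` unfolding of a 3-way tensor T[i][j][k]."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    if mode == 0:
        return [[T[i][j][k] for j in range(J) for k in range(K)] for i in range(I)]
    if mode == 1:
        return [[T[i][j][k] for i in range(I) for k in range(K)] for j in range(J)]
    return [[T[i][j][k] for i in range(I) for j in range(J)] for k in range(K)]


def leading_singular_vector(rows, iters=200):
    """Leading left singular vector of a matrix, by power iteration on A A^T.

    Starts from a uniform vector, so it assumes the leading vector is not
    orthogonal to the all-ones direction.
    """
    n, m = len(rows), len(rows[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        atv = [sum(v[r] * rows[r][c] for r in range(n)) for c in range(m)]  # A^T v
        w = [sum(rows[r][c] * atv[c] for c in range(m)) for r in range(n)]  # A (A^T v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Toy rank-1 tensor: molecule profile a, time profile b, subject profile c.
a, b, c = [1.0, 2.0, 2.0], [3.0, 4.0], [2.0, 1.0, 2.0]
toy_T = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]

u_molecule = leading_singular_vector(unfold(toy_T, 0))
u_time = leading_singular_vector(unfold(toy_T, 1))
u_subject = leading_singular_vector(unfold(toy_T, 2))
```

For this rank-1 example each recovered vector is the corresponding input profile scaled to unit length, which mirrors how the study reads the time-mode and subject-mode vectors as time dependence and individual dependence.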
