2.
Genome Biol ; 22(1): 332, 2021 12 06.
Article in English | MEDLINE | ID: mdl-34872606

ABSTRACT

BACKGROUND: Cytosine modifications in DNA such as 5-methylcytosine (5mC) underlie a broad range of developmental processes, maintain cellular lineage specification, and can define or stratify types of cancer and other diseases. However, the wide variety of approaches available to interrogate these modifications has created a need for harmonized materials, methods, and rigorous benchmarking to improve genome-wide methylome sequencing applications in clinical and basic research. Here, we present a multi-platform assessment and cross-validated resource for epigenetics research from the FDA's Epigenomics Quality Control Group. RESULTS: Each sample is processed in multiple replicates by three whole-genome bisulfite sequencing (WGBS) protocols (TruSeq DNA methylation, Accel-NGS MethylSeq, and SPLAT), oxidative bisulfite sequencing (TrueMethyl), enzymatic deamination method (EMSeq), targeted methylation sequencing (Illumina Methyl Capture EPIC), single-molecule long-read nanopore sequencing from Oxford Nanopore Technologies, and 850k Illumina methylation arrays. After rigorous quality assessment and comparison to Illumina EPIC methylation microarrays and testing on a range of algorithms (Bismark, BitmapperBS, bwa-meth, and BitMapperBS), we find overall high concordance between assays, but also differences in efficiency of read mapping, CpG capture, coverage, and platform performance, and variable performance across 26 microarray normalization algorithms. CONCLUSIONS: The data provided herein can guide the use of these DNA reference materials in epigenomics research, as well as provide best practices for experimental design in future studies. By leveraging seven human cell lines that are designated as publicly available reference materials, these data can be used as a baseline to advance epigenomics research.


Subjects
Epigenesis, Genetic , Epigenomics/methods , Quality Control , 5-Methylcytosine , Algorithms , CpG Islands , DNA/genetics , DNA Methylation , Epigenome , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment , Sequence Analysis, DNA/methods , Sulfites , Whole Genome Sequencing/methods
3.
Front Cell Dev Biol ; 9: 754507, 2021.
Article in English | MEDLINE | ID: mdl-34722540

ABSTRACT

Extrinsic factors such as expression of PD-L1 (programmed death-ligand 1) in the tumor microenvironment (TME) have been shown to correlate with responses to checkpoint blockade therapy. More recently, two intrinsic factors related to tumor genetics, microsatellite instability (MSI) and tumor mutation burden (TMB), have been linked to high response rates to checkpoint blockade drugs. These response rates led to the first tissue-agnostic approval of any cancer therapy by the FDA for the treatment of metastatic, MSI-H tumors with anti-PD-1 immunotherapy. But there are still very few studies focusing on the association of miRNAs with immune therapy through checkpoint inhibitors. Our team sought to explore the biology of such tumors further and suggest potential companion therapeutics to current checkpoint inhibitors. Pearson correlation analysis revealed 41 total miRNAs correlated with mutation burden, 62 miRNAs correlated with MSI, and 17 miRNAs correlated with PD-L1 expression. Three miRNAs were correlated with all three of these tumor features as well as M1 macrophage polarization. No miRNAs in any group were associated with overall survival. TGF-β was predicted to be influenced by these three miRNAs (p = 0.008). Exploring miRNA targets as companions to treatment by immune checkpoint blockade revealed three potential miRNA targets predicted to impact TGF-β. M1 macrophage polarization state was also associated with tumors predicted to respond to therapy by immune checkpoint blockade.
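
The correlation screen described in this abstract can be illustrated with a minimal sketch. It computes a Pearson coefficient between one miRNA's expression values and a tumor feature such as mutation burden. The function name `pearson_r` and the example numbers are hypothetical and are not taken from the paper, which does not describe its software.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: expression of one miRNA and mutation burden per tumor.
mirna_expression = [2.1, 3.4, 1.8, 4.0, 2.9]
mutation_burden = [5.0, 9.0, 4.0, 12.0, 7.5]
r = pearson_r(mirna_expression, mutation_burden)
```

In a screen over many miRNAs, each coefficient would normally be paired with a p-value and a multiple-testing correction; that step is omitted here.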

4.
Molecules ; 26(18)2021 Sep 17.
Article in English | MEDLINE | ID: mdl-34577130

ABSTRACT

One in five cancers is attributed to infectious agents, and the extent of the impact on the initiation, progression, and disease outcomes may be underestimated. Infection-associated cancers are commonly attributed to viral, and to a lesser extent, parasitic and bacterial etiologies. There is growing evidence that microbial community variation rather than a single agent can influence cancer development, progression, response to therapy, and outcome. We evaluated microbial sequences from a subset of infection-associated cancers, namely head and neck squamous cell carcinoma (HNSC), liver hepatocellular carcinoma (LIHC), and stomach adenocarcinoma (STAD), from The Cancer Genome Atlas (TCGA). A total of 470 paired tumor and adjacent normal samples were analyzed. In STAD, concurrent presence of EBV and Selenomonas sputigena with a high diversity index were associated with poorer survival (HR: 2.23, 95% CI 1.26-3.94, p = 0.006 and HR: 2.31, 95% CI 1.1-4.9, p = 0.03, respectively). In LIHC, lower microbial diversity was associated with poorer overall survival (HR: 2.57, 95% CI 1.2-5.5, p = 0.14). Bacterial within-sample diversity correlates with overall survival in infection-associated cancers in a subset of TCGA cohorts.
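
The abstract refers to a within-sample diversity index without naming it. As an illustration only, the sketch below computes the Shannon index, a common choice for within-sample (alpha) diversity. Using Shannon here is an assumption, and the function name and read counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts.

    Assumption: the abstract does not say which diversity index was used;
    Shannon is shown only as a familiar example.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per bacterial taxon in two samples.
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # dominated by a single taxon
```

A sample with evenly distributed taxa scores higher than one dominated by a single taxon, which is the sense in which "lower diversity" is used above.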


Subjects
Liver Neoplasms , Squamous Cell Carcinoma of Head and Neck , Stomach Neoplasms , Biomarkers, Tumor , Gene Expression Regulation, Neoplastic , Humans , Prognosis
5.
Mol Hum Reprod ; 27(4)2021 03 24.
Article in English | MEDLINE | ID: mdl-33677573

ABSTRACT

Early embryos are vulnerable to environmental insults, such as medications taken by the mother. Due to increasing prevalence of hypercholesterolemia, more women of childbearing potential are taking cholesterol-lowering medications called statins. Previously, we showed that inhibition of the mevalonate pathway by statins impaired mouse preimplantation development, by modulating HIPPO signaling, a key regulator for trophectoderm (TE) lineage specification. Here, we further evaluated molecular events that are altered by mevalonate pathway inhibition during the timeframe of morphogenesis and cell lineage specification. Whole transcriptome analysis revealed that statin treatment dysregulated gene expression underlying multiple processes, including cholesterol biosynthesis, HIPPO signaling, cell lineage specification and endoplasmic reticulum (ER) stress response. We explored mechanisms that link the mevalonate pathway to ER stress, because of its potential impact on embryonic health and development. Upregulation of ER stress-responsive genes was inhibited when statin-treated embryos were supplemented with the mevalonate pathway product, geranylgeranyl pyrophosphate (GGPP). Inhibition of geranylgeranylation was sufficient to upregulate ER stress-responsive genes. However, ER stress-responsive genes were not upregulated by inhibition of ras homolog family member A (RHOA), a geranylgeranylation target, although it interfered with TE specification and blastocyst cavity formation. In contrast, inhibition of Rac family small GTPase 1 (RAC1), another geranylgeranylation target, upregulated ER stress-responsive genes, while it did not impair TE specification or cavity formation. Thus, our study suggests that the mevalonate pathway regulates cellular homeostasis (ER stress repression) and differentiation (TE lineage specification) in preimplantation embryos through GGPP-dependent activation of two distinct small GTPases, RAC1 and RHOA, respectively. 
Translation of the findings to human embryos and clinical settings requires further investigations.


Subjects
Endoplasmic Reticulum Stress , Mevalonic Acid , Animals , Blastocyst/metabolism , Cell Lineage , Embryo, Mammalian , Embryonic Development/physiology , Endoplasmic Reticulum Stress/genetics , Female , Gene Expression Regulation, Developmental , Humans , Mevalonic Acid/pharmacology , Mice
6.
Int J Mol Sci ; 22(2)2021 Jan 09.
Article in English | MEDLINE | ID: mdl-33435397

ABSTRACT

Selenoproteins are a class of proteins with the selenium-containing amino acid selenocysteine (Sec) in their primary structure. Sec is incorporated into selenoproteins via recoding of the stop codon UGA, with specific cis and trans factors required during translation to avoid UGA recognition as a stop codon, including a Sec-specific tRNA, tRNA[Ser]Sec, encoded in mice by the gene Trsp. Whole-body deletion of Trsp in mouse is embryonically lethal, while targeted deletion of Trsp in mice has been used to understand the role of selenoproteins in the health and physiology of various tissues. We developed a mouse model with the targeted deletion of Trsp in brown adipocytes (Trspf/f-Ucp1-Cre+/-), a cell type predominant in brown adipose tissue (BAT) controlling energy expenditure via activation of adaptive thermogenesis, mostly using uncoupling protein 1 (Ucp1). At room temperature, Trspf/f-Ucp1-Cre+/- mice maintain oxygen consumption and Ucp1 expression, with male Trspf/f-Ucp1-Cre+/- mice accumulating more triglycerides in BAT than both female Trspf/f-Ucp1-Cre+/- mice or Trspf/f controls. Acute cold exposure neither reduced core body temperature nor changed the expression of selenoprotein iodothyronine deiodinase type II (Dio2), a marker of adaptive thermogenesis, in Trspf/f-Ucp1-Cre+/- mice. Microarray analysis of BAT from Trspf/f-Ucp1-Cre+/- mice revealed glutathione S-transferase alpha 3 (Gsta3) and ELMO domain containing 2 (Elmod2) as the transcripts most affected by the loss of Trsp. Male Trspf/f-Ucp1-Cre+/- mice showed mild hypothyroidism while downregulating thyroid hormone-responsive genes Thrsp and Tshr in their BATs. In summary, modest changes in the BAT of Trspf/f-Ucp1-Cre +/- mice implicate a mild thyroid hormone dysfunction in brown adipocytes.


Subjects
Adipocytes, Brown/metabolism , Selenoproteins/metabolism , Thermogenesis , Adipose Tissue, Brown/metabolism , Animals , Biosynthetic Pathways , Cells, Cultured , Cold-Shock Response , Energy Metabolism , Female , Gene Deletion , Male , Mice , Mice, Inbred C57BL , RNA, Transfer, Amino Acid-Specific/genetics , Uncoupling Protein 1/genetics
7.
BMC Bioinformatics ; 21(Suppl 9): 523, 2020 Dec 03.
Article in English | MEDLINE | ID: mdl-33272199

ABSTRACT

Cancer is one of the leading causes of morbidity and mortality worldwide. Microbiological infections account for up to 20% of the total global cancer burden. The human microbiota within each organ system is distinct, and its compositional variation and interactions with the human host are known to have both detrimental and beneficial effects on tumor progression. With the advent of next generation sequencing (NGS) technologies, data generated from NGS are being used for pathogen detection in cancer. Numerous bioinformatics computational frameworks have been developed to study viral information from host-sequencing data and can be adapted to bacterial studies. This review highlights existing popular computational frameworks that utilize NGS data as input to decipher microbial composition, and whose output can predict functional compositional differences with clinically relevant applicability in the development of treatment and prevention strategies.


Subjects
High-Throughput Nucleotide Sequencing , Microbiota/genetics , Neoplasms/microbiology , Organ Specificity/genetics , Computational Biology , Humans
8.
Comput Struct Biotechnol J ; 18: 631-641, 2020.
Article in English | MEDLINE | ID: mdl-32257046

ABSTRACT

Identification of microbial composition directly from tumor tissue permits studying the relationship between microbial changes and cancer pathogenesis. We interrogated bacterial presence in tumor and adjacent normal tissue strictly in pairs utilizing human whole exome sequencing to generate microbial profiles. Profiles were generated for 813 cases from stomach, liver, colon, rectal, lung, head & neck, cervical and bladder TCGA cohorts. Core microbiota examination revealed twelve taxa to be common across the nine cancer types at all classification levels. Paired analyses demonstrated significant differences in bacterial shifts between tumor and adjacent normal tissue across stomach, colon, lung squamous cell, and head & neck cohorts, whereas little or no differences were evident in liver, rectal, lung adenocarcinoma, cervical and bladder cancer cohorts in adjusted models. Helicobacter pylori in stomach and Bacteroides vulgatus in colon were found to be significantly higher in adjacent normal compared to tumor tissue after false discovery rate correction. Computational results were validated with tissue from an independent population by species-specific qPCR showing similar patterns of co-occurrence among Fusobacterium nucleatum and Selenomonas sputigena in gastric samples. This study demonstrates the ability to identify bacteria differential composition derived from human tissue whole exome sequences. Taken together our results suggest the microbial profiles shift with advanced disease and that the microbial composition of the adjacent tissue can be indicative of cancer stage disease progression.

9.
Reprod Toxicol ; 91: 74-91, 2020 01.
Article in English | MEDLINE | ID: mdl-31711903

ABSTRACT

Pluripotent stem cells recapitulate many aspects of embryogenesis in vitro. Here, we established a novel culture system to differentiate human embryonic stem cell aggregates (HESCA), and evaluated its utility for teratogenicity assessment. Culture of HESCA with modulators of developmental signals induced morphogenetic and molecular changes associated with differentiation of the paraxial mesoderm and neuroectoderm. To examine impact of teratogenic exposures on HESCA differentiation, 18 compounds were tested, for which adequate information on in vivo plasma concentrations is available. HESCA treated with each compound were examined for gross morphology and transcript levels of 15 embryogenesis regulator genes. Significant alterations in the transcript levels were observed for 94% (15/16) of the teratogenic exposures within 5-fold margin, whereas no alteration was observed for 92% (11/12) of the non-teratogenic exposures. Our study demonstrates that transcriptional changes in HESCA serve as predictive indicator of teratogenicity in a manner comparable to in vivo exposure levels.


Subjects
Cell Culture Techniques , Human Embryonic Stem Cells/drug effects , Teratogens/toxicity , Cell Aggregation , Cell Differentiation , Cells, Cultured , Embryonic Development/drug effects , Embryonic Development/genetics , Gene Expression Regulation, Developmental/drug effects , Human Embryonic Stem Cells/metabolism , Humans , Teratogenesis
10.
Nutrients ; 11(11)2019 Oct 26.
Article in English | MEDLINE | ID: mdl-31717805

ABSTRACT

Selenium is a nonmetal trace element that is critical for several redox reactions and utilized to produce the amino acid selenocysteine (Sec), which can be incorporated into selenoproteins. Selenocysteine lyase (SCL) is an enzyme which decomposes Sec into selenide and alanine, releasing the selenide to be further utilized to synthesize new selenoproteins. Disruption of the selenocysteine lyase gene (Scly) in mice (Scly-/- or Scly KO) led to obesity with dyslipidemia, hyperinsulinemia, glucose intolerance and lipid accumulation in the hepatocytes. As the liver is a central regulator of glucose and lipid homeostasis, as well as selenium metabolism, we aimed to pinpoint hepatic molecular pathways affected by the Scly gene disruption. Using RNA sequencing and metabolomics, we identified differentially expressed genes and metabolites in the livers of Scly KO mice. Integrated omics revealed that biological pathways related to amino acid metabolism, particularly alanine and glycine metabolism, were affected in the liver by disruption of Scly in mice with selenium adequacy. We further confirmed that hepatic glycine levels are elevated in male, but not in female, Scly KO mice. In conclusion, our results reveal that Scly participates in the modulation of hepatic amino acid metabolic pathways.


Subjects
Amino Acids/metabolism , Lyases , Metabolome/genetics , Transcriptome/genetics , Animals , Female , Lyases/genetics , Lyases/metabolism , Lyases/physiology , Male , Metabolomics , Mice , Mice, Knockout , Selenium/metabolism
11.
Front Oncol ; 9: 720, 2019.
Article in English | MEDLINE | ID: mdl-31428586

ABSTRACT

Malignant Mesothelioma (MM) is a rare and highly aggressive cancer that develops from mesothelial cells lining the pleura and other internal cavities, and is often associated with asbestos exposure. To date, no effective treatments have been made available for this pathology. Herein, we propose a novel immunotherapeutic approach based on a unique vaccine targeting a series of antigens that we found expressed in different MM tumors, but largely undetectable in normal tissues. This vaccine, that we term p-Tvax, is comprised of a series of immunogenic peptides presented by both MHC-I and -II to generate robust immune responses. The peptides were designed using in silico algorithms that discriminate between highly immunogenic T cell epitopes and other harmful epitopes, such as suppressive regulatory T cell epitopes and autoimmune epitopes. Vaccination of mice with p-Tvax led to antigen-specific immune responses that involved both CD8+ and CD4+ T cells, which exhibited cytolytic activity against MM cells in vitro. In mice carrying MM tumors, p-Tvax increased tumor infiltration of CD4+ T cells. Moreover, combining p-Tvax with an OX40 agonist led to decreased tumor growth and increased survival. Mice treated with this combination immunotherapy displayed higher numbers of tumor-infiltrating CD8+ and CD4+ T cells and reduced T regulatory cells in tumors. Collectively, these data suggest that the combination of p-Tvax with an OX40 agonist could be an effective strategy for MM treatment.

12.
BMC Med Genomics ; 12(Suppl 1): 24, 2019 01 31.
Article in English | MEDLINE | ID: mdl-30704450

ABSTRACT

BACKGROUND: Prognostic signatures are vital to precision medicine. However, development of somatic mutation prognostic signatures for cancers remains a challenge. In this study we developed a novel method for discovering somatic mutation based prognostic signatures. RESULTS: Somatic mutation and clinical data for lung adenocarcinoma (LUAD) and colorectal adenocarcinoma (COAD) from The Cancer Genome Atlas (TCGA) were randomly divided into training (n = 328 for LUAD and 286 for COAD) and validation (n = 167 for LUAD and 141 for COAD) datasets. In the novel method, the log2 ratio of the tumor mutation frequency to the paired normal mutation frequency is computed for each patient and missense mutation. The missense mutation ratios were mean aggregated into gene-level somatic mutation profiles. The somatic mutations were assessed using univariate Cox analysis on the LUAD and COAD training sets separately. Stepwise multivariate Cox analysis resulted in a final gene prognostic signature for LUAD and COAD. Performance was compared to gene prognostic signatures generated using the same pipeline but with different somatic mutation profile representations based on tumor mutation frequency, binary calls, and gene-gene network normalization. Signature high-risk LUAD and COAD cases had worse overall survival compared to the signature low-risk cases in the validation set (log-rank test p-value = 0.0101 for LUAD and 0.0314 for COAD) using mutation tumor frequency ratio (MFR) profiles, while all other methods, including gene-gene network normalization, gave statistically insignificant stratification (log-rank test p-value ≥0.05). Most of the genes in the final gene signatures using MFR profiles are cancer-related based on network and literature analysis. CONCLUSIONS: We demonstrated the robustness of MFR profiles and their potential to be a powerful prognostic tool in cancer. The results are robust according to validation testing and the selected genes are biologically relevant.
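
The profile construction described in this abstract (a log2 tumor-to-normal frequency ratio per missense mutation, mean-aggregated per gene) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the input layout, and the pseudocount used to avoid division by zero are all hypothetical.

```python
import math
from collections import defaultdict


def mutation_frequency_ratio(tumor_freq, normal_freq, pseudocount=1e-3):
    """log2 ratio of tumor to paired-normal mutation frequency.

    The pseudocount is an assumption added here so that a zero frequency
    does not cause division by zero or log2(0); the abstract does not say
    how such cases were handled.
    """
    return math.log2((tumor_freq + pseudocount) / (normal_freq + pseudocount))


def gene_level_profile(mutations, pseudocount=1e-3):
    """Mean-aggregate per-mutation ratios into one value per gene.

    `mutations` is an iterable of (gene, tumor_freq, normal_freq) tuples
    for the missense mutations of a single patient (hypothetical layout).
    """
    ratios_by_gene = defaultdict(list)
    for gene, tumor_freq, normal_freq in mutations:
        ratio = mutation_frequency_ratio(tumor_freq, normal_freq, pseudocount)
        ratios_by_gene[gene].append(ratio)
    return {gene: sum(r) / len(r) for gene, r in ratios_by_gene.items()}


# Hypothetical patient with two missense mutations in one gene and one in another.
patient = [("GENE_A", 0.5, 0.125), ("GENE_A", 0.25, 0.25), ("GENE_B", 0.1, 0.0)]
profile = gene_level_profile(patient)
```

The resulting per-gene values would then serve as covariates in the univariate and stepwise multivariate Cox analyses mentioned above, which are not sketched here.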


Subjects
Adenocarcinoma of Lung/diagnosis , Adenocarcinoma of Lung/genetics , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , Genomics , Mutation , Aged , Female , Genetic Predisposition to Disease/genetics , Humans , Male , Middle Aged , Prognosis , Risk Assessment
13.
Physiol Genomics ; 50(7): 479-494, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29652636

ABSTRACT

Alternative splicing of RNA is an underexplored area of transcriptional response. We expect that early changes in alternatively spliced genes may be important for responses to cardiac injury. Hypoxia inducible factor 1 (HIF1) is a key transcription factor that rapidly responds to loss of oxygen through alteration of metabolism and angiogenesis. The goal of this study was to investigate the transcriptional response after myocardial infarction (MI) and to identify novel, hypoxia-driven changes, including alternative splicing. After ligation of the left anterior descending artery in mice, we observed an abrupt loss of cardiac contractility and upregulation of hypoxic signaling. We then performed RNA sequencing on ischemic heart tissue 1 and 3 days after infarct to assess early transcriptional changes and identified 89 transcripts with altered splicing. Of particular interest was the switch in Pkm isoform expression (pyruvate kinase, muscle). The usually predominant Pkm1 isoform was less abundant in ischemic hearts, while Pkm2 and associated splicing factors (hnRNPA1, hnRNPA2B1, Ptbp1) rapidly increased. Despite increased Pkm2 expression, total pyruvate kinase activity remained reduced in ischemic myocardial tissue. We also demonstrated HIF1 binding to PKM by chromatin immunoprecipitation, indicating a direct role for HIF1 in mediating this isoform switch. Our study provides a new, detailed characterization of the early transcriptome after MI. From this analysis, we identified an HIF1-mediated alternative splicing event in the PKM gene. Pkm1 and Pkm2 play distinct roles in glycolytic metabolism and the upregulation of Pkm2 is likely to have important consequences for ATP synthesis in infarcted cardiac muscle.


Subjects
Gene Expression Profiling , Hypoxia-Inducible Factor 1/genetics , Myocardial Infarction/genetics , Pyruvate Kinase/genetics , Alternative Splicing , Animals , Glycolysis/genetics , Humans , Hypoxia , Hypoxia-Inducible Factor 1/metabolism , Isoenzymes/genetics , Isoenzymes/metabolism , Male , Mice, Inbred C57BL , Myocardial Infarction/metabolism , Myocardial Infarction/physiopathology , Pyruvate Kinase/metabolism
14.
Int J Mol Sci ; 16(1): 1466-81, 2015 Jan 08.
Article in English | MEDLINE | ID: mdl-25580537

ABSTRACT

The discovery of novel microRNA (miRNA) and piwi-interacting RNA (piRNA) is an important task for the understanding of many biological processes. Most of the available miRNA and piRNA identification methods are dependent on the availability of the organism's genome sequence and the quality of its annotation. Therefore, an efficient prediction method based solely on the short RNA reads and requiring no genomic information is highly desirable. In this study, we propose an approach that relies primarily on the nucleotide composition of the read and does not require reference genomes of related species for prediction. Using an empirical Bayesian kernel method and the error correcting output codes framework, compact models suitable for large-scale analyses are built on databases of known mature miRNAs and piRNAs. We found that the usage of an L1-based Gaussian kernel can double the true positive rate compared to the standard L2-based Gaussian kernel. Our approach can increase the true positive rate by at most 60% compared to the existing piRNA predictor based on the analysis of a hold-out test set. Using experimental data, we also show that our approach can detect about an order of magnitude or more known miRNAs than the mature miRNA predictor, miRPlex.
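
The contrast between an L1-based and an L2-based Gaussian kernel can be illustrated with a short sketch. The exact kernel definition is not given in the abstract, so the form below (the usual Gaussian expression with the Euclidean distance swapped for the L1 distance) is an assumption, and the function name and the composition vectors are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0, norm="l2"):
    """Gaussian-form kernel exp(-d(x, y)^2 / (2 * sigma^2)).

    norm="l2" uses the Euclidean distance (the standard Gaussian kernel);
    norm="l1" uses the sum of absolute differences instead. The L1 form is
    an assumed reading of "L1-based Gaussian kernel", not a confirmed one.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if norm == "l1":
        dist = sum(diffs)
    elif norm == "l2":
        dist = math.sqrt(sum(d * d for d in diffs))
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    return math.exp(-(dist ** 2) / (2.0 * sigma ** 2))


# Hypothetical nucleotide-composition vectors (fractions of A, C, G, U) for two reads.
read_a = [0.30, 0.20, 0.25, 0.25]
read_b = [0.10, 0.40, 0.25, 0.25]
k_l1 = gaussian_kernel(read_a, read_b, sigma=0.5, norm="l1")
k_l2 = gaussian_kernel(read_a, read_b, sigma=0.5, norm="l2")
```

Because the L1 distance is never smaller than the L2 distance, the L1 variant assigns equal or lower similarity to the same pair at a fixed `sigma`.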


Subjects
MicroRNAs/metabolism , RNA, Small Interfering/metabolism , Animals , Caenorhabditis elegans/genetics , Databases, Genetic , Drosophila melanogaster/genetics , Genome , MicroRNAs/genetics , Normal Distribution , RNA, Small Interfering/genetics , ROC Curve , Support Vector Machine
15.
Genome Biol ; 15(10): 500, 2014.
Article in English | MEDLINE | ID: mdl-25344330

ABSTRACT

MiRNAs play important roles in many diseases including cancers. However, computational prediction of miRNA target genes is challenging and the accuracies of existing methods remain poor. We report mirMark, a new machine learning-based method of miRNA target prediction at the site and UTR levels. This method uses experimentally verified miRNA targets from miRecords and mirTarBase as training sets and considers over 700 features. By combining Correlation-based Feature Selection with a variety of statistical or machine learning methods for the site- and UTR-level classifiers, mirMark significantly improves the overall predictive performance compared to existing publicly available methods. MirMark is available from https://github.com/lanagarmire/MirMark.


Subjects
Artificial Intelligence , MicroRNAs/physiology , Software , Computational Biology/methods , Untranslated Regions
16.
BMC Genomics ; 14 Suppl 2: S6, 2013.
Article in English | MEDLINE | ID: mdl-23445533

ABSTRACT

BACKGROUND: Classification is the problem of assigning each input object to one of a finite number of classes. This problem has been extensively studied in machine learning and statistics, and there are numerous applications to bioinformatics as well as many other fields. Building a multiclass classifier has been a challenge, where the direct approach of altering the binary classification algorithm to accommodate more than two classes can be computationally too expensive. Hence the indirect approach of using binary decomposition has been commonly used, in which retrieving the class posterior probabilities from the set of binary posterior probabilities given by the individual binary classifiers has been a major issue. METHODS: In this work, we present an extension of a recently introduced probabilistic kernel-based learning algorithm called the Classification Relevance Units Machine (CRUM) to the multiclass setting to increase its applicability. The extension is achieved under the error correcting output codes framework. The probabilistic outputs of the binary CRUMs are preserved using a proposed linear-time decoding algorithm, an alternative to the generalized Bradley-Terry (GBT) algorithm whose application to large-scale prediction settings is prohibited by its computational complexity. The resulting classifier is called the Multiclass Relevance Units Machine (McRUM). RESULTS: The evaluation of McRUM on a variety of real small-scale benchmark datasets shows that our proposed Naïve decoding algorithm is computationally more efficient than the GBT algorithm while maintaining a similar level of predictive accuracy. Then a set of experiments on a larger scale dataset for small ncRNA classification have been conducted with Naïve McRUM and compared with the Gaussian and linear SVM. 
Although McRUM's predictive performance is slightly lower than the Gaussian SVM, the results show that the similar level of true positive rate can be achieved by sacrificing false positive rate slightly. Furthermore, McRUM is computationally more efficient than the SVM, which is an important factor for large-scale analysis. CONCLUSIONS: We have proposed McRUM, a multiclass extension of binary CRUM. McRUM with Naïve decoding algorithm is computationally efficient in run-time and its predictive performance is comparable to the well-known SVM, showing its potential in solving large-scale multiclass problems in bioinformatics and other fields of study.


Subjects
Algorithms , Computational Biology/methods , RNA, Untranslated/classification
17.
ACM SIGAPP Appl Comput Rev ; 12(4): 8-20, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-24163645

ABSTRACT

Phosphorylation is an important post-translational modification of proteins that is essential to the regulation of many cellular processes. Although most of the phosphorylation sites discovered in protein sequences have been identified experimentally, the in vivo and in vitro discovery of the sites is an expensive, time-consuming and laborious task. Therefore, the development of computational methods for prediction of protein phosphorylation sites has drawn considerable attention. In this work, we present a kernel-based probabilistic Classification Relevance Units Machine (CRUM) for in silico phosphorylation site prediction. In comparison with the popular Support Vector Machine (SVM) CRUM shows comparable predictive performance and yet provides a more parsimonious model. This is desirable since it leads to a reduction in prediction run-time, which is important in predictions on large-scale data. Furthermore, the CRUM training algorithm has lower run-time and memory complexity and has a simpler parameter selection scheme than the Relevance Vector Machine (RVM) learning algorithm. To further investigate the viability of using CRUM in phosphorylation site prediction, we construct multiple CRUM predictors using different combinations of three phosphorylation site features - BLOSUM encoding, disorder, and amino acid composition. The predictors are evaluated through cross-validation and the results show that CRUM with BLOSUM feature is among the best performing CRUM predictors in both cross-validation and benchmark experiments. A comparative study with existing prediction tools in an independent benchmark experiment suggests possible direction for further improving the predictive performance of CRUM predictors.
