Search | Virtual Health Library, Dentistry. Uruguay

1.

Enter the Matrix: Factorization Uncovers Knowledge from Omics.

Stein-O'Brien, Genevieve L; Arora, Raman; Culhane, Aedin C; Favorov, Alexander V; Garmire, Lana X; Greene, Casey S; Goff, Loyal A; Li, Yifeng; Ngom, Aloune; Ochs, Michael F; Xu, Yanxun; Fertig, Elana J.

Trends Genet ; 34(10): 790-805, 2018 10.

Article in English | MEDLINE | ID: mdl-30143323

ABSTRACT

Omics data contain signals from the molecular, physical, and kinetic inter- and intracellular interactions that control biological systems. Matrix factorization (MF) techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in applications ranging from pathway discovery to timecourse analysis. We review exemplary applications of MF for systems-level analyses. We discuss appropriate applications of these methods, their limitations, and focus on the analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with MF enables discovery from high-throughput data beyond the limits of current biological knowledge - answering questions from high-dimensional data that we have not yet thought to ask.

Subject(s)

Data Interpretation, Statistical; Genomics/statistics & numerical data; Proteomics/statistics & numerical data; Algorithms; Humans; Systems Biology/statistics & numerical data

2.

Evaluating optimal therapy robustness by virtual expansion of a sample population, with a case study in cancer immunotherapy.

Barish, Syndi; Ochs, Michael F; Sontag, Eduardo D; Gevertz, Jana L.

Proc Natl Acad Sci U S A ; 114(31): E6277-E6286, 2017 08 01.

Article in English | MEDLINE | ID: mdl-28716945

ABSTRACT

Cancer is a highly heterogeneous disease, exhibiting spatial and temporal variations that pose challenges for designing robust therapies. Here, we propose the VEPART (Virtual Expansion of Populations for Analyzing Robustness of Therapies) technique as a platform that integrates experimental data, mathematical modeling, and statistical analyses for identifying robust optimal treatment protocols. VEPART begins with time course experimental data for a sample population, and a mathematical model fit to aggregate data from that sample population. Using nonparametric statistics, the sample population is amplified and used to create a large number of virtual populations. At the final step of VEPART, robustness is assessed by identifying and analyzing the optimal therapy (perhaps restricted to a set of clinically realizable protocols) across each virtual population. As proof of concept, we have applied the VEPART method to study the robustness of treatment response in a mouse model of melanoma subject to treatment with immunostimulatory oncolytic viruses and dendritic cell vaccines. Our analysis (i) showed that every scheduling variant of the experimentally used treatment protocol is fragile (nonrobust) and (ii) discovered an alternative region of dosing space (lower oncolytic virus dose, higher dendritic cell dose) for which a robust optimal protocol exists.

Subject(s)

Cancer Vaccines/immunology; Dendritic Cells/immunology; Immunotherapy/methods; Melanoma/therapy; Models, Theoretical; Oncolytic Virotherapy/methods; Oncolytic Viruses/physiology; Algorithms; Animals; Cell Differentiation/immunology; Computer Simulation; Disease Models, Animal; Melanoma/immunology; Mice; T-Lymphocytes, Cytotoxic/immunology
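
The nonparametric amplification step described in the VEPART abstract can be illustrated with a bootstrap. This is a minimal sketch, not the authors' implementation: it assumes that each virtual population is built by resampling individuals with replacement and averaging their time courses, and the function name and tumor-volume values are hypothetical. The model fitting and therapy optimization steps are omitted.

```python
import random

def bootstrap_virtual_populations(time_courses, n_populations, seed=0):
    """Resample individuals with replacement and return the mean
    time course of each bootstrap replicate (one per virtual population)."""
    rng = random.Random(seed)
    n = len(time_courses)
    n_times = len(time_courses[0])
    populations = []
    for _ in range(n_populations):
        # Draw n individuals with replacement from the sample population.
        sample = [time_courses[rng.randrange(n)] for _ in range(n)]
        # Aggregate the replicate into a mean time course.
        mean_course = [sum(ind[t] for ind in sample) / n for t in range(n_times)]
        populations.append(mean_course)
    return populations

# Hypothetical tumor volumes for 4 animals at 3 time points.
tumor_volumes = [
    [10.0, 20.0, 40.0],
    [12.0, 25.0, 55.0],
    [9.0, 18.0, 30.0],
    [11.0, 22.0, 47.0],
]
virtual = bootstrap_virtual_populations(tumor_volumes, n_populations=100)
```

In a workflow of this kind, a model would then be fit to each mean time course and the optimal protocol compared across the virtual populations.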

3.

Splice Expression Variation Analysis (SEVA) for inter-tumor heterogeneity of gene isoform usage in cancer.

Afsari, Bahman; Guo, Theresa; Considine, Michael; Florea, Liliana; Kagohara, Luciane T; Stein-O'Brien, Genevieve L; Kelley, Dylan; Flam, Emily; Zambo, Kristina D; Ha, Patrick K; Geman, Donald; Ochs, Michael F; Califano, Joseph A; Gaykalova, Daria A; Favorov, Alexander V; Fertig, Elana J.

Bioinformatics ; 34(11): 1859-1867, 2018 06 01.

Article in English | MEDLINE | ID: mdl-29342249

ABSTRACT

Motivation: Current bioinformatics methods to detect changes in gene isoform usage in distinct phenotypes compare the relative expected isoform usage in phenotypes. These statistics model differences in isoform usage in normal tissues, which have stable regulation of gene splicing. Pathological conditions, such as cancer, can have broken regulation of splicing that increases the heterogeneity of the expression of splice variants. Inferring events with such differential heterogeneity in gene isoform usage requires new statistical approaches. Results: We introduce Splice Expression Variability Analysis (SEVA) to model increased heterogeneity of splice variant usage between conditions (e.g. tumor and normal samples). SEVA uses a rank-based multivariate statistic that compares the variability of junction expression profiles within one condition to the variability within another. Simulated data show that SEVA is unique in modeling heterogeneity of gene isoform usage, and we benchmark SEVA's performance against EBSeq, DiffSplice and rMATS, which model differential isoform usage instead of heterogeneity. We confirm the accuracy of SEVA in identifying known splice variants in head and neck cancer and perform cross-study validation of novel splice variants. A novel comparison of splice variant heterogeneity between subtypes of head and neck cancer demonstrated unanticipated similarity between the heterogeneity of gene isoform usage in HPV-positive and HPV-negative subtypes and anticipated increased heterogeneity among HPV-negative samples with mutations in genes that regulate the splice variant machinery. These results show that SEVA accurately models differential heterogeneity of gene isoform usage from RNA-seq data. Availability and implementation: SEVA is implemented in the R/Bioconductor package GSReg. Contact: bahman@jhu.edu or favorov@sensi.org or ejfertig@jhmi.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

Alternative Splicing; Neoplasms/genetics; Protein Isoforms/genetics; Sequence Analysis, RNA/methods; Software; Computational Biology/methods; Gene Expression Regulation, Neoplastic; Head and Neck Neoplasms/genetics; Humans; Models, Genetic
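
The idea of comparing rank-based variability within one condition to variability within another can be sketched as follows. This is an illustration only and should not be read as the GSReg implementation: it assumes the mean pairwise Kendall-tau distance between samples as the variability measure, the junction expression values are hypothetical, and the significance test is left out.

```python
from itertools import combinations

def kendall_tau_distance(x, y):
    """Fraction of feature pairs whose relative ordering differs
    between two expression profiles."""
    pairs = list(combinations(range(len(x)), 2))
    discordant = sum(1 for i, j in pairs if (x[i] < x[j]) != (y[i] < y[j]))
    return discordant / len(pairs)

def within_group_variability(profiles):
    """Mean pairwise Kendall-tau distance among the samples of one condition."""
    dists = [kendall_tau_distance(a, b) for a, b in combinations(profiles, 2)]
    return sum(dists) / len(dists)

# Hypothetical junction expression profiles (samples x junctions).
normal = [[5, 3, 1], [6, 4, 2], [7, 3, 1]]   # identical junction ordering
tumor = [[5, 3, 1], [1, 4, 6], [3, 7, 2]]    # heterogeneous junction ordering

normal_var = within_group_variability(normal)
tumor_var = within_group_variability(tumor)
```

Here the normal samples share a single junction ordering, so their variability is zero, while the tumor samples disagree on ordering and score higher.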

4.

PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF.

Stein-O'Brien, Genevieve L; Carey, Jacob L; Lee, Wai Shing; Considine, Michael; Favorov, Alexander V; Flam, Emily; Guo, Theresa; Li, Sijia; Marchionni, Luigi; Sherman, Thomas; Sivy, Shawn; Gaykalova, Daria A; McKay, Ronald D; Ochs, Michael F; Colantuoni, Carlo; Fertig, Elana J.

Bioinformatics ; 33(12): 1892-1894, 2017 Jun 15.

Article in English | MEDLINE | ID: mdl-28174896

ABSTRACT

SUMMARY: Non-negative Matrix Factorization (NMF) algorithms associate gene expression with biological processes (e.g. time-course dynamics or disease subtypes). Compared with univariate associations, the relative weights of NMF solutions can obscure biomarkers. Therefore, we developed a novel patternMarkers statistic to extract genes for biological validation and enhanced visualization of NMF results. Finding novel and unbiased gene markers with patternMarkers requires whole-genome data. Therefore, we also developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole genome Bayesian NMF using the sparse, MCMC algorithm, CoGAPS. Additionally, a manual version of the GWCoGAPS algorithm contains analytic and visualization tools including patternMatcher, a Shiny web application. The decomposition in the manual pipeline can be replaced with any NMF algorithm, for further generalization of the software. Using these tools, we find granular brain-region and cell-type specific signatures with corresponding biomarkers in GTEx data, illustrating GWCoGAPS and patternMarkers ascertainment of data-driven biomarkers from whole-genome data. AVAILABILITY AND IMPLEMENTATION: PatternMarkers & GWCoGAPS are in the CoGAPS Bioconductor package (3.5) under the GPL license. CONTACT: gsteinobrien@jhmi.edu or ccolantu@jhmi.edu or ejfertig@jhmi.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms; Gene Expression Profiling/methods; Software; Bayes Theorem; Biomarkers; Humans; Sequence Analysis, RNA/methods

5.

Two clinical phenotypes in polycythemia vera.

Spivak, Jerry L; Considine, Michael; Williams, Donna M; Talbot, Conover C; Rogers, Ophelia; Moliterno, Alison R; Jie, Chunfa; Ochs, Michael F.

N Engl J Med ; 371(9): 808-17, 2014 Aug 28.

Article in English | MEDLINE | ID: mdl-25162887

ABSTRACT

BACKGROUND: Polycythemia vera is the ultimate phenotypic consequence of the V617F mutation in Janus kinase 2 (encoded by JAK2), but the extent to which this mutation influences the behavior of the involved CD34+ hematopoietic stem cells is unknown. METHODS: We analyzed gene expression in CD34+ peripheral-blood cells from 19 patients with polycythemia vera, using oligonucleotide microarray technology after correcting for potential confounding by sex, since the phenotypic features of the disease differ between men and women. RESULTS: Men with polycythemia vera had twice as many up-regulated or down-regulated genes as women with polycythemia vera, in a comparison of gene expression in the patients and in healthy persons of the same sex, but there were 102 genes with differential regulation that was concordant in men and women. When these genes were used for class discovery by means of unsupervised hierarchical clustering, the 19 patients could be divided into two groups that did not differ significantly with respect to age, neutrophil JAK2 V617F allele burden, white-cell count, platelet count, or clonal dominance. However, they did differ significantly with respect to disease duration; hemoglobin level; frequency of thromboembolic events, palpable splenomegaly, and splenectomy; chemotherapy exposure; leukemic transformation; and survival. The unsupervised clustering was confirmed by a supervised approach with the use of a top-scoring-pair classifier that segregated the 19 patients into the same two phenotypic groups with 100% accuracy. CONCLUSIONS: Removing sex as a potential confounder, we identified an accurate molecular method for classifying patients with polycythemia vera according to disease behavior, independently of their JAK2 V617F allele burden, and identified previously unrecognized molecular pathways in polycythemia vera outside the canonical JAK2 pathway that may be amenable to targeted therapy. 
(Funded by the Department of Defense and the National Institutes of Health.)

Subject(s)

Gene Expression; Janus Kinase 2/genetics; Phenotype; Polycythemia Vera/genetics; Aged; Aged, 80 and over; Antigens, CD34; Blood Cell Count; Confounding Factors, Epidemiologic; Female; Gene Expression Regulation; Humans; Janus Kinase 2/metabolism; Male; Metabolic Networks and Pathways; Middle Aged; Oligonucleotide Array Sequence Analysis; Polycythemia Vera/classification; Polycythemia Vera/metabolism; Sex Factors
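
The top-scoring-pair classifier mentioned in the abstract rests on a simple rule: pick the gene pair whose relative ordering flips most consistently between two classes. The sketch below shows that general rule on hypothetical data; it is not the analysis code used in the study, and the function name and expression values are assumptions made for illustration.

```python
from itertools import combinations

def top_scoring_pair(expr, labels):
    """Return (score, i, j) for the gene pair maximizing
    |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    classes = sorted(set(labels))
    assert len(classes) == 2, "the rule is defined for two classes"
    n_genes = len(expr[0])
    best = (-1.0, None, None)
    for i, j in combinations(range(n_genes), 2):
        probs = []
        for c in classes:
            rows = [s for s, lab in zip(expr, labels) if lab == c]
            # Fraction of samples in this class where gene i is below gene j.
            probs.append(sum(1 for s in rows if s[i] < s[j]) / len(rows))
        score = abs(probs[0] - probs[1])
        if score > best[0]:
            best = (score, i, j)
    return best

# Hypothetical expression values (samples x genes) for two groups.
expr = [
    [1, 5, 3], [2, 6, 7], [1, 4, 0],   # group "A"
    [5, 1, 3], [6, 2, 7], [4, 1, 0],   # group "B"
]
labels = ["A", "A", "A", "B", "B", "B"]
score, gene_i, gene_j = top_scoring_pair(expr, labels)
```

A new sample would then be assigned to a group according to whether its value for `gene_i` is below its value for `gene_j`. Because only the ordering within a sample is used, the rule does not depend on between-sample normalization.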

6.

NF-κB and stat3 transcription factor signatures differentiate HPV-positive and HPV-negative head and neck squamous cell carcinoma.

Gaykalova, Daria A; Manola, Judith B; Ozawa, Hiroyuki; Zizkova, Veronika; Morton, Kathryn; Bishop, Justin A; Sharma, Rajni; Zhang, Chi; Michailidi, Christina; Considine, Michael; Tan, Marietta; Fertig, Elana J; Hennessey, Patrick T; Ahn, Julie; Koch, Wayne M; Westra, William H; Khan, Zubair; Chung, Christine H; Ochs, Michael F; Califano, Joseph A.

Int J Cancer ; 137(8): 1879-89, 2015 Oct 15.

Article in English | MEDLINE | ID: mdl-25857630

ABSTRACT

Using high-throughput analyses and the TRANSFAC database, we characterized TF signatures of head and neck squamous cell carcinoma (HNSCC) subgroups by inferential analysis of target gene expression, correcting for the effects of DNA methylation and copy number. Using this discovery pipeline, we determined that human papillomavirus-related (HPV+) and HPV- HNSCC differed significantly based on the activity levels of key TFs including AP1, STATs, NF-κB and p53. Immunohistochemical analysis confirmed that HPV- HNSCC is characterized by co-activated STAT3 and NF-κB pathways and functional studies demonstrate that this phenotype can be effectively targeted with combined anti-NF-κB and anti-STAT therapies. These discoveries correlate strongly with previous findings connecting STATs, NF-κB and AP1 in HNSCC. We identified five top-scoring pair biomarkers from STATs, NF-κB and AP1 pathways that distinguish HPV+ from HPV- HNSCC based on TF activity and validated these biomarkers on TCGA and on independent validation cohorts. We conclude that a novel approach to TF pathway analysis can provide insight into therapeutic targeting of patient subgroups for heterogeneous diseases such as HNSCC.

Subject(s)

Carcinoma, Squamous Cell/genetics; Head and Neck Neoplasms/genetics; NF-kappa B/genetics; Papillomavirus Infections/genetics; STAT3 Transcription Factor/genetics; Carcinoma, Squamous Cell/metabolism; Carcinoma, Squamous Cell/virology; Cell Line, Tumor; DNA Methylation; Gene Expression Regulation, Neoplastic; Head and Neck Neoplasms/metabolism; Head and Neck Neoplasms/virology; Humans; NF-kappa B/metabolism; Oligonucleotide Array Sequence Analysis; Papillomavirus Infections/metabolism; STAT3 Transcription Factor/metabolism; Signal Transduction; Squamous Cell Carcinoma of Head and Neck

7.

Correcting transcription factor gene sets for copy number and promoter methylation variations.

Rathi, Komal S; Gaykalova, Daria A; Hennessey, Patrick; Califano, Joseph A; Ochs, Michael F.

Drug Dev Res ; 75(6): 343-7, 2014 Sep.

Article in English | MEDLINE | ID: mdl-25195578

ABSTRACT

Gene set analysis provides a method to generate statistical inferences across sets of linked genes, primarily using high-throughput expression data. Common gene sets include biological pathways, operons, and targets of transcriptional regulators. In higher eukaryotes, especially when dealing with diseases with strong genetic and epigenetic components such as cancer, copy number loss and gene silencing through promoter methylation can eliminate the possibility that a gene is transcribed. This, in turn, can adversely affect the estimation of transcription factor or pathway activity from a set of target genes, as some of the targets may not be responsive to transcriptional regulation. Here we introduce a simple filtering approach that removes genes from consideration if they show copy number loss or promoter methylation, and demonstrate the improvement in inference of transcription factor activity in a simulated dataset based on the background expression observed in normal head and neck tissue.

Subject(s)

Computational Biology/methods; Gene Dosage; Neoplasms/genetics; Promoter Regions, Genetic; Transcription Factors/genetics; DNA Methylation; Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Humans; Software
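
The filtering approach in this abstract amounts to removing target genes that cannot respond to transcriptional regulation in a given sample. The sketch below shows one way to express that; the function name and gene identifiers are hypothetical, and how copy number loss or promoter methylation is called from the raw data is assumed to happen upstream.

```python
def filter_gene_set(targets, copy_number_loss, promoter_methylated):
    """Drop target genes that show copy number loss or promoter
    methylation, keeping the original order of the remaining genes."""
    silenced = set(copy_number_loss) | set(promoter_methylated)
    return [gene for gene in targets if gene not in silenced]

# Hypothetical transcription factor target set and per-sample calls.
tf_targets = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
lost = ["GENE_B"]
methylated = ["GENE_D", "GENE_X"]

responsive_targets = filter_gene_set(tf_targets, lost, methylated)
```

Transcription factor activity would then be estimated from `responsive_targets` only, so that silenced genes do not dilute the signal.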

8.

Updating annotations with the distributed annotation system and the automated sequence annotation pipeline.

Speier, William; Ochs, Michael F.

Bioinformatics ; 28(21): 2858-9, 2012 Nov 01.

Article in English | MEDLINE | ID: mdl-22945787

ABSTRACT

SUMMARY: The integration between BioDAS ProServer and Automated Sequence Annotation Pipeline (ASAP) provides an interface for querying diverse annotation sources, chaining and linking results, and standardizing the output using the Distributed Annotation System (DAS) protocol. This interface allows pipeline plans in ASAP to be integrated into any system using HTTP and also allows the information returned by ASAP to be included in the DAS registry for use in any DAS-aware system. Three example implementations have been developed: the first accesses TRANSFAC information to automatically create gene sets for the Coordinated Gene Activity in Pattern Sets (CoGAPS) algorithm; the second integrates annotations from multiple array platforms and provides unified annotations in an R environment; and the third wraps the UniProt database for integration with the SPICE DAS client. AVAILABILITY: Source code for ASAP 2.7 and the DAS 1.6 interface is available under the GNU public license. Proserver 2.20 is free software available from SourceForge. Scripts for installation and configuration on Linux are provided at our website: http://www.rits.onc.jhmi.edu/dbb/custom/A6/

Subject(s)

Algorithms; Databases, Factual; Information Storage and Retrieval/methods; Molecular Sequence Annotation/methods; Computational Biology/methods; High-Throughput Nucleotide Sequencing; Knowledge Bases; Programming Languages; Proteins/chemistry; Software; User-Computer Interface

9.

Gene expression signatures modulated by epidermal growth factor receptor activation and their relationship to cetuximab resistance in head and neck squamous cell carcinoma.

Fertig, Elana J; Ren, Qing; Cheng, Haixia; Hatakeyama, Hiromitsu; Dicker, Adam P; Rodeck, Ulrich; Considine, Michael; Ochs, Michael F; Chung, Christine H.

BMC Genomics ; 13: 160, 2012 May 01.

Article in English | MEDLINE | ID: mdl-22549044

ABSTRACT

BACKGROUND: Aberrant activation of signaling pathways downstream of epidermal growth factor receptor (EGFR) has been hypothesized to be one of the mechanisms of cetuximab (a monoclonal antibody against EGFR) resistance in head and neck squamous cell carcinoma (HNSCC). To infer relevant and specific pathway activation downstream of EGFR from gene expression in HNSCC, we generated gene expression signatures using immortalized keratinocytes (HaCaT) subjected to ligand stimulation and transfected with EGFR, RELA/p65, or HRASVal12D. RESULTS: The gene expression patterns that distinguished the HaCaT variants and conditions were inferred using the Markov chain Monte Carlo (MCMC) matrix factorization algorithm Coordinated Gene Activity in Pattern Sets (CoGAPS). This approach inferred gene expression signatures with greater relevance to cell signaling pathway activation than the expression signatures inferred with standard linear models. Furthermore, the pathway signature generated using HaCaT-HRASVal12D further associated with the cetuximab treatment response in isogenic cetuximab-sensitive (UMSCC1) and -resistant (1CC8) cell lines. CONCLUSIONS: Our data suggest that the CoGAPS algorithm can generate gene expression signatures that are pertinent to downstream effects of receptor signaling pathway activation and potentially be useful in modeling resistance mechanisms to targeted therapies.

Subject(s)

Antibodies, Monoclonal/pharmacology; Carcinoma, Squamous Cell/metabolism; ErbB Receptors/metabolism; Head and Neck Neoplasms/metabolism; Algorithms; Antibodies, Monoclonal, Humanized; Cell Line, Tumor; Cetuximab; Drug Resistance, Neoplasm/genetics; ErbB Receptors/genetics; Humans; Keratinocytes/cytology; Keratinocytes/drug effects; Keratinocytes/metabolism; Protein Binding/drug effects; Signal Transduction/drug effects

10.

Knowledge-based data analysis comes of age.

Ochs, Michael F.

Brief Bioinform ; 11(1): 30-9, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19854753

ABSTRACT

The emergence of high-throughput technologies for measuring biological systems has introduced problems for data interpretation that must be addressed for proper inference. First, analysis techniques need to be matched to the biological system, reflecting in their mathematical structure the underlying behavior being studied. When this is not done, mathematical techniques will generate answers, but the values and reliability estimates may not accurately reflect the biology. Second, analysis approaches must address the vast excess in variables measured (e.g. transcript levels of genes) over the number of samples (e.g. tumors, time points), known as the 'large-p, small-n' problem. In large-p, small-n paradigms, standard statistical techniques generally fail, and computational learning algorithms are prone to overfit the data. Here we review the emergence of techniques that match mathematical structure to the biology, the use of integrated data and prior knowledge to guide statistical analysis, and the recent emergence of analysis approaches utilizing simple biological models. We show that novel biological insights have been gained using these techniques.

Subject(s)

Systems Biology; Bayes Theorem; Genome-Wide Association Study; Oligonucleotide Array Sequence Analysis; Quantitative Trait Loci

11.

CoGAPS: an R/C++ package to identify patterns and biological process activity in transcriptomic data.

Fertig, Elana J; Ding, Jie; Favorov, Alexander V; Parmigiani, Giovanni; Ochs, Michael F.

Bioinformatics ; 26(21): 2792-3, 2010 Nov 01.

Article in English | MEDLINE | ID: mdl-20810601

ABSTRACT

SUMMARY: Coordinated Gene Activity in Pattern Sets (CoGAPS) provides an integrated package for isolating gene expression driven by a biological process, enhancing inference of biological processes from transcriptomic data. CoGAPS improves on other enrichment measurement methods by combining a Markov chain Monte Carlo (MCMC) matrix factorization algorithm (GAPS) with a threshold-independent statistic inferring activity on gene sets. The software is provided as open source C++ code built on top of JAGS software with an R interface. AVAILABILITY: The R package CoGAPS and the C++ package GAPS-JAGS are provided open source under the GNU Lesser General Public License (LGPL) with a user's manual containing installation and operating instructions. CoGAPS is available through Bioconductor and depends on the rjags package available through CRAN to interface CoGAPS with GAPS-JAGS. URL: http://www.cancerbiostats.onc.jhmi.edu/cogaps.cfm .

Subject(s)

Gene Expression; Genomics/methods; Oligonucleotide Array Sequence Analysis/methods; Software; Computational Biology/methods; Gene Expression Profiling; Markov Chains

12.

Many ways to land upright: novel righting strategies allow spotted lanternfly nymphs to land on diverse substrates.

Kane, Suzanne Amador; Bien, Theodore; Contreras-Orendain, Luis; Ochs, Michael F; Tonia Hsieh, S.

J R Soc Interface ; 18(181): 20210367, 2021 08.

Article in English | MEDLINE | ID: mdl-34376093

ABSTRACT

Unlike large animals, insects and other very small animals are so unsusceptible to impact-related injuries that they can use falling for dispersal and predator evasion. Reorienting to land upright can mitigate lost access to resources and predation risk. Such behaviours are critical for the spotted lanternfly (SLF) (Lycorma delicatula), an invasive, destructive insect pest spreading rapidly in the USA. High-speed video of SLF nymphs released under different conditions showed that these insects self-right using both active midair righting motions previously reported for other insects and novel post-impact mechanisms that take advantage of their ability to experience near-total energy loss on impact. Unlike during terrestrial self-righting, in which an animal initially at rest on its back uses appendage motions to flip over, SLF nymphs impacted the surface at varying angles and then self-righted during the rebound using coordinated body rotations, foot-substrate adhesion and active leg motions. These previously unreported strategies were found to promote disproportionately upright, secure landings on both hard, flat surfaces and tilted, compliant host plant leaves. Our results highlight the importance of examining biomechanical phenomena in ecologically relevant contexts, and show that, for small animals, the post-impact bounce period can be critical for achieving an upright landing.

Subject(s)

Hemiptera; Animals; Extremities; Insects; Movement

13.

Correction: Novel Insight into Mutational Landscape of Head and Neck Squamous Cell Carcinoma.

Gaykalova, Daria A; Mambo, Elizabeth; Choudhary, Ashish; Houghton, Jeffery; Buddavarapu, Kalyan; Sanford, Tiffany; Darden, Will; Adai, Alex; Hadd, Andrew; Latham, Gary; Danilova, Ludmila V; Bishop, Justin; Li, Ryan J; Westra, William H; Hennessey, Patrick; Koch, Wayne M; Ochs, Michael F; Califano, Joseph A; Sun, Wenyue.

PLoS One ; 15(5): e0233409, 2020.

Article in English | MEDLINE | ID: mdl-32401780

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0093102.].

14.

Information systems for cancer research.

Ochs, Michael F; Casagrande, John T.

Cancer Invest ; 26(10): 1060-7, 2008 Dec.

Article in English | MEDLINE | ID: mdl-19093263

ABSTRACT

The last decade has seen a massive growth in data for cancer research, with high-throughput technologies joining clinical trials as major drivers of informatics needs. These data provide opportunities for developing new cancer treatments, but also major challenges for informatics, and we summarize the systems needed and potential issues arising in addressing these challenges. Integrating these data into the research enterprise will require investments in (1) data capture and management, (2) data analysis, (3) data integration standards, (4) visualization tools, and (5) methods for integration with other enterprise systems.

Subject(s)

Information Systems/statistics & numerical data; Neoplasms/therapy; Research/trends; Clinical Trials as Topic; Computational Biology; Humans; Information Systems/organization & administration; Information Systems/trends; Language; Medical Informatics; Research Design; Science/methods; Science/trends; Systems Biology

15.

Estimating gene function with least squares nonnegative matrix factorization.

Wang, Guoli; Ochs, Michael F.

Methods Mol Biol ; 408: 35-47, 2007.

Article in English | MEDLINE | ID: mdl-18314576

ABSTRACT

Nonnegative matrix factorization is a machine learning algorithm that has extracted information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation with the method for linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties on the mRNA levels for each gene in each condition, to guide the algorithm to a local minimum in normalized χ², rather than a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.

Subject(s)

Algorithms; Genetic Techniques/statistics & numerical data; Artificial Intelligence; Cluster Analysis; Computer Simulation; Humans; Least-Squares Analysis; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Pattern Recognition, Automated/statistics & numerical data; User-Computer Interface

16.

Incorporation of gene ontology annotations to enhance microarray data analysis.

Ochs, Michael F; Peterson, Aidan J; Kossenkov, Andrew; Bidaut, Ghislain.

Methods Mol Biol ; 377: 243-54, 2007.

Article in English | MEDLINE | ID: mdl-17634621

ABSTRACT

Typical microarray or GeneChip experiments now provide genome-wide measurements on gene expression across many conditions. Analysis often focuses on only a few of the genes, looking for those that are "differentially expressed" between conditions or groups of conditions. However, the large number of measurements both presents statistical problems for such single-gene approaches and offers a tremendous amount of information for methods focused on biological processes rather than individual genes. Here we provide a method to utilize biological annotations in the form of gene ontologies to interpret the results of individual or multiple pattern recognition analyses of a microarray experiment.

Subject(s)

Data Interpretation, Statistical; Genes; Microarray Analysis/methods; Molecular Biology/methods; Animals; Cluster Analysis; Gene Expression; Genome; Humans; Pattern Recognition, Automated

17.

Determining transcription factor activity from microarray data using Bayesian Markov chain Monte Carlo sampling.

Kossenkov, Andrew V; Peterson, Aidan J; Ochs, Michael F.

Stud Health Technol Inform ; 129(Pt 2): 1250-4, 2007.

Article in English | MEDLINE | ID: mdl-17911915

ABSTRACT

Many biological processes rely on remodeling of the transcriptional response of cells through activation of transcription factors. Although determination of the activity level of transcription factors from microarray data can provide insight into developmental and disease processes, it requires careful analysis because of the multiple regulation of genes. We present a novel approach that handles both the assignment of genes to multiple patterns, as required by multiple regulation, and the linking of genes in prior probability distributions according to their known transcriptional regulators. We demonstrate the power of this approach in simulations and by application to yeast cell cycle and deletion mutant data. The results of simulations in the presence of increasing noise showed improved recovery of patterns in terms of χ² fit. Analysis of the yeast data led to improved inference of biologically meaningful groups in comparison to other techniques, as demonstrated with ROC analysis. The new algorithm provides an approach for estimating the levels of transcription factor activity from microarray data, and therefore provides insights into biological response.

Subject(s)

Algorithms; Gene Expression Regulation; Oligonucleotide Array Sequence Analysis; Transcription Factors/metabolism; Bayes Theorem; Computational Biology; Markov Chains; Models, Genetic; Monte Carlo Method; ROC Curve; Transcription, Genetic; Yeasts/genetics

18.

Integrative computational analysis of transcriptional and epigenetic alterations implicates DTX1 as a putative tumor suppressor gene in HNSCC.

Gaykalova, Daria A; Zizkova, Veronika; Guo, Theresa; Tiscareno, Ilse; Wei, Yingying; Vatapalli, Rajita; Hennessey, Patrick T; Ahn, Julie; Danilova, Ludmila; Khan, Zubair; Bishop, Justin A; Gutkind, J Silvio; Koch, Wayne M; Westra, William H; Fertig, Elana J; Ochs, Michael F; Califano, Joseph A.

Oncotarget ; 8(9): 15349-15363, 2017 Feb 28.

Article in English | MEDLINE | ID: mdl-28146432

ABSTRACT

Over half a million new cases of Head and Neck Squamous Cell Carcinoma (HNSCC) are diagnosed annually worldwide; however, 5-year overall survival is only 50% for HNSCC patients. Recently, high-throughput technologies have accelerated the genome-wide characterization of HNSCC. However, comprehensive pipelines with statistical algorithms that account for HNSCC biology and perform independent confirmatory and functional validation of candidates are needed to identify the most biologically relevant genes. We applied outlier statistics to high-throughput gene expression data, and identified 76 top-scoring candidates with significant differential expression in tumors compared to normal tissues. We identified 15 epigenetically regulated candidates by focusing on a subset of the genes with a negative correlation between gene expression and promoter methylation. Differential expression and methylation of 3 selected candidates (BANK1, BIN2, and DTX1) were confirmed in independent HNSCC cohorts from Johns Hopkins and TCGA (The Cancer Genome Atlas). We further performed functional evaluation of the NOTCH regulator DTX1, which was downregulated by promoter hypermethylation in tumors, and demonstrated that decreased expression of DTX1 in HNSCC tumors may be associated with NOTCH pathway activation and increased migration potential.

Subject(s)

Carcinoma, Squamous Cell/genetics; Epigenomics; Gene Expression Regulation, Neoplastic; Genes, Tumor Suppressor; Head and Neck Neoplasms/genetics; Ubiquitin-Protein Ligases/genetics; Adult; Aged; Aged, 80 and over; Carcinoma, Squamous Cell/pathology; Cell Line, Tumor; Cell Movement/genetics; Cluster Analysis; Cohort Studies; Computational Biology/methods; DNA Methylation; Female; Gene Expression Profiling/methods; Head and Neck Neoplasms/pathology; Humans; Male; Middle Aged; RNA Interference; Receptors, Notch/genetics; Reverse Transcriptase Polymerase Chain Reaction; Signal Transduction/genetics

19.

LS-NMF: a modified non-negative matrix factorization algorithm utilizing uncertainty estimates.

Wang, Guoli; Kossenkov, Andrew V; Ochs, Michael F.

BMC Bioinformatics ; 7: 175, 2006 Mar 28.

Article in English | MEDLINE | ID: mdl-16569230

ABSTRACT

BACKGROUND: Non-negative matrix factorisation (NMF), a machine learning algorithm, has been applied to the analysis of microarray data. A key feature of NMF is the ability to identify patterns that together explain the data as a linear combination of expression signatures. Microarray data generally include individual estimates of uncertainty for each gene in each condition; however, NMF does not exploit this information. Previous work has shown that such uncertainties can be extremely valuable for pattern recognition. RESULTS: We have created a new algorithm, least squares non-negative matrix factorization (LS-NMF), which integrates uncertainty measurements of gene expression data into the NMF updating rules. While the LS-NMF algorithm maintains the advantages of the original NMF algorithm, such as easy implementation and a guaranteed locally optimal solution, its performance in terms of linking functionally related genes has been improved. LS-NMF exceeds NMF significantly in terms of identifying functionally related genes as determined from annotations in the MIPS database. CONCLUSION: Uncertainty measurements on gene expression data provide valuable information for data analysis, and use of this information in the LS-NMF algorithm significantly improves the power of the NMF technique.
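The idea of folding per-measurement uncertainties into the NMF updating rules can be illustrated with a small sketch. The abstract does not give the update equations, so the code below assumes a chi-square objective, sum(((D - WH) / sigma)^2), and uses the standard weighted multiplicative updates for it; the function names (`weighted_nmf`, `chi2`) and the toy data are hypothetical and are not taken from the LS-NMF paper or its software.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def weighted_nmf(data, sigma, rank, n_iter=200, seed=0, eps=1e-12):
    """Approximate data by W @ H with non-negative W and H.

    Assumed objective: chi2 = sum(((data - W @ H) / sigma) ** 2), so entries
    with large uncertainty sigma influence the fit less.
    """
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    # Per-entry weights 1 / sigma^2, and the weighted data matrix.
    wt = [[1.0 / (s * s) for s in row] for row in sigma]
    wd = [[wt[i][j] * data[i][j] for j in range(m)] for i in range(n)]

    def weighted_model():
        wh = matmul(w, h)
        return [[wt[i][j] * wh[i][j] for j in range(m)] for i in range(n)]

    for _ in range(n_iter):
        # H <- H * (W^T (wt * D)) / (W^T (wt * WH)), element-wise.
        num = matmul(transpose(w), wd)
        den = matmul(transpose(w), weighted_model())
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * ((wt * D) H^T) / ((wt * WH) H^T), element-wise.
        num = matmul(wd, transpose(h))
        den = matmul(weighted_model(), transpose(h))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def chi2(data, sigma, w, h):
    """Uncertainty-weighted squared error of the factorization."""
    wh = matmul(w, h)
    return sum(((data[i][j] - wh[i][j]) / sigma[i][j]) ** 2
               for i in range(len(data)) for j in range(len(data[0])))


# Toy example: 4 "genes" x 3 "conditions", with a noisier third condition.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [3.0, 5.0, 7.0]]
sigma = [[0.1, 0.1, 0.5]] * 4
w, h = weighted_nmf(data, sigma, rank=2)
print("chi2 after fitting:", chi2(data, sigma, w, h))
```

Setting every entry of `sigma` to 1 reduces these updates to the ordinary least-squares NMF multiplicative rules, which makes it easy to compare the two on the same data.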

Subject(s)

Algorithms; Databases, Genetic; Oligonucleotide Array Sequence Analysis; Uncertainty; Oligonucleotide Array Sequence Analysis/methods; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Pattern Recognition, Automated/methods; RNA, Messenger/genetics

20.

Determination of strongly overlapping signaling activity from microarray data.

Bidaut, Ghislain; Suhre, Karsten; Claverie, Jean-Michel; Ochs, Michael F.

BMC Bioinformatics ; 7: 99, 2006 Feb 28.

Article in English | MEDLINE | ID: mdl-16507110

ABSTRACT

BACKGROUND: As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target identification. Microarrays provide a global measure of cellular response; however, linking these responses to signaling pathways requires an analytic approach tuned to the underlying biology. An ongoing issue in pattern recognition in microarrays has been how to determine the number of patterns (or clusters) to use for data interpretation, and this is a critical issue as measures of statistical significance in gene ontology or pathways rely on proper separation of genes into groups. RESULTS: Here we introduce a method relying on gene annotation coupled to decompositional analysis of global gene expression data that allows us to estimate specific activity on strongly coupled signaling pathways and, in some cases, activity of specific signaling proteins. We demonstrate the technique using the Rosetta yeast deletion mutant data set, decompositional analysis by Bayesian Decomposition, and annotation analysis using ClutrFree. We determined from measurements of gene persistence in patterns across multiple potential dimensionalities that 15 basis vectors provide the correct dimensionality for interpreting the data. Using gene ontology and data on gene regulation in the Saccharomyces Genome Database, we identified the transcriptional signatures of several cellular processes in yeast, including cell wall creation, ribosomal disruption, chemical blocking of protein synthesis, and, critically, individual signatures of the strongly coupled mating and filamentation pathways. CONCLUSION: This work demonstrates that microarray data can provide downstream indicators of pathway activity either through use of gene ontology or transcription factor databases. This can be used to investigate the specificity and success of targeted therapeutics as well as to elucidate signaling activity in normal and disease processes.

Subject(s)

Gene Expression Profiling/methods; Models, Biological; Oligonucleotide Array Sequence Analysis/methods; Pattern Recognition, Automated/methods; Saccharomyces cerevisiae Proteins/metabolism; Signal Transduction/physiology; Transcription Factors/metabolism; Algorithms; Computer Simulation; Saccharomyces cerevisiae Proteins/genetics; Transcription Factors/genetics
