Results 1 - 19 of 19
1.
Mol Syst Biol ; 5: 312, 2009.
Article in English | MEDLINE | ID: mdl-19888207

ABSTRACT

This report provides a global view of how gene expression is affected by DNA replication. We analyzed synchronized cultures of Saccharomyces cerevisiae under conditions that prevent DNA replication initiation without delaying cell cycle progression. We use a higher-order singular value decomposition to integrate the global mRNA expression measured in the multiple time courses, detect and remove experimental artifacts and identify significant combinations of patterns of expression variation across the genes, time points and conditions. We find that, first, approximately 88% of the global mRNA expression is independent of DNA replication. Second, the requirement of DNA replication for efficient histone gene expression is independent of conditions that elicit DNA damage checkpoint responses. Third, origin licensing decreases the expression of genes with origins near their 3' ends, revealing that downstream origins can regulate the expression of upstream genes. This confirms previous predictions from mathematical modeling of a global causal coordination between DNA replication origin activity and mRNA expression, and shows that mathematical modeling of DNA microarray data can be used to correctly predict previously unknown biological modes of regulation.


Subjects
DNA Replication/genetics , Gene Expression Regulation, Fungal , Replication Origin/genetics , Saccharomyces cerevisiae/genetics , Genes, Fungal , RNA, Messenger/genetics , RNA, Messenger/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Time Factors
2.
Proc Natl Acad Sci U S A ; 104(47): 18371-6, 2007 Nov 20.
Article in English | MEDLINE | ID: mdl-18003902

ABSTRACT

We describe the use of a higher-order singular value decomposition (HOSVD) in transforming a data tensor of genes × "x-settings" (that is, different settings of the experimental variable x) × "y-settings," which tabulates DNA microarray data from different studies, to a "core tensor" of "eigenarrays" × "x-eigengenes" × "y-eigengenes." Reformulating this multilinear HOSVD such that it decomposes the data tensor into a linear superposition of all outer products of an eigenarray, an x- and a y-eigengene, that is, rank-1 "subtensors," we define the significance of each subtensor in terms of the fraction of the overall information in the data tensor that it captures. We illustrate this HOSVD with an integration of genome-scale mRNA expression data from three yeast cell cycle time courses, two of which are under exposure to either hydrogen peroxide or menadione. We find that significant subtensors represent independent biological programs or experimental phenomena. The picture that emerges suggests that the conserved genes YKU70, MRE11, AIF1, and ZWF1, and the processes of retrotransposition, apoptosis, and the oxidative pentose phosphate pathway that these genes are involved in, may play significant, yet previously unrecognized, roles in the differential effects of hydrogen peroxide and menadione on cell cycle progression. A genome-scale correlation between DNA replication initiation and RNA transcription, which is equivalent to a recently discovered correlation and might be due to a previously unknown mechanism of regulation, is independently uncovered.


Subjects
DNA/genetics , Oligonucleotide Array Sequence Analysis/methods , Cell Cycle , DNA Replication/genetics , Gene Expression Profiling , Gene Expression Regulation, Fungal , Models, Genetic , Oxidative Stress , RNA, Messenger/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Time Factors
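
Illustrative sketch (not taken from the article): the HOSVD described in this abstract can be approximated in NumPy by taking an SVD of each mode unfolding of the data tensor and projecting the tensor onto the resulting bases. The tensor shape, the random data, and the helper names `unfold`, `mode_product`, and `hosvd` are assumptions made for illustration. The significance of each rank-1 subtensor is taken here to be its squared core entry divided by the total, which is one reading of "fraction of the overall information."

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor so that rows index the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `matrix` into `tensor` along axis `mode`."""
    return np.moveaxis(np.tensordot(matrix, tensor, axes=(1, mode)), 0, mode)

def hosvd(tensor):
    """Return the core tensor and one orthonormal basis per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each basis
    return core, factors

# Hypothetical data: 50 genes x 6 x-settings x 3 y-settings.
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 6, 3))

core, factors = hosvd(data)

# Fraction of the total squared signal carried by each rank-1 subtensor.
fractions = core**2 / np.sum(core**2)

# Superposing all subtensors recovers the data tensor.
recon = core
for mode, u in enumerate(factors):
    recon = mode_product(recon, u, mode)
```

Sorting `fractions` in descending order gives a ranking of the subtensors by this measure.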
3.
APL Bioeng ; 4(2): 026106, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32478280

ABSTRACT

Modeling of genomic profiles from the Cancer Genome Atlas (TCGA) by using recently developed mathematical frameworks has associated a genome-wide pattern of DNA copy-number alterations with a shorter, roughly one-year, median survival time in glioblastoma (GBM) patients. Here, to experimentally test this relationship, we whole-genome sequenced DNA from tumor samples of patients. We show that the patients represent the U.S. adult GBM population in terms of most normal and disease phenotypes. Intratumor heterogeneity affects ≈11% and profiling technology and reference human genome specifics affect <1% of the classifications of the tumors by the pattern, where experimental batch effects normally reduce the reproducibility, i.e., precision, of classifications based upon between one and a few hundred genomic loci by >30%. With a 2.25-year Kaplan-Meier median survival difference, a 3.5 univariate Cox hazard ratio, and a 0.78 concordance index, i.e., accuracy, the pattern predicts survival better than and independent of age at diagnosis, which has been the best indicator since 1950. The prognostic classification by the pattern may, therefore, help to manage GBM pseudoprogression. The diagnostic classification may help drugs progress to regulatory approval. The therapeutic predictions, of previously unrecognized targets that are correlated with survival, may lead to new drugs. Other methods missed this relationship in the roughly 3B-nucleotide genomes of the small, order of magnitude of 100, patient cohorts, e.g., from TCGA. Previous attempts to associate GBM genotypes with patient phenotypes were unsuccessful. This is a proof of principle that the frameworks are uniquely suitable for discovering clinically actionable genotype-phenotype relationships.

4.
APL Bioeng ; 3(3): 036104, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31463421

ABSTRACT

More than a quarter of lung, uterine, and ovarian adenocarcinoma (LUAD, USEC, and OV) tumors are resistant to platinum drugs. Only recently and only in OV, patterns of copy-number alterations that predict survival in response to platinum were discovered, and only by using the tensor GSVD to compare Agilent microarray platform-matched profiles of patient-matched normal and primary tumor DNA. Here, we use the GSVD to compare whole-genome sequencing (WGS) and Affymetrix microarray profiles of patient-matched normal and primary LUAD, USEC, and OV tumor DNA. First, the GSVD uncovers patterns similar to one Agilent OV pattern, where a loss of most of the chromosome arm 6p combined with a gain of 12p encode for transformation. Like the Agilent OV pattern, the WGS LUAD and Affymetrix LUAD, USEC, and OV patterns are correlated with shorter survival, in general and in response to platinum. Like the tensor GSVD, the GSVD separates these tumor-exclusive genotypes from experimental inconsistencies. Second, by identifying the shorter survival phenotypes among the WGS- and Affymetrix-profiled tumors, the Agilent pattern proves to be a technology-independent predictor of survival, independent also of the best other indicator at diagnosis, i.e., stage. Third, like no other indicator, the pattern predicts the overall survival of OV patients experiencing progression-free survival, in general and in response to platinum. We conclude that comparative spectral decompositions, such as the GSVD and tensor GSVD, underlie a mathematically universal description of the relationships between a primary tumor's genotype and a patient's overall survival phenotype, which other methods miss.

5.
APL Bioeng ; 2(3)2018 Sep.
Article in English | MEDLINE | ID: mdl-30397684

ABSTRACT

DNA alterations have been observed in astrocytoma for decades. A copy-number genotype predictive of a survival phenotype was only discovered by using the generalized singular value decomposition (GSVD) formulated as a comparative spectral decomposition. Here, we use the GSVD to compare whole-genome sequencing (WGS) profiles of patient-matched astrocytoma and normal DNA. First, the GSVD uncovers a genome-wide pattern of copy-number alterations, which is bounded by patterns recently uncovered by the GSVDs of microarray-profiled patient-matched glioblastoma (GBM) and, separately, lower-grade astrocytoma and normal genomes. Like the microarray patterns, the WGS pattern is correlated with an approximately one-year median survival time. By filling in gaps in the microarray patterns, the WGS pattern reveals that this biologically consistent genotype encodes for transformation via the Notch together with the Ras and Shh pathways. Second, like the GSVDs of the microarray profiles, the GSVD of the WGS profiles separates the tumor-exclusive pattern from normal copy-number variations and experimental inconsistencies. These include the WGS technology-specific effects of guanine-cytosine content variations across the genomes that are correlated with experimental batches. Third, by identifying the biologically consistent phenotype among the WGS-profiled tumors, the GBM pattern proves to be a technology-independent predictor of survival and response to chemotherapy and radiation, statistically better than the patient's age and tumor's grade, the best other indicators, and MGMT promoter methylation and IDH1 mutation. We conclude that by using the complex structure of the data, comparative spectral decompositions underlie a mathematically universal description of the genotype-phenotype relations in cancer that other methods miss.

6.
Methods Mol Biol ; 377: 17-60, 2007.
Article in English | MEDLINE | ID: mdl-17634608

ABSTRACT

DNA microarrays make it possible, for the first time, to record the complete genomic signals that guide the progression of cellular processes. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment, and drug development. This chapter reviews the first data-driven models that were created from these genome-scale data, through adaptations and generalizations of mathematical frameworks from matrix algebra that have proven successful in describing the physical world, in such diverse areas as mechanics and perception: the singular value decomposition model, the generalized singular value decomposition comparative model, and the pseudoinverse projection integrative model. These models provide mathematical descriptions of the genetic networks that generate and sense the measured data, where the mathematical variables and operations represent biological reality. The variables, patterns uncovered in the data, correlate with activities of cellular elements such as regulators or transcription factors that drive the measured signals and cellular states where these elements are active. The operations, such as data reconstruction, rotation, and classification in subspaces of selected patterns, simulate experimental observation of only the cellular programs that these patterns represent. These models are illustrated in the analyses of RNA expression data from yeast and human during their cell cycle programs and DNA-binding data from yeast cell cycle transcription factors and replication initiation proteins. Two alternative pictures of RNA expression oscillations during the cell cycle that emerge from these analyses, which parallel well-known designs of physical oscillators, convey the capacity of the models to elucidate the design principles of cellular systems, as well as guide the design of synthetic ones. In these analyses, the power of the models to predict previously unknown biological principles is demonstrated with a prediction of a novel mechanism of regulation that correlates DNA replication initiation with cell cycle-regulated RNA transcription in yeast. These models may become the foundation of a future in which biological systems are modeled as physical systems are today.


Subjects
Genome , Models, Biological , Models, Genetic , Models, Statistical , Models, Theoretical , Signal Transduction , Algorithms , Cell Cycle , Cell Physiological Phenomena , Computer Simulation , DNA, Fungal , Genome, Fungal , Humans , RNA, Fungal/genetics , RNA, Fungal/metabolism , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
7.
PLoS One ; 11(10): e0164546, 2016.
Article in English | MEDLINE | ID: mdl-27798635

ABSTRACT

We use the generalized singular value decomposition (GSVD), formulated as a comparative spectral decomposition, to model patient-matched grades III and II, i.e., lower-grade astrocytoma (LGA) brain tumor and normal DNA copy-number profiles. A genome-wide tumor-exclusive pattern of DNA copy-number alterations (CNAs) is revealed, encompassed in that previously uncovered in glioblastoma (GBM), i.e., grade IV astrocytoma, where GBM-specific CNAs encode for enhanced opportunities for transformation and proliferation via growth and developmental signaling pathways in GBM relative to LGA. The GSVD separates the LGA pattern from other sources of biological and experimental variation, common to both, or exclusive to one of the tumor and normal datasets. We find, first, and computationally validate, that the LGA pattern is correlated with a patient's survival and response to treatment. Second, the GBM pattern identifies among the LGA patients a subtype, statistically indistinguishable from that among the GBM patients, where the CNA genotype is correlated with an approximately one-year survival phenotype. Third, cross-platform classification of the Affymetrix-measured LGA and GBM profiles by using the Agilent-derived GBM pattern shows that the GBM pattern is a platform-independent predictor of astrocytoma outcome. Statistically, the pattern is a better predictor (corresponding to greater median survival time difference, proportional hazard ratio, and concordance index) than the patient's age and the tumor's grade, which are the best indicators of astrocytoma currently in clinical use, and laboratory tests. The pattern is also statistically independent of these indicators, and, combined with either one, is an even better predictor of astrocytoma outcome. Recurring DNA CNAs have been observed in astrocytoma tumors' genomes for decades, however, copy-number subtypes that are predictive of patients' outcomes were not identified before. 
This is despite the growing number of datasets recording different aspects of the disease, and due to an existing fundamental need for mathematical frameworks that can simultaneously find similarities and dissimilarities across the datasets. This illustrates the ability of comparative spectral decompositions to find what other methods miss.


Subjects
Astrocytoma/genetics , Astrocytoma/mortality , DNA Copy Number Variations , Genome-Wide Association Study , Algorithms , Astrocytoma/diagnosis , Astrocytoma/therapy , Computational Biology/methods , DNA Methylation , DNA Modification Methylases/genetics , DNA Repair Enzymes/genetics , Datasets as Topic , Humans , Isocitrate Dehydrogenase/genetics , Kaplan-Meier Estimate , Mutation , Neoplasm Grading/methods , Oligonucleotide Array Sequence Analysis , Prognosis , Promoter Regions, Genetic , Treatment Outcome , Tumor Suppressor Proteins/genetics
8.
PLoS One ; 10(4): e0121396, 2015.
Article in English | MEDLINE | ID: mdl-25875127

ABSTRACT

The number of large-scale high-dimensional datasets recording different aspects of a single disease is growing, accompanied by a need for frameworks that can create one coherent model from multiple tensors of matched columns, e.g., patients and platforms, but independent rows, e.g., probes. We define and prove the mathematical properties of a novel tensor generalized singular value decomposition (GSVD), which can simultaneously find the similarities and dissimilarities, i.e., patterns of varying relative significance, between any two such tensors. We demonstrate the tensor GSVD in comparative modeling of patient- and platform-matched but probe-independent ovarian serous cystadenocarcinoma (OV) tumor, mostly high-grade, and normal DNA copy-number profiles, across each chromosome arm, and combination of two arms, separately. The modeling uncovers previously unrecognized patterns of tumor-exclusive platform-consistent co-occurring copy-number alterations (CNAs). We find, first, and validate that each of the patterns across only 7p and Xq, and the combination of 6p+12p, is correlated with a patient's prognosis, is independent of the tumor's stage, the best predictor of OV survival to date, and together with stage makes a better predictor than stage alone. Second, these patterns include most known OV-associated CNAs that map to these chromosome arms, as well as several previously unreported, yet frequent focal CNAs. Third, differential mRNA, microRNA, and protein expression consistently map to the DNA CNAs. A coherent picture emerges for each pattern, suggesting roles for the CNAs in OV pathogenesis and personalized therapy. In 6p+12p, deletion of the p21-encoding CDKN1A and p38-encoding MAPK14 and amplification of RAD51AP1 and KRAS encode for human cell transformation, and are correlated with a cell's immortality, and a patient's shorter survival time. In 7p, RPA3 deletion and POLD2 amplification are correlated with DNA stability, and a longer survival. 
In Xq, PABPC5 deletion and BCAP31 amplification are correlated with a cellular immune response, and a longer survival.


Subjects
Cystadenocarcinoma, Serous/genetics , DNA Copy Number Variations/genetics , Models, Theoretical , Neoplasms, Glandular and Epithelial/genetics , Ovarian Neoplasms/genetics , Prognosis , Carcinoma, Ovarian Epithelial , Cell Transformation, Neoplastic/genetics , Chromosome Mapping , Chromosomes/genetics , Cystadenocarcinoma, Serous/diagnosis , Cystadenocarcinoma, Serous/pathology , Female , Gene Expression Regulation, Neoplastic , Humans , MicroRNAs/biosynthesis , Mutation , Neoplasm Proteins/biosynthesis , Neoplasms, Glandular and Epithelial/diagnosis , Neoplasms, Glandular and Epithelial/pathology , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/pathology , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Survival Analysis
9.
PLoS One ; 8(11): e78913, 2013.
Article in English | MEDLINE | ID: mdl-24282503

ABSTRACT

To search for evolutionary forces that might act upon transcript length, we use the singular value decomposition (SVD) to identify the length distribution functions of sets and subsets of human and yeast transcripts from profiles of mRNA abundance levels across gel electrophoresis migration distances that were previously measured by DNA microarrays. We show that the SVD identifies the transcript length distribution functions as "asymmetric generalized coherent states" from the DNA microarray data and with no a-priori assumptions. Comparing subsets of human and yeast transcripts of the same gene ontology annotations, we find that in both disparate eukaryotes, transcripts involved in protein synthesis or mitochondrial metabolism are significantly shorter than typical, and in particular, significantly shorter than those involved in glucose metabolism. Comparing the subsets of human transcripts that are overexpressed in glioblastoma multiforme (GBM) or normal brain tissue samples from The Cancer Genome Atlas, we find that GBM maintains normal brain overexpression of significantly short transcripts, enriched in transcripts that are involved in protein synthesis or mitochondrial metabolism, but suppresses normal overexpression of significantly longer transcripts, enriched in transcripts that are involved in glucose metabolism and brain activity. These global relations among transcript length, cellular metabolism and tumor development suggest a previously unrecognized physical mode for tumor and normal cells to differentially regulate metabolism in a transcript length-dependent manner. The identified distribution functions support a previous hypothesis from mathematical modeling of evolutionary forces that act upon transcript length in the manner of the restoring force of the harmonic oscillator.


Subjects
Brain Neoplasms/genetics , Evolution, Molecular , Glioblastoma/genetics , RNA, Messenger/chemistry , Saccharomyces cerevisiae/genetics , Glioblastoma/metabolism , Humans , Oligonucleotide Array Sequence Analysis , RNA, Messenger/metabolism
10.
PLoS One ; 7(1): e30098, 2012.
Article in English | MEDLINE | ID: mdl-22291905

ABSTRACT

Despite recent large-scale profiling efforts, the best prognostic predictor of glioblastoma multiforme (GBM) remains the patient's age at diagnosis. We describe a global pattern of tumor-exclusive co-occurring copy-number alterations (CNAs) that is correlated, possibly coordinated with GBM patients' survival and response to chemotherapy. The pattern is revealed by GSVD comparison of patient-matched but probe-independent GBM and normal aCGH datasets from The Cancer Genome Atlas (TCGA). We find that, first, the GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern copy-number variations (CNVs) that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations (e.g., in tissue batch, genomic center, hybridization date and scanner), without a-priori knowledge of these variations. Second, the pattern includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported CNAs in >3% of the patients. These include the biochemically putative drug target, cell cycle-regulated serine/threonine kinase-encoding TLK2, the cyclin E1-encoding CCNE1, and the Rb-binding histone demethylase-encoding KDM5A. Third, the pattern provides a better prognostic predictor than the chromosome numbers or any one focal CNA that it identifies, suggesting that the GBM survival phenotype is an outcome of its global genotype. The pattern is independent of age, and combined with age, makes a better predictor than age alone. GSVD comparison of matched profiles of a larger set of TCGA patients, inclusive of the initial set, confirms the global pattern. GSVD classification of the GBM profiles of an independent set of patients validates the prognostic contribution of the pattern.


Subjects
Brain Neoplasms/genetics , Brain Neoplasms/mortality , DNA Copy Number Variations/physiology , Glioblastoma/genetics , Glioblastoma/mortality , Adult , Aged , Aged, 80 and over , Brain Neoplasms/diagnosis , Brain Neoplasms/pathology , Case-Control Studies , Cluster Analysis , Comparative Genomic Hybridization/methods , Comparative Genomic Hybridization/statistics & numerical data , DNA Copy Number Variations/genetics , Data Interpretation, Statistical , Female , Glioblastoma/diagnosis , Glioblastoma/pathology , Humans , Male , Matched-Pair Analysis , Middle Aged , Molecular Diagnostic Techniques , Prognosis , Survival Analysis , Validation Studies as Topic
11.
PLoS One ; 6(4): e18768, 2011.
Article in English | MEDLINE | ID: mdl-21625625

ABSTRACT

Evolutionary relationships among organisms are commonly described by using a hierarchy derived from comparisons of ribosomal RNA (rRNA) sequences. We propose that even on the level of a single rRNA molecule, an organism's evolution is composed of multiple pathways due to concurrent forces that act independently upon different rRNA degrees of freedom. Relationships among organisms are then compositions of coexisting pathway-dependent similarities and dissimilarities, which cannot be described by a single hierarchy. We computationally test this hypothesis in comparative analyses of 16S and 23S rRNA sequence alignments by using a tensor decomposition, i.e., a framework for modeling composite data. Each alignment is encoded in a cuboid, i.e., a third-order tensor, where nucleotides, positions and organisms, each represent a degree of freedom. A tensor mode-1 higher-order singular value decomposition (HOSVD) is formulated such that it separates each cuboid into combinations of patterns of nucleotide frequency variation across organisms and positions, i.e., "eigenpositions" and corresponding nucleotide-specific segments of "eigenorganisms," respectively, independent of a-priori knowledge of the taxonomic groups or rRNA structures. We find, in support of our hypothesis that, first, the significant eigenpositions reveal multiple similarities and dissimilarities among the taxonomic groups. Second, the corresponding eigenorganisms identify insertions or deletions of nucleotides exclusively conserved within the corresponding groups, that map out entire substructures and are enriched in adenosines, unpaired in the rRNA secondary structure, that participate in tertiary structure interactions. This demonstrates that structural motifs involved in rRNA folding and function are evolutionary degrees of freedom. 
Third, two previously unknown coexisting subgenic relationships between Microsporidia and Archaea are revealed in both the 16S and 23S rRNA alignments, a convergence and a divergence, conferred by insertions and deletions of these motifs, which cannot be described by a single hierarchy. This shows that mode-1 HOSVD modeling of rRNA alignments might be used to computationally predict evolutionary mechanisms.


Subjects
Computational Biology/methods , Evolution, Molecular , RNA, Ribosomal/genetics , Repetitive Sequences, Nucleic Acid/genetics , Animals , Base Sequence , Molecular Sequence Data , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 23S/genetics
12.
PLoS One ; 6(12): e28072, 2011.
Article in English | MEDLINE | ID: mdl-22216090

ABSTRACT

The number of high-dimensional datasets recording multiple aspects of a single phenomenon is increasing in many areas of science, accompanied by a need for mathematical frameworks that can compare multiple large-scale matrices with different row dimensions. The only such framework to date, the generalized singular value decomposition (GSVD), is limited to two matrices. We mathematically define a higher-order GSVD (HO GSVD) for $N \geq 2$ matrices $D_i \in \mathbb{R}^{m_i \times n}$, each with full column rank. Each matrix is exactly factored as $D_i = U_i \Sigma_i V^T$, where $V$, identical in all factorizations, is obtained from the eigensystem $SV = V\Lambda$ of the arithmetic mean $S$ of all pairwise quotients $A_i A_j^{-1}$ of the matrices $A_i = D_i^T D_i$, $i \neq j$. We prove that this decomposition extends to higher orders almost all of the mathematical properties of the GSVD. The matrix $S$ is nondefective with $V$ and $\Lambda$ real. Its eigenvalues satisfy $\lambda_k \geq 1$. Equality holds if and only if the corresponding eigenvector $v_k$ is a right basis vector of equal significance in all matrices $D_i$ and $D_j$, that is $\sigma_{i,k}/\sigma_{j,k} = 1$ for all $i$ and $j$, and the corresponding left basis vector $u_{i,k}$ is orthogonal to all other vectors in $U_i$ for all $i$. The eigenvalues $\lambda_k = 1$, therefore, define the "common HO GSVD subspace." We illustrate the HO GSVD with a comparison of genome-scale cell-cycle mRNA expression from S. pombe, S. cerevisiae and human. Unlike existing algorithms, a mapping among the genes of these disparate organisms is not required. We find that the approximately common HO GSVD subspace represents the cell-cycle mRNA expression oscillations, which are similar among the datasets. Simultaneous reconstruction in the common subspace, therefore, removes the experimental artifacts, which are dissimilar, from the datasets. In the simultaneous sequence-independent classification of the genes of the three organisms in this common subspace, genes of highly conserved sequences but significantly different cell-cycle peak times are correctly classified.


Subjects
Models, Theoretical , RNA, Messenger/genetics , Cell Cycle , Humans , Saccharomyces cerevisiae/genetics
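
Illustrative sketch (not the authors' code): the construction stated in this abstract, in which V comes from the eigensystem of the mean S of all pairwise quotients and each matrix is then factored against that shared V, can be written in a few lines of NumPy. The function name `ho_gsvd`, the matrix shapes, and the random data are assumptions. Full column rank is assumed, as in the abstract, and the plain inverse is used for brevity rather than numerical robustness.

```python
import numpy as np
from itertools import permutations

def ho_gsvd(matrices):
    """Factor each D_i as U_i @ diag(sigma_i) @ V.T with a shared V."""
    grams = [d.T @ d for d in matrices]            # A_i = D_i^T D_i
    inverses = [np.linalg.inv(a) for a in grams]   # assumes full column rank
    n = len(matrices)
    # Arithmetic mean of all pairwise quotients A_i A_j^{-1}, i != j.
    s = sum(grams[i] @ inverses[j]
            for i, j in permutations(range(n), 2)) / (n * (n - 1))
    eigvals, v = np.linalg.eig(s)                  # S V = V Lambda
    eigvals, v = eigvals.real, v.real              # stated to be real
    v = v / np.linalg.norm(v, axis=0)              # unit-length columns
    us, sigmas = [], []
    for d in matrices:
        b = np.linalg.solve(v, d.T).T              # solves V B^T = D^T
        sigma = np.linalg.norm(b, axis=0)
        us.append(b / sigma)
        sigmas.append(sigma)
    return us, sigmas, v, eigvals

# Hypothetical datasets with different row counts and 4 shared columns.
rng = np.random.default_rng(4)
datasets = [rng.standard_normal((m, 4)) for m in (30, 25, 40)]
us, sigmas, v, eigvals = ho_gsvd(datasets)
rebuilt = [(u * sig) @ v.T for u, sig in zip(us, sigmas)]
```

Eigenvalues close to 1 would mark the approximately common subspace. With unrelated random matrices, as here, none is expected to be exactly 1.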
13.
Proc Natl Acad Sci U S A ; 103(32): 11828-33, 2006 Aug 08.
Article in English | MEDLINE | ID: mdl-16877539

ABSTRACT

We describe the singular value decomposition (SVD) of yeast genome-scale mRNA lengths distribution data measured by DNA microarrays. SVD uncovers in the mRNA abundance levels data matrix of genes × arrays, i.e., electrophoretic gel migration lengths or mRNA lengths, mathematically unique decorrelated and decoupled "eigengenes." The eigengenes are the eigenvectors of the arrays × arrays correlation matrix, with the corresponding series of eigenvalues proportional to the series of the "fractions of eigen abundance." Each fraction of eigen abundance indicates the significance of the corresponding eigengene relative to all others. We show that the eigengenes fit "asymmetric Hermite functions," a generalization of the eigenfunctions of the quantum harmonic oscillator and the integral transform whose kernel is a generalized coherent state. The fractions of eigen abundance fit a geometric series as do the eigenvalues of the integral transform whose kernel is a generalized coherent state. The "asymmetric generalized coherent state" models the measured data, where the profiles of mRNA abundance levels of most genes as well as the distribution of the peaks of these profiles fit asymmetric Gaussians. We hypothesize that the asymmetry in the distribution of the peaks of the profiles is due to two competing evolutionary forces. We show that the asymmetry in the profiles of the genes might be due to a previously unknown asymmetry in the gel electrophoresis thermal broadening of a moving, rather than a stationary, band of RNA molecules.


Subjects
Gene Expression Profiling , Genome, Fungal , Oligonucleotide Array Sequence Analysis/methods , RNA, Messenger/metabolism , Evolution, Molecular , Fungal Proteins/chemistry , Genes, Fungal , Models, Theoretical , Normal Distribution , RNA/chemistry , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
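
Illustrative sketch (not taken from the article): the relation stated in this abstract between eigengenes, the arrays × arrays correlation matrix, and the fractions of each eigengene can be checked numerically with NumPy. The matrix size and random data are assumptions. The fraction is computed here as each squared singular value divided by the sum of all squared singular values.

```python
import numpy as np

# Hypothetical abundance matrix: 200 genes x 12 arrays.
rng = np.random.default_rng(1)
data = rng.standard_normal((200, 12))

# Thin SVD: the rows of vt play the role of the eigengenes.
u, s, vt = np.linalg.svd(data, full_matrices=False)
eigengenes = vt

# Relative significance of each eigengene.
fractions = s**2 / np.sum(s**2)

# The eigengenes are eigenvectors of the arrays x arrays matrix,
# with eigenvalues equal to the squared singular values.
correlation = data.T @ data
lhs = correlation @ eigengenes[0]
rhs = s[0]**2 * eigengenes[0]
```

Here `lhs` and `rhs` agree to numerical precision, which is the eigenvector relation in the abstract restated for the leading eigengene.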
14.
Proc Natl Acad Sci U S A ; 102(49): 17559-64, 2005 Dec 06.
Article in English | MEDLINE | ID: mdl-16314560

ABSTRACT

We describe the use of the matrix eigenvalue decomposition (EVD) and pseudoinverse projection and a tensor higher-order EVD (HOEVD) in reconstructing the pathways that compose a cellular system from genome-scale nondirectional networks of correlations among the genes of the system. The EVD formulates a genes × genes network as a linear superposition of genes × genes decorrelated and decoupled rank-1 subnetworks, which can be associated with functionally independent pathways. The integrative pseudoinverse projection of a network computed from a "data" signal onto a designated "basis" signal approximates the network as a linear superposition of only the subnetworks that are common to both signals and simulates observation of only the pathways that are manifest in both experiments. We define a comparative HOEVD that formulates a series of networks as linear superpositions of decorrelated rank-1 subnetworks and the rank-2 couplings among these subnetworks, which can be associated with independent pathways and the transitions among them common to all networks in the series or exclusive to a subset of the networks. Boolean functions of the discretized subnetworks and couplings highlight differential, i.e., pathway-dependent, relations among genes. We illustrate the EVD, pseudoinverse projection, and HOEVD of genome-scale networks with analyses of yeast DNA microarray data.


Subjects
Computational Biology/methods , Genome, Fungal/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Computer Simulation , Gene Expression Regulation, Fungal , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Signal Transduction
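
Illustrative sketch (not taken from the article): the first step described in this abstract, the EVD of a symmetric genes × genes network into a superposition of rank-1 subnetworks, can be reproduced with NumPy. Building the network as the product of a signal matrix with its transpose is an assumption, as are the matrix sizes and the random data. The HOEVD and the rank-2 couplings are not covered here.

```python
import numpy as np

# Hypothetical expression signal: 30 genes x 8 arrays.
rng = np.random.default_rng(2)
signal = rng.standard_normal((30, 8))

# Symmetric genes x genes network of correlations (assumed construction).
network = signal @ signal.T

# EVD of a symmetric matrix; sort eigenpairs by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(network)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each eigenpair defines one rank-1 genes x genes subnetwork.
subnetworks = [lam * np.outer(vec, vec)
               for lam, vec in zip(eigvals, eigvecs.T)]

# The network is the linear superposition of its subnetworks.
rebuilt = sum(subnetworks)
```

Because the eigenvectors are orthonormal, the subnetworks are mutually decorrelated in the sense that the product of any two distinct ones is zero.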
15.
Proc Natl Acad Sci U S A ; 101(47): 16577-82, 2004 Nov 23.
Article in English | MEDLINE | ID: mdl-15545604

ABSTRACT

We describe an integrative data-driven mathematical framework that formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. By using pseudoinverse projection, the molecular biological profiles of the data samples are least-squares-approximated as superpositions of the basis profiles. Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis and gives a global picture of the correlations and possibly also causal coordination of these two sets of states. We illustrate this framework with an integration of yeast genome-scale proteins' DNA-binding data with cell cycle mRNA expression time course data. Novel correlation between DNA replication initiation and RNA transcription during the yeast cell cycle, which might be due to a previously unknown mechanism of regulation, is predicted.
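
The least-squares step described in this abstract can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' code: both data sets are taken to be matrices with the same genes as rows, and the function name `pseudoinverse_projection` and the random inputs are hypothetical.

```python
import numpy as np


def pseudoinverse_projection(data, basis):
    """Approximate each data column as a superposition of the basis columns.

    `data` is genes x data-samples and `basis` is genes x basis-profiles;
    both must share the same gene rows (an assumption of this sketch).
    """
    # The Moore-Penrose pseudoinverse gives the least-squares coefficients.
    coefficients = np.linalg.pinv(basis) @ data  # basis-profiles x data-samples
    reconstruction = basis @ coefficients        # genes x data-samples
    return coefficients, reconstruction


rng = np.random.default_rng(0)
basis = rng.standard_normal((8, 3))  # 8 genes, 3 basis profiles (made-up data)
data = rng.standard_normal((8, 5))   # 8 genes, 5 data samples (made-up data)
coefficients, reconstruction = pseudoinverse_projection(data, basis)

# What the basis cannot express is left in the residual.
residual = data - reconstruction
```

The reconstruction keeps only the part of each data profile that lies in the span of the basis profiles, which corresponds to the abstract's "observation of only the cellular states manifest in the data that correspond to those of the basis". The residual is orthogonal to every basis profile.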


Subjects
DNA Replication/genetics, Fungal Genome, Genetic Models, Fungal RNA/genetics, Saccharomyces cerevisiae/genetics, Cell Cycle, Fungal DNA/biosynthesis, Fungal DNA/genetics, Genetic Databases, Least-Squares Analysis, Mathematics, Protein Binding, Messenger RNA/genetics, Saccharomyces cerevisiae/cytology, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/metabolism, Genetic Transcription
16.
Proc Natl Acad Sci U S A ; 100(6): 3351-6, 2003 Mar 18.
Article in English | MEDLINE | ID: mdl-12631705

ABSTRACT

We describe a comparative mathematical framework for two genome-scale expression data sets. This framework formulates expression as superposition of the effects of regulatory programs, biological processes, and experimental artifacts common to both data sets, as well as those that are exclusive to one data set or the other, by using generalized singular value decomposition. This framework enables comparative reconstruction and classification of the genes and arrays of both data sets. We illustrate this framework with a comparison of yeast and human cell-cycle expression data sets.


Subjects
Gene Expression Profiling/statistics & numerical data, Genomics/statistics & numerical data, Cell Cycle/genetics, Statistical Data Interpretation, Genetic Databases, Fungal Genes/drug effects, Humans, Oligonucleotide Array Sequence Analysis/statistics & numerical data, Pheromones/pharmacology, Fungal RNA/genetics, Fungal RNA/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Saccharomyces cerevisiae/cytology, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Physiological Stress/genetics
17.
Proc Natl Acad Sci U S A ; 100(4): 1926-30, 2003 Feb 18.
Article in English | MEDLINE | ID: mdl-12571354

ABSTRACT

Analysis of the patterns of gene expression in follicular lymphomas from 24 patients suggested that two groups of tumors might be distinguished. All patients, whose biopsies were obtained before any treatment, were treated with rituximab, a monoclonal antibody directed against the B cell antigen, CD20. Gene expression patterns in the tumors that subsequently failed to respond to rituximab appeared more similar to those of normal lymphoid tissues than to gene expression patterns of tumors from rituximab responders. These findings suggest the possibility that the response of follicular lymphoma to rituximab treatment may be predicted from the gene expression pattern of tumors.


Subjects
Monoclonal Antibodies/therapeutic use, Antineoplastic Agents/therapeutic use, Gene Expression Profiling, Follicular Lymphoma/drug therapy, Adult, Aged, Murine-Derived Monoclonal Antibodies, Female, Humans, Follicular Lymphoma/genetics, Male, Middle Aged, Rituximab, Treatment Outcome
18.
Lancet ; 359(9314): 1301-7, 2002 Apr 13.
Article in English | MEDLINE | ID: mdl-11965276

ABSTRACT

BACKGROUND: Soft-tissue tumours are derived from mesenchymal cells such as fibroblasts, muscle cells, or adipocytes, but for many such tumours the histogenesis is controversial. We aimed to start molecular characterisation of these rare neoplasms and to do a genome-wide search for new diagnostic markers. METHODS: We analysed gene-expression patterns of 41 soft-tissue tumours with spotted cDNA microarrays. After removal of errors introduced by use of different microarray batches, the expression patterns of 5520 genes that were well defined were used to separate tumours into discrete groups by hierarchical clustering and singular value decomposition. FINDINGS: Synovial sarcomas, gastrointestinal stromal tumours, neural tumours, and a subset of the leiomyosarcomas, showed strikingly distinct gene-expression patterns. Other tumour categories--malignant fibrous histiocytoma, liposarcoma, and the remaining leiomyosarcomas--shared molecular profiles that were not predicted by histological features or immunohistochemistry. Strong expression of known genes, such as KIT in gastrointestinal stromal tumours, was noted within gene sets that distinguished the different sarcomas. However, many uncharacterised genes also contributed to the distinction between tumour types. INTERPRETATION: These results suggest a new method for classification of soft-tissue tumours, which could improve on the method based on histological findings. Large numbers of uncharacterised genes contributed to distinctions between the tumours, and some of these could be useful markers for diagnosis, have prognostic significance, or prove possible targets for treatment.


Subjects
Neoplastic Gene Expression Regulation/genetics, Oligonucleotide Array Sequence Analysis/methods, Sarcoma/genetics, Soft Tissue Neoplasms/genetics, Gene Expression Profiling, Humans, Sarcoma/classification, Sarcoma/pathology, Soft Tissue Neoplasms/classification, Soft Tissue Neoplasms/pathology