Results 1 - 20 of 25
1.
PLoS One ; 8(1): e54290, 2013.
Article in English | MEDLINE | ID: mdl-23335997

ABSTRACT

BACKGROUND: There is a growing appreciation of the role of proteolytic processes in human health and disease, but tools for analysis of such processes on a proteome-wide scale are limited. Furin is a ubiquitous proprotein convertase that cleaves after basic residues and transforms secretory proproteins into biologically active proteins. Despite this important role, many furin substrates remain unknown in the human proteome. METHODOLOGY/PRINCIPAL FINDINGS: We devised an approach for proteinase target identification that combines an in silico discovery pipeline with highly multiplexed proteinase activity assays. We performed in silico analysis of the human proteome and identified over 1,050 secretory proteins as potential furin substrates. We then used a multiplexed protease assay to validate these tentative targets. The assay was carried out on over 3,260 overlapping peptides designed to represent P7-P1' and P4-P4' positions of furin cleavage sites in the candidate proteins. The obtained results greatly increased our knowledge of the unique cleavage preferences of furin, revealed the importance of both short-range (P4-P1) and long-range (P7-P6) interactions in defining furin cleavage specificity, demonstrated that the R-X-R/K/X-R ↓ motif alone is insufficient for predicting furin proteolysis of the substrate, and identified ≈ 490 potential protein substrates of furin in the human proteome. CONCLUSIONS/SIGNIFICANCE: The assignment of these substrates to cellular pathways suggests an important role of furin in development, including axonal guidance, cardiogenesis, and maintenance of stem cell pluripotency. The novel approach proposed in this study can be readily applied to other proteinases.


Subject(s)
Furin/chemistry , Furin/metabolism , Proteome/metabolism , Amino Acid Motifs , Amino Acid Sequence , Databases, Protein , Humans , Models, Molecular , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Protein Interaction Maps , Protein Structure, Secondary , Proteolysis , Reproducibility of Results , Substrate Specificity
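The motif scan described in this abstract can be illustrated with a minimal sketch. It assumes the motif R-X-R/K/X-R reduces to Arg at P4 and P1 with any residue at P3 and P2; the function names and the toy sequence are hypothetical and are not taken from the study's pipeline. As the abstract itself notes, a motif match alone is insufficient to predict cleavage, so this is at most a first-pass filter.

```python
import re

# Minimal furin motif from the abstract, R-X-[R/K/X]-R: Arg at P4 and P1,
# any residue at P3 and P2. The lookahead lets overlapping matches be found.
FURIN_MOTIF = re.compile(r"(?=R..R)")


def candidate_sites(sequence):
    """Return the 0-based index of the P1' residue for every motif match."""
    return [m.start() + 4 for m in FURIN_MOTIF.finditer(sequence)]


def site_window(sequence, p1_prime, n_before, n_after):
    """Return the P<n_before>..P<n_after>' peptide, or None if it overruns the sequence."""
    start, end = p1_prime - n_before, p1_prime + n_after
    if start < 0 or end > len(sequence):
        return None
    return sequence[start:end]


# Hypothetical toy sequence, not a real protein.
toy = "AAAARAKRSAAA"
for site in candidate_sites(toy):
    # P7-P1' and P4-P4' windows around the putative cleavage site.
    print(site, site_window(toy, site, 7, 1), site_window(toy, site, 4, 4))
```

The two windows mirror the P7-P1' and P4-P4' peptide designs mentioned in the abstract; a site too close to either end of the sequence yields None.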
2.
PLoS One ; 7(6): e37441, 2012.
Article in English | MEDLINE | ID: mdl-22701568

ABSTRACT

We report a scalable and cost-effective technology for generating and screening high-complexity customizable peptide sets. The peptides are made as peptide-cDNA fusions by in vitro transcription/translation from pools of DNA templates generated by microarray-based synthesis. This approach enables large custom sets of peptides to be designed in silico, manufactured cost-effectively in parallel, and assayed efficiently in a multiplexed fashion. The utility of our peptide-cDNA fusion pools was demonstrated in two activity-based assays designed to discover protease and kinase substrates. In the protease assay, cleaved peptide substrates were separated from uncleaved and identified by digital sequencing of their cognate cDNAs. We screened the 3,011 amino acid HCV proteome for susceptibility to cleavage by the HCV NS3/4A protease and identified all 3 known trans cleavage sites with high specificity. In the kinase assay, peptide substrates phosphorylated by tyrosine kinases were captured and identified by sequencing of their cDNAs. We screened a pool of 3,243 peptides against Abl kinase and showed that phosphorylation events detected were specific and consistent with the known substrate preferences of Abl kinase. Our approach is scalable and adaptable to other protein-based assays.


Subject(s)
DNA, Complementary/genetics , Hepacivirus/genetics , Peptide Hydrolases/metabolism , Peptides/genetics , Phosphotransferases/metabolism , Proteomics/methods , DNA, Complementary/metabolism , Microarray Analysis/methods , Peptides/metabolism , Phosphorylation , Substrate Specificity , Viral Nonstructural Proteins/metabolism
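Designing a peptide set in silico from a proteome sequence can be sketched as a simple tiling step. The abstract does not state the peptide length or offset used, so `length` and `step` below are hypothetical parameters, and `tile_peptides` is an illustrative name rather than part of the reported method.

```python
def tile_peptides(sequence, length, step=1):
    """Return overlapping peptides of a fixed length, offset by `step` residues."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


# Hypothetical toy sequence; the length and step are illustrative choices.
print(tile_peptides("ABCDEFGH", length=4, step=2))
```

With a step of 1, a sequence of L residues yields L - length + 1 peptides, which is why a proteome of a few thousand residues maps onto a pool of a few thousand peptides.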
3.
PLoS One ; 7(4): e35759, 2012.
Article in English | MEDLINE | ID: mdl-22558217

ABSTRACT

BACKGROUND: The hepatitis C virus (HCV) genome encodes a long polyprotein, which is processed by host cell and viral proteases to the individual structural and non-structural (NS) proteins. HCV NS3/4A serine proteinase (NS3/4A) is a non-covalent heterodimer of the N-terminal, ∼180-residue portion of the 631-residue NS3 protein with the NS4A co-factor. NS3/4A cleaves the polyprotein sequence at four specific regions. NS3/4A is essential for viral replication and has been considered an attractive drug target. METHODOLOGY/PRINCIPAL FINDINGS: Using a novel multiplex cleavage assay and over 2,660 peptide sequences derived from the polyprotein and from introducing mutations into the known NS3/4A cleavage sites, we obtained the first detailed fingerprint of NS3/4A cleavage preferences. Our data identified structural requirements illuminating the importance of both the short-range (P1-P1') and long-range (P6-P5) interactions in defining the NS3/4A substrate cleavage specificity. A newly observed feature of NS3/4A was a high frequency of either Asp or Glu at both P5 and P6 positions in a subset of the most efficient NS3/4A substrates. In turn, aberrations of this negatively charged sequence such as an insertion of a positively charged or hydrophobic residue between the negatively charged residues resulted in inefficient substrates. Because NS5B misincorporates bases at a high rate, HCV constantly mutates as it replicates. Our analysis revealed that mutations do not interfere with polyprotein processing in over 5,000 HCV isolates indicating a pivotal role of NS3/4A proteolysis in the virus life cycle. CONCLUSIONS/SIGNIFICANCE: Our multiplex assay technology in light of the growing appreciation of the role of proteolytic processes in human health and disease will likely have widespread applications in the proteolysis research field and provide new therapeutic opportunities.


Subject(s)
Serine Endopeptidases/chemistry , Viral Nonstructural Proteins/chemistry , Amino Acid Sequence , High-Throughput Screening Assays , Humans , Models, Molecular , Molecular Sequence Data , Mutation , Peptides/analysis , Peptides/chemical synthesis , Polyproteins/chemistry , Protein Processing, Post-Translational , Proteolysis , Serine Endopeptidases/genetics , Serine Endopeptidases/metabolism , Substrate Specificity , Viral Nonstructural Proteins/genetics , Viral Nonstructural Proteins/metabolism
4.
Acta Crystallogr A ; 68(Pt 3): 313-8, 2012 May.
Article in English | MEDLINE | ID: mdl-22514062

ABSTRACT

A central problem in crystallography is crystal structure determination directly from diffraction intensities. For structures of small molecules, this problem has been addressed by probabilistic direct methods that allow one to obtain the structure coordinates with a high degree of certainty given a sufficiently large set of intensities. In contrast, deterministic algebraic methods that could guarantee a solution and may be applicable to macromolecules have not yet emerged. In this study a basic algebraic question is posed: how many crystal structures can be obtained from a given set of intensities? Recently, by using a new origin definition and the method of elementary symmetrical polynomials, all small (N ≤ 4 atoms) one-dimensional crystal structures that could be obtained from the minimum set of N - 1 lowest-resolution intensities were enumerated. Here, by using methods of modern algebraic geometry the maximum number of one-dimensional crystal structures that can be determined from the minimum set of intensities for N > 4 is obtained. It is demonstrated that this ambiguity increases exponentially with the increasing number of atoms in the structure N ($\sim 4^{N}/N^{3/2}$ for $N \gg 1$) and includes non-homometric structures. Therefore, a minimum set of intensities, even in principle, is insufficient for structure determination for all but very small structures.


Subject(s)
Crystallography, X-Ray , Macromolecular Substances/chemistry , Proteins/chemistry , Models, Molecular , X-Ray Diffraction
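To give a feel for the growth rate quoted in this abstract, the short sketch below evaluates the leading-order term 4^N / N^(3/2) for a few values of N. The abstract gives only the asymptotic scaling, so the constant prefactor is omitted here and the numbers indicate scale, not exact structure counts; `ambiguity_scale` is a hypothetical helper name.

```python
def ambiguity_scale(n_atoms):
    """Leading-order growth term 4**N / N**1.5 (constant prefactor omitted)."""
    return 4 ** n_atoms / n_atoms ** 1.5


# Each added atom multiplies the term by a factor approaching 4 as N grows.
for n in (5, 10, 20, 40):
    print(n, f"{ambiguity_scale(n):.3g}")
```

The ratio between consecutive values tends to 4, which is the sense in which the ambiguity grows exponentially with the number of atoms.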
5.
J Ovarian Res ; 5(1): 3, 2012 Jan 20.
Article in English | MEDLINE | ID: mdl-22264331

ABSTRACT

BACKGROUND: We sought to identify candidate serum biomarkers for the detection and surveillance of epithelial ovarian cancer (EOC). Based on RNA-Seq transcriptome analysis of patient-derived tumors, highly expressed secreted proteins were identified using a bioinformatic approach. METHODS: RNA-Seq was used to quantify papillary serous ovarian cancer transcriptomes. Paired-end sequencing of 22 flash-frozen tumors was performed. Sequences were aligned with the program ELAND, expression levels were quantified with ERANGE, and the results were then screened bioinformatically for secreted protein signatures. Serum samples from women with benign and malignant pelvic masses and serial samples from women during chemotherapy regimens were measured for IGFBP-4 by ELISA. Student's t-test, ANOVA, and ROC curves were used for statistical analysis. RESULTS: Insulin-like growth factor binding protein 4 (IGFBP-4) was consistently present in the top 7.5% of all expressed genes in all tumor samples. We then screened serum samples to determine if increased tumor expression correlated with serum expression. In an initial discovery set of 21 samples, IGFBP-4 levels were found to be elevated in patients, including those with early stage disease and normal CA125 levels. In a larger and independent validation set (82 controls, 78 cases), IGFBP-4 levels were significantly increased ($p < 5 \times 10^{-5}$). IGFBP-4 levels were ~3× greater in women with malignant pelvic masses compared to women with benign masses. ROC sensitivity was 73% at 93% specificity (AUC 0.816). In women receiving chemotherapy, average IGFBP-4 levels were below the ROC-determined threshold and lower in patients with no evidence of disease (NED) compared to patients alive with disease (AWD). CONCLUSIONS: This study, the first to our knowledge to use RNA-Seq for biomarker discovery, identified IGFBP-4 as overexpressed in ovarian cancer patients. Beyond this, these studies identified two additional intriguing findings. First, IGFBP-4 can be elevated in early stage disease without elevated CA125. Second, IGFBP-4 levels are significantly elevated with malignant versus benign disease. These findings provide the rationale for future validation studies.
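The "sensitivity at a given specificity" and AUC figures reported above can be illustrated with a minimal sketch. The marker values below are made up for illustration and are not the study's data; the cutoff rule (a sample is positive when its value exceeds the cutoff) and both function names are assumptions, not the authors' procedure.

```python
def sensitivity_at_specificity(controls, cases, target_specificity):
    """Find the lowest cutoff whose specificity meets the target.

    A sample is called positive when its value is strictly above the cutoff.
    Returns (cutoff, specificity, sensitivity).
    """
    for cutoff in sorted(set(controls) | set(cases)):
        specificity = sum(v <= cutoff for v in controls) / len(controls)
        if specificity >= target_specificity:
            sensitivity = sum(v > cutoff for v in cases) / len(cases)
            return cutoff, specificity, sensitivity
    raise ValueError("no cutoff reaches the requested specificity")


def auc(controls, cases):
    """Area under the ROC curve as the fraction of case/control pairs ranked correctly."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical marker levels, for illustration only.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cases = [4, 8, 11, 12, 13]
print(sensitivity_at_specificity(controls, cases, 0.9))
print(auc(controls, cases))
```

Raising the target specificity pushes the cutoff upward and can only lower the sensitivity, which is the trade-off the ROC curve summarizes.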

6.
BMC Cancer ; 11: 481, 2011 Nov 09.
Article in English | MEDLINE | ID: mdl-22070665

ABSTRACT

BACKGROUND: The prognosis of hepatocellular carcinoma (HCC) varies following surgical resection and the large variation remains largely unexplained. Studies have revealed the ability of clinicopathologic parameters and gene expression to predict HCC prognosis. However, there has been little systematic effort to compare the performance of these two types of predictors or combine them in a comprehensive model. METHODS: Tumor and adjacent non-tumor liver tissues were collected from 272 ethnic Chinese HCC patients who received curative surgery. We combined clinicopathologic parameters and gene expression data (from both tissue types) in predicting HCC prognosis. Cross-validation and independent studies were employed to assess prediction. RESULTS: HCC prognosis was significantly associated with six clinicopathologic parameters, which can partition the patients into good- and poor-prognosis groups. Within each group, gene expression data further divide patients into distinct prognostic subgroups. Our predictive genes significantly overlap with previously published gene sets predictive of prognosis. Moreover, the predictive genes were enriched for genes that underwent normal-to-tumor gene network transformation. Previously documented liver eSNPs underlying the HCC predictive gene signatures were enriched for SNPs that associated with HCC prognosis, providing support that these genes are involved in key processes of tumorigenesis. CONCLUSION: When applied individually, clinicopathologic parameters and gene expression offered similar predictive power for HCC prognosis. In contrast, a combination of the two types of data dramatically improved the power to predict HCC prognosis. Our results also provided a framework for understanding the impact of gene expression on the processes of tumorigenesis and clinical outcome.


Subject(s)
Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/pathology , Liver Neoplasms/genetics , Liver Neoplasms/pathology , Carcinoma, Hepatocellular/metabolism , Carcinoma, Hepatocellular/surgery , Cell Transformation, Neoplastic/genetics , Disease-Free Survival , Female , Gene Expression , Gene Expression Profiling , Humans , Liver Neoplasms/metabolism , Liver Neoplasms/surgery , Male , Middle Aged , Predictive Value of Tests , Prognosis
7.
PLoS One ; 6(7): e20090, 2011.
Article in English | MEDLINE | ID: mdl-21750698

ABSTRACT

BACKGROUND: In hepatocellular carcinoma (HCC), genes predictive of survival have been found in both adjacent normal (AN) and tumor (TU) tissues. The relationships between these two sets of predictive genes and the general process of tumorigenesis and disease progression remain unclear. METHODOLOGY/PRINCIPAL FINDINGS: Here we have investigated HCC tumorigenesis by comparing gene expression, DNA copy number variation and survival using ∼250 AN and TU samples representing, respectively, the pre-cancer state, and the result of tumorigenesis. Genes that participate in tumorigenesis were defined using a gene-gene correlation meta-analysis procedure that compared AN versus TU tissues. Genes predictive of survival in AN (AN-survival genes) were found to be enriched in the differential gene-gene correlation gene set, indicating that they directly participate in the process of tumorigenesis. Additionally, the AN-survival genes were mostly not predictive after tumorigenesis in TU tissue, and this transition was associated with and could largely be explained by the effect of somatic DNA copy number variation (sCNV) in cis and in trans. The data were consistent with the variance of AN-survival genes being rate-limiting steps in tumorigenesis, and this was confirmed using a treatment that promotes HCC tumorigenesis that selectively altered AN-survival genes and genes differentially correlated between AN and TU. CONCLUSIONS/SIGNIFICANCE: This suggests that the process of tumor evolution involves rate-limiting steps related to the background from which the tumor evolved, where these were frequently predictive of clinical outcome. Additionally, treatments that alter the likelihood of tumorigenesis occurring may act by altering AN-survival genes, suggesting that the process can be manipulated. Further, sCNV explains a substantial fraction of tumor-specific expression and may therefore be a causal driver of tumor evolution in HCC and perhaps many solid tumor types.


Subject(s)
Carcinoma, Hepatocellular/genetics , DNA Copy Number Variations , Gene Expression Profiling , Liver Neoplasms/genetics , Liver/metabolism , Adult , Aged , Animals , Cell Line, Tumor , Chromosomes, Human, Pair 1/genetics , Female , Gene Regulatory Networks , Humans , Liver/pathology , Male , Mice , Mice, Transgenic , Middle Aged , Models, Genetic , Oligonucleotide Array Sequence Analysis , Proto-Oncogene Proteins c-met/genetics , Regression Analysis
8.
Genome Res ; 21(7): 1008-16, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21602305

ABSTRACT

To map the genetics of gene expression in metabolically relevant tissues and investigate the diversity of expression SNPs (eSNPs) in multiple tissues from the same individual, we collected four tissues from approximately 1000 patients undergoing Roux-en-Y gastric bypass (RYGB) and clinical traits associated with their weight loss and co-morbidities. We then performed high-throughput genotyping and gene expression profiling and carried out genome-wide association analyses for more than 100,000 gene expression traits representing four metabolically relevant tissues: liver, omental adipose, subcutaneous adipose, and stomach. We successfully identified 24,531 eSNPs corresponding to about 10,000 distinct genes. This represents the greatest number of eSNPs identified to our knowledge by any study to date and the first study to identify eSNPs from stomach tissue. We then demonstrate how these eSNPs provide a high-quality disease map for each tissue in morbidly obese patients to inform genetic associations identified not only in this cohort but also in previously published genome-wide association studies. These data can aid in elucidating the key networks associated with morbid obesity, response to RYGB, and disease as a whole.


Subject(s)
Gastric Mucosa/metabolism , Liver/metabolism , Obesity, Morbid/epidemiology , Obesity, Morbid/genetics , Adiposity/genetics , Adult , Cohort Studies , Comorbidity , Databases, Genetic , Female , Gastric Bypass , Gene Expression Profiling , Genome-Wide Association Study/methods , Genotype , Humans , Male , Middle Aged , Obesity, Morbid/surgery , Polymorphism, Single Nucleotide , Weight Loss
9.
Genome Res ; 20(8): 1020-36, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20538623

ABSTRACT

Liver cytochrome P450s (P450s) play critical roles in drug metabolism, toxicology, and metabolic processes. Despite rapid progress in the understanding of these enzymes, a systematic investigation of the full spectrum of functionality of individual P450s, the interrelationship or networks connecting them, and the genetic control of each gene/enzyme is lacking. To this end, we genotyped, expression-profiled, and measured P450 activities of 466 human liver samples and applied a systems biology approach via the integration of genetics, gene expression, and enzyme activity measurements. We found that most P450s were positively correlated among themselves and were highly correlated with known regulators as well as thousands of other genes enriched for pathways relevant to the metabolism of drugs, fatty acids, amino acids, and steroids. Genome-wide association analyses between genetic polymorphisms and P450 expression or enzyme activities revealed sets of SNPs associated with P450 traits, and suggested the existence of both cis-regulation of P450 expression (especially for CYP2D6) and more complex trans-regulation of P450 activity. Several novel SNPs associated with CYP2D6 expression and enzyme activity were validated in an independent human cohort. By constructing a weighted coexpression network and a Bayesian regulatory network, we defined the human liver transcriptional network structure, uncovered subnetworks representative of the P450 regulatory system, and identified novel candidate regulatory genes, namely, EHHADH, SLC10A1, and AKR1D1. The P450 subnetworks were then validated using gene signatures responsive to ligands of known P450 regulators in mouse and rat. This systematic survey provides a comprehensive view of the functionality, genetic control, and interactions of P450s.


Subject(s)
Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Gene Expression Regulation, Enzymologic , Genomics , Liver/enzymology , Adolescent , Adult , Aged , Aged, 80 and over , Animals , Child , Child, Preschool , Female , Gene Expression , Genome-Wide Association Study , Humans , Infant , Infant, Newborn , Male , Mice , Middle Aged , Pharmaceutical Preparations/metabolism , Polymorphism, Single Nucleotide , Rats , Systems Biology , Transcription, Genetic , Young Adult
10.
PLoS One ; 5(1): e8695, 2010 Jan 13.
Article in English | MEDLINE | ID: mdl-20084173

ABSTRACT

Genome-wide association studies (GWAS) may be biased by population stratification (PS). We conducted empirical quantification of the magnitude of PS among human populations and its impact on GWAS. Liver tissues were collected from 979, 59 and 49 Caucasian Americans (CA), African Americans (AA) and Hispanic Americans (HA), respectively, and genotyped using Illumina 650Y (Ilmn650Y) arrays. RNA was also isolated and hybridized to Agilent whole-genome gene expression arrays. We propose a new method (i.e., HGDP-eigen) for detecting PS by projecting genotype vectors for each sample to the eigenvector space defined by the Human Genetic Diversity Panel (HGDP). Further, we conducted GWAS to map expression quantitative trait loci (eQTL) for the approximately 40,000 liver gene expression traits monitored by the Agilent arrays. HGDP-eigen performed similarly to the conventional self-eigen methods in capturing PS. However, leveraging the HGDP offered a significant advantage in revealing the origins, directions and magnitude of PS. Adjusting for eigenvectors had minor impacts on eQTL detection rates in CA. In contrast, for AA and HA, adjustment dramatically reduced association findings. At an FDR = 10%, we identified 65 eQTLs in AA with the unadjusted analysis, but only 18 eQTLs after the eigenvector adjustment. Strikingly, 55 out of the 65 unadjusted AA eQTLs were validated in CA, indicating that the adjustment procedure significantly reduced GWAS power. A number of the 55 AA eQTLs validated in CA overlapped with published disease-associated SNPs. For example, rs646776 and rs10903129 have previously been associated with lipid levels and coronary heart disease risk; however, the rs10903129 eQTL was missed in the eigenvector-adjusted analysis.


Subject(s)
Genetics, Population , Genome-Wide Association Study , Genome, Human , Humans , Liver/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci
11.
BMC Genet ; 10: 27, 2009 Jun 16.
Article in English | MEDLINE | ID: mdl-19531258

ABSTRACT

BACKGROUND: Although high-throughput genotyping arrays have made whole-genome association studies (WGAS) feasible, only a small proportion of SNPs in the human genome are actually surveyed in such studies. In addition, different SNP arrays assay different sets of SNPs, which leads to challenges in comparing results and merging data for meta-analyses. Genome-wide imputation of untyped markers allows us to address these issues in a direct fashion. METHODS: 384 Caucasian American liver donors were genotyped using Illumina 650Y (Ilmn650Y) arrays, from which we also derived genotypes from the Ilmn317K array. On these data, we compared two imputation methods: MACH and BEAGLE. We imputed 2.5 million HapMap Release 22 SNPs, and conducted GWAS on approximately 40,000 liver mRNA expression traits (eQTL analysis). In addition, 200 Caucasian American and 200 African American subjects were genotyped using the Affymetrix 500 K array plus a custom 164 K fill-in chip. We then imputed the HapMap SNPs and quantified the accuracy by randomly masking observed SNPs. RESULTS: MACH and BEAGLE perform similarly with respect to imputation accuracy. The Ilmn650Y results in excellent imputation performance, and it outperforms Affx500K or Ilmn317K sets. For Caucasian Americans, 90% of the HapMap SNPs were imputed at 98% accuracy. As expected, imputation of poorly tagged SNPs (untyped SNPs in weak LD with typed markers) was not as successful. It was more challenging to impute genotypes in the African American population, given (1) shorter LD blocks and (2) admixture with Caucasian populations. To address issue (2), we pooled HapMap CEU and YRI data as an imputation reference set, which greatly improved overall performance. The approximately 40,000 phenotypes scored in these populations provide a path to determine empirically how the power to detect associations is affected by the imputation procedures. That is, at a fixed false discovery rate, the number of cis-eQTL discoveries detected by various methods can be interpreted as their relative statistical power in the GWAS. In this study, we find that imputation offers modest additional power (4%) on top of either Ilmn317K or Ilmn650Y, much less than the power gain from Ilmn317K to Ilmn650Y (13%). CONCLUSION: Current algorithms can accurately impute genotypes for untyped markers, which enables researchers to pool data between studies conducted using different SNP sets. While genotyping itself results in a small error rate (e.g. 0.5%), imputing genotypes is surprisingly accurate. We found that dense marker sets (e.g. Ilmn650Y) outperform sparser ones (e.g. Ilmn317K) in terms of imputation yield and accuracy. We also noticed it was harder to impute genotypes for African American samples, partially due to population admixture, although using a pooled reference boosts performance. Interestingly, GWAS carried out using imputed genotypes only slightly increased power on top of assayed SNPs. The likely reason is that adding more markers via imputation yields only a modest gain in genetic coverage while worsening the multiple-testing penalty. Furthermore, cis-eQTL mapping using a dense SNP set derived from imputation achieves high resolution and locates association peaks closer to causal variants than the conventional approach.


Subject(s)
Genome, Human , Genome-Wide Association Study/methods , Genotype , Polymorphism, Single Nucleotide , Black or African American/genetics , Algorithms , Chromosome Mapping/methods , Genetic Markers , Humans , Liver/metabolism , Models, Statistical , RNA, Messenger/metabolism , Sensitivity and Specificity , White People/genetics
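The accuracy check described in this abstract, hiding observed genotypes at random and comparing the imputed calls with the hidden truth, can be sketched as follows. The imputer here is a deliberately crude placeholder (most common call per SNP) standing in for MACH or BEAGLE, whose interfaces are not reproduced; all function names and the toy genotype matrix are hypothetical.

```python
import random
from collections import Counter


def masked_concordance(genotypes, mask_fraction, impute, seed=0):
    """Hide a random fraction of calls, impute them, and return the fraction recovered.

    genotypes: list of per-sample lists of 0/1/2 calls (rows = samples, columns = SNPs).
    impute: function taking the masked matrix (None marks a hidden call) and
            returning a fully filled matrix of the same shape.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    hidden = rng.sample(cells, max(1, int(mask_fraction * len(cells))))
    hidden_set = set(hidden)
    masked = [[None if (i, j) in hidden_set else g for j, g in enumerate(row)]
              for i, row in enumerate(genotypes)]
    filled = impute(masked)
    correct = sum(filled[i][j] == genotypes[i][j] for i, j in hidden)
    return correct / len(hidden)


def impute_most_common(masked):
    """Placeholder imputer: fill each gap with that SNP's most common observed call."""
    modes = []
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        modes.append(Counter(observed).most_common(1)[0][0] if observed else 0)
    return [[modes[j] if g is None else g for j, g in enumerate(row)]
            for row in masked]


# Hypothetical toy genotype matrix: 4 samples by 3 SNPs.
toy = [[0, 1, 2], [0, 1, 1], [0, 2, 2], [0, 1, 2]]
print(masked_concordance(toy, 0.25, impute_most_common, seed=3))
```

Because only originally observed calls are hidden, the concordance is measured against known truth; swapping in a different `impute` function allows methods to be compared on the same masked cells by reusing the seed.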
12.
Hum Mol Genet ; 18(18): 3502-7, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19553259

ABSTRACT

To investigate the genetic architecture of severe obesity, we performed a genome-wide association study of 775 cases and 3197 unascertained controls at approximately 550,000 markers across the autosomal genome. We found convincing association to the previously described locus including the FTO gene. We also found evidence of association at a further six of 12 other loci previously reported to influence body mass index (BMI) in the general population and one of three associations to severe childhood and adult obesity and that cases have a higher proportion of risk-conferring alleles than controls. We found no evidence of homozygosity at any locus due to identity-by-descent associating with phenotype which would be indicative of rare, penetrant alleles, nor was there excess genome-wide homozygosity in cases relative to controls. Our results suggest that variants influencing BMI also contribute to severe obesity, a condition at the extreme of the phenotypic spectrum rather than a distinct condition.


Subject(s)
Body Mass Index , Obesity/genetics , Polymorphism, Single Nucleotide , Adolescent , Adult , Aged , Cohort Studies , Female , Genetic Markers , Humans , Male , Middle Aged , Obesity/physiopathology , Phenotype , Risk Factors
13.
PLoS Biol ; 6(5): e107, 2008 May 06.
Article in English | MEDLINE | ID: mdl-18462017

ABSTRACT

Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.


Subject(s)
Gene Expression Profiling , Genetic Predisposition to Disease/genetics , Liver/metabolism , Polymorphism, Single Nucleotide/genetics , Transcription, Genetic/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Animals , Child , Child, Preschool , Cholesterol, LDL/blood , Cholesterol, LDL/genetics , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 1/genetics , Female , Genes, MHC Class II/genetics , Genome, Human , Genotype , Humans , Infant , Male , Mice , Middle Aged , Oligonucleotide Array Sequence Analysis , Quantitative Trait Loci/genetics , RNA, Messenger/analysis , RNA, Messenger/genetics
14.
PLoS Genet ; 4(2): e1000006, 2008 Feb 29.
Article in English | MEDLINE | ID: mdl-18454203

ABSTRACT

The recent development of whole genome association studies has led to the robust identification of several loci involved in different common human diseases. Interestingly, some of the strongest signals of association observed in these studies arise from non-coding regions located in very large introns or far away from any annotated genes, raising the possibility that these regions are involved in the etiology of the disease through some unidentified regulatory mechanisms. These findings highlight the importance of better understanding the mechanisms leading to inter-individual differences in gene expression in humans. Most of the existing approaches developed to identify common regulatory polymorphisms are based on linkage/association mapping of gene expression to genotypes. However, these methods have some limitations, notably their cost and the requirement of extensive genotyping information from all the individuals studied, which limits their applications to a specific cohort or tissue. Here we describe a robust and high-throughput method to directly measure differences in allelic expression for a large number of genes using the Illumina Allele-Specific Expression BeadArray platform and quantitative sequencing of RT-PCR products. We show that this approach allows reliable identification of differences in the relative expression of the two alleles larger than 1.5-fold (i.e., deviations of the allelic ratio larger than 60:40) and offers several advantages over the mapping of total gene expression, particularly for studying humans or outbred populations. Our analysis of more than 80 individuals for 2,968 SNPs located in 1,380 genes confirms that differential allelic expression is a widespread phenomenon affecting the expression of 20% of human genes and shows that our method successfully captures expression differences resulting from both genetic and epigenetic cis-acting mechanisms.


Subject(s)
Epigenesis, Genetic , Gene Expression Regulation , Genome, Human , Alleles , Allelic Imbalance , Genetic Complementation Test , Humans , Introns , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reverse Transcriptase Polymerase Chain Reaction
15.
Genomics ; 89(6): 666-72, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17459658

ABSTRACT

Predicting prognosis in prostate carcinoma remains a challenge when using clinical and pathologic criteria only. We used an array-based DASL assay to identify molecular signatures for predicting prostate cancer relapse in formalin-fixed, paraffin-embedded (FFPE) prostate cancers, through gene expression profiling of 512 prioritized genes. Of the 71 patients that we analyzed, all but 3 had no evidence of residual tumor (defined as negative surgical margins) following radical prostatectomy, and no patient received adjuvant therapy following surgery. All of the 71 patients had an undetectable serum PSA following radical prostatectomy. The follow-up period was 44 ± 15 months. Highly reproducible gene expression patterns were obtained with these samples (average R² = 0.99). We identified a panel of 11 genes that correlated positively and 5 genes that correlated negatively with Gleason grade. A gene expression score (GEX) was derived from the expression levels of the 16 genes. We assessed the prognostic value of these genes and found that the GEX significantly correlated with disease relapse (p = 0.007). These results suggest that the approach we used is effective for expression profiling in heterogeneous FFPE tissues for cancer diagnosis/prognosis biomarker discovery and validation.


Subject(s)
Prostatic Neoplasms/genetics , Aged , Aged, 80 and over , Formaldehyde , Gene Expression Profiling , Humans , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Paraffin Embedding , Prognosis , Prostatectomy , Prostatic Neoplasms/pathology , Prostatic Neoplasms/surgery , Recurrence , Risk Factors , Tissue Fixation
16.
Nat Biotechnol ; 24(9): 1123-31, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16964226

ABSTRACT

We have assessed the utility of RNA titration samples for evaluating microarray platform performance and the impact of different normalization methods on the results obtained. As part of the MicroArray Quality Control project, we investigated the performance of five commercial microarray platforms using two independent RNA samples and two titration mixtures of these samples. Focusing on 12,091 genes common across all platforms, we determined the ability of each platform to detect the correct titration response across the samples. Global deviations from the response predicted by the titration ratios were observed. These differences could be explained by variations in relative amounts of messenger RNA as a fraction of total RNA between the two independent samples. Overall, both the qualitative and quantitative correspondence across platforms was high. In summary, titration samples may be regarded as a valuable tool, not only for assessing microarray platform performance and different analysis methods, but also for determining some underlying biological features of the samples.


Subject(s)
Equipment Failure Analysis/methods , Gene Expression Profiling/instrumentation , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/instrumentation , Oligonucleotide Array Sequence Analysis/standards , RNA/analysis , RNA/genetics , Algorithms , Reference Values , Reproducibility of Results , Sensitivity and Specificity , United States
17.
Nat Biotechnol ; 24(9): 1151-61, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16964229

ABSTRACT

Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.


Subject(s)
Gene Expression Profiling/instrumentation , Oligonucleotide Array Sequence Analysis/instrumentation , Quality Assurance, Health Care/methods , Equipment Design , Equipment Failure Analysis , Gene Expression Profiling/methods , Quality Control , Reproducibility of Results , Sensitivity and Specificity , United States
18.
Genome Res ; 16(9): 1075-83, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16899657

ABSTRACT

Human embryonic stem (hES) cells originate during an embryonic period of active epigenetic remodeling. DNA methylation patterns are likely to be critical for their self-renewal and pluripotence. We compared the DNA methylation status of 1536 CpG sites (from 371 genes) in 14 independently isolated hES cell lines with five other cell types: 24 cancer cell lines, four adult stem cell populations, four lymphoblastoid cell lines, five normal human tissues, and an embryonal carcinoma cell line. We found that the DNA methylation profile clearly distinguished the hES cells from all of the other cell types. A subset of 49 CpG sites from 40 genes contributed most to the differences among cell types. Another set of 25 sites from 23 genes distinguished hES cells from normal differentiated cells and can be used as biomarkers to monitor differentiation. Our results indicate that hES cells have a unique epigenetic signature that may contribute to their developmental potential.


Subject(s)
DNA Methylation , Embryo, Mammalian/cytology , Epigenesis, Genetic , Stem Cells/metabolism , Cell Differentiation , Cell Line , Cell Line, Tumor , Cell Lineage , Cluster Analysis , Female , Humans , Male , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , Stem Cells/cytology
19.
BMC Dev Biol ; 6: 20, 2006 May 03.
Article in English | MEDLINE | ID: mdl-16672070

ABSTRACT

BACKGROUND: In order to compare the gene expression profiles of human embryonic stem cell (hESC) lines and their differentiated progeny and to monitor feeder contaminations, we have examined gene expression in seven hESC lines and human fibroblast feeder cells using Illumina bead arrays that contain probes for 24,131 transcripts. RESULTS: A total of 48 different samples (including duplicates) grown in multiple laboratories under different conditions were analyzed, and pairwise comparisons were performed in all groups. Hierarchical clustering showed that blinded duplicates were correctly identified as the closest related samples. hESC lines clustered together irrespective of the laboratory in which they were maintained. hESCs could be readily distinguished from embryoid bodies (EB) differentiated from them and the karyotypically abnormal hESC line BG01V. The embryonal carcinoma (EC) line NTera2 is a useful model for evaluating characteristics of hESCs. Expression of subsets of individual genes was validated by comparing with published databases, MPSS (Massively Parallel Signature Sequencing) libraries, and parallel analysis by microarray and RT-PCR. CONCLUSION: We show that Illumina's bead array platform is a reliable, reproducible and robust method for developing base global profiles of cells and identifying similarities and differences in a large number of samples.


Subject(s)
Carcinoma, Embryonal/pathology , Cell Line , Genome, Human , Stem Cells , Embryo Research/legislation & jurisprudence , Embryo, Mammalian/cytology , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Government Regulation , Humans , Oligonucleotide Array Sequence Analysis , United States
20.
Genome Res ; 16(3): 383-93, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16449502

ABSTRACT

We have developed a high-throughput method for analyzing the methylation status of hundreds of preselected genes simultaneously and have applied it to the discovery of methylation signatures that distinguish normal from cancer tissue samples. Through an adaptation of the GoldenGate genotyping assay implemented on a BeadArray platform, the methylation state of 1536 specific CpG sites in 371 genes (one to nine CpG sites per gene) was measured in a single reaction by multiplexed genotyping of 200 ng of bisulfite-treated genomic DNA. The assay was used to obtain a quantitative measure of the methylation level at each CpG site. After validating the assay in cell lines and normal tissues, we analyzed a panel of lung cancer biopsy samples (N = 22) and identified a panel of methylation markers that distinguished lung adenocarcinomas from normal lung tissues with high specificity. These markers were validated in a second sample set (N = 24). These results demonstrate the effectiveness of the method for reliably profiling many CpG sites in parallel for the discovery of informative methylation markers. The technology should prove useful for DNA methylation analyses in large populations, with potential application to the classification and diagnosis of a broad range of cancers and other diseases.


Subject(s)
DNA Fingerprinting , DNA Methylation , Oligonucleotide Array Sequence Analysis/methods , Base Sequence , Chromosomes, Human, X/metabolism , CpG Islands/genetics , Gene Expression Regulation, Neoplastic , Humans , Lung Neoplasms/genetics , Molecular Sequence Data , Polymorphism, Single Nucleotide , Reproducibility of Results , Sulfites/metabolism
...