ABSTRACT
BACKGROUND: Pre-symptomatic prediction of disease and drug response based on genetic testing is a critical component of personalized medicine. Previous work has demonstrated that the predictive capacity of genetic testing is constrained by the heritability and prevalence of the tested trait, although these constraints have only been approximated under the assumption of a normally distributed genetic risk distribution. RESULTS: Here, we mathematically derive the absolute limits that these factors impose on test accuracy in the absence of any distributional assumptions on risk. We present these limits in terms of the best-case receiver-operating characteristic (ROC) curve, consisting of the best-case test sensitivities and specificities, and the AUC (area under the curve) measure of accuracy. We apply our method to genetic prediction of type 2 diabetes and breast cancer, and we additionally show the best possible accuracy that can be obtained from integrated predictors, which can incorporate non-genetic features. CONCLUSION: Knowledge of such limits is valuable in understanding the implications of genetic testing even before additional associations are identified.
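The ROC curve and AUC referred to in this abstract can be illustrated with a small sketch. It does not reproduce the authors' derivation of best-case limits; it only shows how sensitivity, specificity, and AUC are obtained from a set of risk scores. The function names and the example scores are hypothetical.

```python
def roc_points(case_scores, control_scores):
    """Return (1 - specificity, sensitivity) pairs, one per score threshold."""
    thresholds = sorted(set(case_scores) | set(control_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # A subject tests positive when the score is at or above the threshold.
        sensitivity = sum(s >= t for s in case_scores) / len(case_scores)
        false_pos_rate = sum(s >= t for s in control_scores) / len(control_scores)
        points.append((false_pos_rate, sensitivity))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_rank(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney form of the AUC).
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, for illustration only.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(auc_rank(cases, controls))                    # 8/9, about 0.889
print(auc_trapezoid(roc_points(cases, controls)))   # same area, up to rounding
```

Both functions give the same area, which is why the AUC is often read as the chance that a randomly chosen case receives a higher risk score than a randomly chosen control.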
Subjects
Breast Neoplasms/genetics , Type 2 Diabetes Mellitus/genetics , Human Genome , Genetic Models , Area Under Curve , Breast Neoplasms/diagnosis , Computer Simulation , Type 2 Diabetes Mellitus/diagnosis , Female , Genome-Wide Association Study , Humans , Precision Medicine , Predictive Value of Tests , Prognosis , ROC Curve
ABSTRACT
Feedback control is an important regulatory process in biological systems, which confers robustness against external and internal disturbances. Genes involved in feedback structures are therefore likely to have a major role in regulating cellular processes. Here we rely on a dynamic Bayesian network approach to identify feedback loops in cell cycle regulation. We analyzed the transcriptional profile of the cell cycle in HeLa cancer cells and identified a feedback loop structure composed of 10 genes. In silico analyses showed that these genes play important roles in the system's dynamics. The results of published experimental assays confirmed the central role of 8 of the identified feedback loop genes in cell cycle regulation. In conclusion, we provide a novel approach to identify critical genes for the dynamics of biological processes. This may lead to the identification of therapeutic targets in diseases that involve perturbations of these dynamics.
Subjects
Cell Cycle/genetics , Computational Biology/methods , Physiological Feedback/physiology , Gene Expression , Gene Regulatory Networks , Bayes Theorem , Computer Simulation , Genetic Databases , HeLa Cells , Humans , Biological Models
ABSTRACT
BACKGROUND: Transcriptional networks play a central role in cancer development. The authors described a systems biology approach to cancer classification based on the reverse engineering of the transcriptional network surrounding the 2 most common types of lung cancer: adenocarcinoma (AC) and squamous cell carcinoma (SCC). METHODS: A transcriptional network classifier was inferred from the molecular profiles of 111 human lung carcinomas. The authors tested its classification accuracy in 7 independent cohorts, for a total of 422 subjects of Caucasian, African, and Asian descent. RESULTS: The model for distinguishing AC from SCC was a 25-gene network signature. Its performance on the 7 independent cohorts achieved 95.2% classification accuracy. Even more surprisingly, 95% of this accuracy was explained by the interplay of 3 genes (KRT6A, KRT6B, KRT6C) on a narrow cytoband of chromosome 12. The role of this chromosomal region in distinguishing AC and SCC was further confirmed by the analysis of another group of 28 independent subjects assayed by DNA copy number changes. The copy number variations of bands 12q12, 12q13, and 12q12-13 discriminated these samples with 84% accuracy. CONCLUSIONS: These results suggest the existence of a robust signature localized in a relatively small area of the genome, and show the clinical potential of reverse engineering transcriptional networks from molecular profiles.
Subjects
Adenocarcinoma/genetics , Squamous Cell Carcinoma/genetics , Gene Regulatory Networks , Lung Neoplasms/classification , Lung Neoplasms/genetics , Bayes Theorem , Human Chromosome Pair 12 , Humans , Systems Biology/methods
ABSTRACT
BACKGROUND: Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of data. RESULTS: We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans- regulating eQTLs. The application on a dataset of leukemia patients identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate. CONCLUSIONS: The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia.
Subjects
Computational Biology/methods , Genome , Genetic Transcription/genetics , Gene Expression , Gene Expression Profiling , Genetic Variation , Humans , Leukemia/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci
ABSTRACT
The etiology of growth impairment in Crohn's disease (CD) has been inadequately explained by nutritional, hormonal, and/or disease-related factors, suggesting that genetics may be an additional contributor. The aim of this cross-sectional study was to investigate genetic variants associated with linear growth in pediatric-onset CD. We genotyped 951 subjects (317 CD patient-parent trios) for 64 polymorphisms within 14 CD-susceptibility and 23 stature-associated loci. Patient height-for-age Z-score < -1.64 was used to dichotomize probands into growth-impaired and nongrowth-impaired groups. The transmission disequilibrium test (TDT) was used to study association to growth impairment. There was a significant association between growth impairment in CD (height-for-age Z-score < -1.64) and a stature-related polymorphism in the dymeclin gene DYM (rs8099594) (OR = 3.2, CI [1.57-6.51], p = 0.0007). In addition, there was nominal over-transmission of two CD-susceptibility alleles, 10q21.1 intergenic region (rs10761659) and ATG16L1 (rs10210302), in growth-impaired CD children (OR = 2.36, CI [1.26-4.41] p = 0.0056 and OR = 2.45, CI [1.22-4.95] p = 0.0094, respectively). Our data indicate that genetic influences due to stature-associated and possibly CD risk alleles may predispose CD patients to alterations in linear growth. This is the first report of a link between a stature-associated locus and growth impairment in CD.
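The transmission disequilibrium test named in this abstract can be sketched in its standard McNemar form. Among heterozygous parents, b counts transmissions of the allele of interest to the affected child and c counts non-transmissions; under no association the two are expected to be equal. The counts below are hypothetical and are not the study's data, and the function name is an assumption.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test from heterozygous-parent counts.

    Returns the McNemar chi-square statistic (1 degree of freedom),
    its p-value, and the transmission odds ratio b / c.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = b / c if c > 0 else float("inf")
    return chi2, p_value, odds_ratio


# Hypothetical counts: 30 transmissions versus 10 non-transmissions.
chi2, p_value, odds_ratio = tdt(30, 10)
print(chi2, p_value, odds_ratio)
```

Because only heterozygous parents contribute, the test is robust to population stratification, which is the usual reason for choosing a trio design.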
Subjects
Body Height/genetics , Growth Disorders/etiology , Adolescent , Child , Preschool Child , Crohn Disease/genetics , Cross-Sectional Studies , Female , Genetic Predisposition to Disease , Genotype , Growth Disorders/genetics , Humans , Infant , Intracellular Signaling Peptides and Proteins , Male , Pilot Projects , Proteins/metabolism , White People
ABSTRACT
Abilities to successfully quit smoking display substantial evidence for heritability in classic and molecular genetic studies. Genome-wide association (GWA) studies have demonstrated single-nucleotide polymorphisms (SNPs) and haplotypes that distinguish successful quitters from individuals who were unable to quit smoking in clinical trial participants and in community samples. Many of the subjects in these clinical trial samples were aided by nicotine replacement therapy (NRT). We now report novel GWA results from participants in a clinical trial that sought dose/response relationships for "precessation" NRT. In this trial, 369 European-American smokers were randomized to 21 or 42 mg NRT, initiated 2 wks before target quit dates. Ten-week continuous smoking abstinence was assessed on the basis of self-reports and carbon monoxide levels. SNP genotyping used Affymetrix 6.0 arrays. GWA results for smoking cessation success provided no P value that reached "genome-wide" significance. Compared with chance, these results do identify (a) more clustering of nominally positive results within small genomic regions, (b) more overlap between these genomic regions and those identified in six prior successful smoking cessation GWA studies and (c) sets of genes that fall into gene ontology categories that appear to be biologically relevant. The 1,000 SNPs with the strongest associations form a plausible Bayesian network; no such network is formed by randomly selected sets of SNPs. The data provide independent support, based on individual genotyping, for many loci previously nominated on the basis of data from genotyping in pooled DNA samples. These results provide further support for the idea that aid for smoking cessation may be personalized on the basis of genetic predictors of outcome.
Subjects
Genome-Wide Association Study , Nicotine/therapeutic use , Smoking Cessation/methods , Smoking/genetics , Smoking/therapy , Adult , Bayes Theorem , Carbon Monoxide/analysis , Genetic Testing , Genotype , Humans , Single Nucleotide Polymorphism , Tobacco Use Disorder/genetics , Treatment Outcome
ABSTRACT
The Wilms' tumor suppressor 1 (WT1) gene encodes a DNA- and RNA-binding protein that plays an essential role in nephron progenitor differentiation during renal development. To identify WT1 target genes that might regulate nephron progenitor differentiation in vivo, we performed chromatin immunoprecipitation (ChIP) coupled to mouse promoter microarray (ChIP-chip) using chromatin prepared from embryonic mouse kidney tissue. We identified 1663 genes bound by WT1, 86% of which contain a previously identified, conserved, high-affinity WT1 binding site. To investigate functional interactions between WT1 and candidate target genes in nephron progenitors, we used a novel, modified WT1 morpholino loss-of-function model in embryonic mouse kidney explants to knock down WT1 expression in nephron progenitors ex vivo. Low doses of WT1 morpholino resulted in reduced WT1 target gene expression specifically in nephron progenitors, whereas high doses of WT1 morpholino arrested kidney explant development and were associated with increased nephron progenitor cell apoptosis, reminiscent of the phenotype observed in Wt1(-/-) embryos. Collectively, our results provide a comprehensive description of endogenous WT1 target genes in nephron progenitor cells in vivo, as well as insights into the transcriptional signaling networks controlled by WT1 that might direct nephron progenitor fate during renal development.
Subjects
Developmental Gene Expression Regulation , Kidney/cytology , Kidney/embryology , Nephrons/cytology , Stem Cells/physiology , WT1 Proteins/metabolism , Animals , Apoptosis/physiology , Base Sequence , Chromatin Immunoprecipitation , Factual Databases , Mammalian Embryo/anatomy & histology , Mammalian Embryo/physiology , Female , In Situ Hybridization , Kidney/metabolism , Mice , Microarray Analysis , Nephrons/embryology , Nephrons/metabolism , Antisense Oligonucleotides/genetics , Antisense Oligonucleotides/metabolism , Pregnancy , Stem Cells/cytology , Tissue Culture Techniques , WT1 Proteins/genetics
ABSTRACT
BACKGROUND: Many different genetic and clinical factors have been identified as causes or contributors to atherosclerosis. We present a model of preclinical atherosclerosis based on genetic and clinical data that predicts the presence of coronary artery calcification in healthy Americans of European descent 45 to 84 years of age in the Multi-Ethnic Study of Atherosclerosis (MESA). METHODS AND RESULTS: We assessed 712 individuals for the presence or absence of coronary artery calcification and assessed their genotypes for 2882 single-nucleotide polymorphisms. With the use of these single-nucleotide polymorphisms and relevant clinical data, a Bayesian network that predicts the presence of coronary calcification was constructed. The model contained 13 single-nucleotide polymorphisms (from genes AGTR1, ALOX15, INSR, PRKAB1, IL1R2, ESR2, KCNK1, FBLN5, PPARA, VEGFA, PON1, TDRD6, PLA2G7, and 1 ancestry informative marker) and 5 clinical variables (sex, age, weight, smoking, and diabetes mellitus) and achieved 85% predictive accuracy, as measured by area under the receiver operating characteristic curve. This is a significant (P<0.001) improvement on models that use just the single-nucleotide polymorphism data or just the clinical variables. CONCLUSIONS: We present an investigation of joint genetic and clinical factors associated with atherosclerosis that shows predictive results for both cases, as well as enhanced performance for their combination.
Subjects
Atherosclerosis/genetics , Calcinosis/genetics , Coronary Artery Disease/genetics , Cardiovascular Models , Aged , Aged 80 and Over , Atherosclerosis/ethnology , Bayes Theorem , Calcinosis/ethnology , Cohort Studies , Coronary Artery Disease/ethnology , Female , Genotype , Humans , Male , Middle Aged , Predictive Value of Tests , Risk Factors , White People/ethnology , White People/genetics
ABSTRACT
Like all primary cells in vitro, normal human melanocytes exhibit a physiologic decay in proliferative potential as they transition to a growth-arrested state. The underlying transcriptional programs that regulate this phenotypic change are largely unknown. To identify molecular determinants of this process, we performed a Bayesian-based dynamic gene expression analysis on primary melanocytes undergoing proliferative arrest. This analysis revealed several related clusters whose expression behavior correlated with the melanocyte growth kinetics; we designated these clusters the melanocyte growth arrest program (MGAP). These MGAP genes were preferentially represented in benign melanocytic nevi over melanomas and selectively mapped to the hepatocyte fibrosis pathway. This transcriptional relationship between melanocyte growth stasis, nevus biology, and fibrogenic signaling was further validated in vivo by the demonstration of strong pericellular collagen deposition within benign nevi but not melanomas. Taken together, our study provides a novel view of fibroplasia in both melanocyte biology and nevogenesis.
Subjects
Neoplastic Cell Transformation/genetics , Melanocytes/physiology , Melanoma/genetics , Nevus/genetics , Bayes Theorem , Cell Growth Processes/genetics , Neoplastic Cell Transformation/pathology , Cluster Analysis , Gene Expression Profiling/methods , Humans , Melanocytes/cytology , Melanoma/pathology , Nevus/pathology
ABSTRACT
BACKGROUND: Gene interactions play a central role in transcriptional networks. Many studies have performed genome-wide expression analysis to reconstruct regulatory networks to investigate disease processes. Since biological processes are outcomes of regulatory gene interactions, this paper develops a systems biology approach to infer function-dependent transcriptional networks modulating phenotypic traits, which serve as a classifier to identify tissue states. Because gene interactions are taken into account in the analysis, we can achieve higher classification accuracy than existing methods. RESULTS: Our systems biology approach is carried out within the Bayesian network framework. The algorithm consists of two steps: gene filtering by Bayes factor, followed by collinearity elimination via network learning. We validate our approach on two clinical data sets. In the study of lung cancer subtype discrimination, we obtain a 25-gene classifier from 111 training samples, and the test on 422 independent samples achieves 95% classification accuracy. In the study of thoracic aortic aneurysm (TAA) diagnosis, 61 samples determine a 34-gene classifier, whose diagnostic accuracy on 33 independent samples reaches 82%. Performance comparisons with three other popular methods, PCA/LDA, PAM, and Weighted Voting, confirm that our approach yields superior classification accuracy and a more compact signature. CONCLUSIONS: The systems biology approach presented in this paper is able to infer function-dependent transcriptional networks, which in turn can classify biological samples with high accuracy. The validation of our classifier using clinical data demonstrates the promising value of our proposed approach for disease diagnosis.
Subjects
Gene Regulatory Networks , Genetic Transcription , Thoracic Aortic Aneurysm/diagnosis , Thoracic Aortic Aneurysm/genetics , Bayes Theorem , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Systems Biology
ABSTRACT
PURPOSE OF REVIEW: To summarize the available evidence on cooccurring gastrointestinal toxicities and their potential link with other symptoms in cancer patients. The information obtained from colorectal cancer patient cohorts will be used as an example. RECENT FINDINGS: In recent years, it has become clear that gastrointestinal toxicities do not occur in isolation in cancer patients. Rather, they may link or associate with many other disturbances. Data have emerged that suggest that many of the complications of cancer chemotherapy occur in clusters and seem to support the sharing of common pathogenesis for clustering toxicities. SUMMARY: During the last few years, research in symptom clusters and cooccurring linked toxicities has markedly changed, progressively shifting from a simplistic descriptive picture to more comprehensive and pathogenetically driven analyses. Still, many questions remain to be answered, and whether and how toxicity aggregations vary during the treatment course remains to be elucidated.
Subjects
Antineoplastic Agents/adverse effects , Colorectal Neoplasms/drug therapy , Bayes Theorem , Humans , Markov Chains
ABSTRACT
OBJECTIVE: Identify clinical factors that modulate the risk of progression to COPD among asthma patients using data extracted from electronic medical records. DESIGN: Demographic information and comorbidities from adult asthma patients who were observed for at least 5 years with initial observation dates between 1988 and 1998, were extracted from electronic medical records of the Partners Healthcare System using tools of the National Center for Biomedical Computing "Informatics for Integrating Biology to the Bedside" (i2b2). MEASUREMENTS: A predictive model of COPD was constructed from a set of 9,349 patients (843 cases, 8,506 controls) using Bayesian networks. The model's predictive accuracy was tested using it to predict COPD in a future independent set of asthma patients (992 patients; 46 cases, 946 controls), who had initial observation dates between 1999 and 2002. RESULTS: A Bayesian network model composed of age, sex, race, smoking history, and 8 comorbidity variables is able to predict COPD in the independent set of patients with an accuracy of 83.3%, computed as the area under the Receiver Operating Characteristic curve (AUROC). CONCLUSIONS: Our results demonstrate that data extracted from electronic medical records can be used to create predictive models. With improvements in data extraction and inclusion of more variables, such models may prove to be clinically useful.
Subjects
Asthma/complications , Bayes Theorem , Computerized Medical Records Systems , Computer Neural Networks , Chronic Obstructive Pulmonary Disease/etiology , Aged , Comorbidity , Disease Progression , Female , Humans , Male , Middle Aged , Biological Models , Multivariate Analysis , Natural Language Processing , ROC Curve , Risk Factors
ABSTRACT
Individuals' dependence on nicotine, primarily through cigarette smoking, is a major source of morbidity and mortality worldwide. Many smokers attempt but fail to quit smoking, motivating researchers to identify the origins of this dependence. Because of the known heritability of nicotine-dependence phenotypes, considerable interest has been focused on discovering the genetic factors underpinning the trait. This goal, however, is not easily attained: no single factor is likely to explain any great proportion of dependence because nicotine dependence is thought to be a complex trait (i.e., the result of many interacting factors). Genomewide association studies are powerful tools in the search for the genomic bases of complex traits, and in this context, novel candidate genes have been identified through single nucleotide polymorphism (SNP) association analyses. Beyond association, however, genetic data can be used to generate predictive models of nicotine dependence. As expected in the context of a complex trait, individual SNPs fail to accurately predict nicotine dependence, demanding the use of multivariate models. Standard approaches, such as logistic regression, are unable to consider large numbers of SNPs given existing sample sizes. However, using Bayesian networks, one can overcome these limitations to generate a multivariate predictive model, which has markedly enhanced predictive accuracy on fitted values relative to that of individual SNPs. This approach, combined with the data being generated by genomewide association studies, promises to shed new light on the common, complex trait nicotine dependence.
Subjects
Biological Models , Tobacco Use Disorder/diagnosis , Animals , Bayes Theorem , Genetic Association Studies , Humans , Prognosis , Tobacco Use Disorder/genetics , Tobacco Use Disorder/physiopathology
ABSTRACT
Although the measurement of fetal proteins in maternal serum is part of standard prenatal screening for aneuploidy and neural tube defects, attempts to better understand the extent of feto-maternal protein trafficking and its clinical and biological significance have been hindered by the presence of abundant maternal proteins. The objective of this study was to circumvent maternal protein interference by using a computational predictive approach for the development of a noninvasive, comprehensive, protein network analysis of the developing fetus in maternal whole blood. From a set of 157 previously identified fetal gene transcripts, 46 were classified into known protein networks, and 222 downstream proteins were predicted. Statistically significantly over-represented pathways were diverse and included T-cell biology, neurodevelopment and cancer biology. Western blot analyses validated the computational predictive model and confirmed the presence of specific downstream fetal proteins in the whole blood of pregnant women and their newborns, with absence or reduced detection of the protein in the maternal postpartum samples. This work demonstrates that extensive feto-maternal protein trafficking occurs during pregnancy, and can be predicted and verified to develop novel noninvasive biomarkers. This study raises important questions regarding the biological effects of fetal proteins on the pregnant woman.
ABSTRACT
The increasing availability of electronic medical records offers opportunities to better characterize patient populations and create predictive tools to individualize health care. We determined which asthma patients suffer exacerbations using data extracted from electronic medical records of the Partners Healthcare System using Natural Language Processing tools from the "Informatics for Integrating Biology to the Bedside" center (i2b2). Univariable and multivariable analysis of data for 11,356 patients (1,394 cases, 9,962 controls) found that race, BMI, smoking history, and age at initial observation are predictors of asthma exacerbations. The area under the receiver operating characteristic curve (AUROC) corresponding to prediction of exacerbations in an independent group of 1,436 asthma patients (106 cases, 1,330 controls) is 0.67. Our findings are consistent with previous characterizations of asthma patients in epidemiological studies, and demonstrate that data extracted by natural language processing from electronic medical records is suitable for the characterization of patient populations.
Subjects
Artificial Intelligence , Asthma/diagnosis , Asthma/epidemiology , Computerized Medical Records Systems/statistics & numerical data , Natural Language Processing , Automated Pattern Recognition/methods , Boston/epidemiology , Chronic Disease , Humans , Incidence , Risk Assessment/methods , Risk Factors
ABSTRACT
BACKGROUND: Colorectal cancer patients undergoing chemotherapy (CT) are likely to experience multiple concurrent toxicities that, rather than appearing singularly, may be associated with one another. Graphic and tabular representations of distance matrices were used to identify associations between toxicities and to define the strengths of these relations. METHODS: Using a standardized data collection tool, electronic medical charts of 300 consecutive patients receiving either the combination of leucovorin, 5-fluorouracil (5-FU), and oxaliplatin (FOLFOX); the combination of leucovorin, 5-FU, and irinotecan (FOLFIRI); or 5-FU were retrospectively reviewed to record baseline demographic and clinical information. Treatment-related toxicities were recorded using National Cancer Institute Common Toxicity Criteria during the first cycle of CT. Using a distance matrix approach, an analysis of CT-induced toxicity associations was elaborated. RESULTS: The graphic analysis, in which associations between toxicities were represented as links, identified 6 major hubs (fever, dehydration, fatigue, anorexia, pain, and weight loss), defined as central nodes with more connections than expected by chance. These were highly linked with minor nodes and provided evidence suggesting the existence of symptom clusters associated with CT-induced toxicities. CONCLUSIONS: The application of distance matrix analyses to define CT-induced toxicity associations is new. The technique was effective in defining the global landscape of the binary relations among toxicities associated with Cycle 1 therapy. The coherent clinical picture emerging from the network provides a strong suggestion that the toxicities in each cluster share a common pathobiologic basis, which may provide an opportunity for intervention. These findings could become useful for the early prediction of co-occurring toxicities and, in the future, as a phenotyping framework for the pharmacogenomic analysis of individual responses to chemotherapy.
Subjects
Antineoplastic Agents/adverse effects , Bayes Theorem , Colorectal Neoplasms/drug therapy , Adolescent , Adult , Aged , Female , Humans , Male , Middle Aged
ABSTRACT
OBJECTIVE: To evaluate the economic impact of a Bayesian network model designed to predict clinical success of a new chemical entity (NCE) based on pre-phase III data. METHODS: We trained our Bayesian network model on publicly accessible data on 503 NCEs, stratified by therapeutic class. We evaluated the sensitivity, specificity and accuracy of our model on an independent data set of 18 NCE-indication pairs, using prior probability data for the antineoplastic NCEs within the training set. We performed Monte Carlo simulations to evaluate the economic performance of our model relative to reported pharmaceutical industry performance, taking into account reported capitalized phase costs, cumulative revenues for a postapproval period of 7 years, and the range of possible false negative and true negative rates for terminated NCEs within the pharmaceutical industry. RESULTS: Our model predicted outcomes on the independent validation set of oncology agents with 78% accuracy (80% sensitivity and 76% specificity). In comparison with the pharmaceutical industry's reported success rates, on average our model significantly reduced capitalized expenditures from $727 million/successful NCE to $444 million/successful NCE (P < 0.001), and significantly improved revenues from $347 million/phase III trial to $507 million/phase III trial (P < 0.001) during the first 7 years post launch. These results indicate that our model identified successful NCEs more efficiently than currently reported pharmaceutical industry performances. CONCLUSIONS: Accurate prediction of NCE outcomes is computationally feasible, significantly increasing the proportion of successful NCEs, and likely eliminating ineffective and unsafe NCEs.
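For readers less familiar with the three performance measures reported in this abstract, the sketch below shows how sensitivity, specificity, and accuracy follow from a confusion matrix. The counts are hypothetical and are not the study's validation results; the function name is an assumption.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp / fn: successful NCEs predicted as success / as failure.
    tn / fp: failed NCEs predicted as failure / as success.
    """
    sensitivity = tp / (tp + fn)   # share of true successes that were detected
    specificity = tn / (tn + fp)   # share of true failures that were detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, acc = confusion_metrics(tp=40, fn=10, tn=30, fp=20)
print(sens, spec, acc)   # 0.8 0.6 0.7
```

Accuracy depends on the mix of successes and failures in the test set, whereas sensitivity and specificity do not, which is why all three are usually reported together.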
Subjects
Antineoplastic Agents/economics , Bayes Theorem , Pharmaceutical Economics , Antineoplastic Agents/pharmacology , Phase III Clinical Trials as Topic , Drug Industry , Forecasting , Humans , Biological Models , Sensitivity and Specificity
ABSTRACT
Biological and medical data have been growing exponentially over the past several years [1, 2]. In particular, proteomics has seen automation dramatically change the rate at which data are generated [3]. Analysis that systemically incorporates prior information is becoming essential to making inferences about the myriad, complex data [4-6]. A Bayesian approach can help capture such information and incorporate it seamlessly through a rigorous, probabilistic framework. This paper starts with a review of the background mathematics behind the Bayesian methodology: from parameter estimation to Bayesian networks. The article then goes on to discuss how emerging Bayesian approaches have already been successfully applied to research across proteomics, a field for which Bayesian methods are particularly well suited [7-9]. After reviewing the literature on the subject of Bayesian methods in biological contexts, the article discusses some of the recent applications in proteomics and emerging directions in the field.
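As an illustration of the Bayesian parameter estimation this review begins with (the example is not taken from the article), the simplest case is the conjugate Beta-Binomial update: a Beta(alpha, beta) prior on a probability combined with observed successes and failures gives a Beta posterior. The function names and numbers below are hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior, the posterior is
    Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical example: uniform prior Beta(1, 1), then 7 successes in 10 trials.
post_alpha, post_beta = beta_binomial_update(1, 1, successes=7, failures=3)
print(post_alpha, post_beta)             # 8 4
print(beta_mean(post_alpha, post_beta))  # 8/12, about 0.667
```

The posterior mean (about 0.667) sits between the prior mean (0.5) and the observed frequency (0.7), showing how prior information is combined with data.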
Subjects
Bayes Theorem , Proteomics , Molecular Models , Peptides/chemistry , Phylogeny , Signal Transduction
ABSTRACT
Myelodysplastic syndromes (MDS) are among the most frequent hematologic malignancies. Patients have a short survival and often progress to acute myeloid leukemia. The diagnosis of MDS can be difficult; there is a paucity of molecular markers, and the pathophysiology is largely unknown. Therefore, we conducted a multicenter study investigating whether serum proteome profiling may serve as a noninvasive platform to discover novel molecular markers for MDS. We generated serum proteome profiles from 218 individuals by MS and identified a profile that distinguishes MDS from non-MDS cytopenias in a learning sample set. This profile was validated by testing its ability to predict MDS in a first independent validation set and a second, prospectively collected, independent validation set run 5 months apart. Accuracy was 80.5% in the first and 79.0% in the second validation set. Peptide mass fingerprinting and quadrupole TOF MS identified two differential proteins: CXC chemokine ligands 4 (CXCL4) and 7 (CXCL7), both of which had significantly decreased serum levels in MDS, as confirmed with independent antibody assays. Western blot analyses of platelet lysates for these two platelet-derived molecules revealed a lack of CXCL4 and CXCL7 in MDS. Subtype analyses revealed that these two proteins have decreased serum levels in advanced MDS, suggesting the possibility of a concerted disturbance of transcription or translation of these chemokines in advanced MDS.
Subjects
Biomarkers/metabolism , Blood Proteins/chemistry , CXC Chemokines/metabolism , Myelodysplastic Syndromes/blood , Proteome , Humans , Mass Spectrometry
ABSTRACT
Epidermal melanocytes execute specific physiological programs in response to UV radiation (UVR) at the cutaneous interface. Many melanocytic responses, including increased dendrite formation, enhanced melanogenesis/melanization, and cell cycle arrest, impact the ability of melanocytes to survive and to attenuate the UVR insult. Although some of the molecules that underlie these UVR programs are known, a coherent view of UVR-induced transcriptional changes is lacking. Using primary melanocyte cultures, we assessed UVR-mediated alterations in over 47,000 transcripts using Affymetrix Human Genome U133 Plus 2.0 microarrays. From the 100 most statistically robust changes in transcript level, there were 84 genes that were suppressed >2.0-fold by UVR; the identities of 48 of these genes were known. Similarly, there were 99 genes that were induced >2.0-fold by UVR; the identities of 57 of these genes were known. We then subjected these top 100 changes to the Ingenuity Pathway analysis program and identified a group of p53 targets including the cell cycle regulator CDKN1A (p21CIP), the WNT pathway regulator DKK1 (dickkopf homolog 1), the receptor tyrosine kinase EPHA2, growth factor GDF15, ferredoxin reductase (FDXR), p53-inducible protein TP53I3, transcription factor ATF3, DNA repair enzyme DDB2, and the beta-adrenergic receptor ADRB2. These genes were also found to be consistently elevated by UVR in six independent melanocyte lines, although there were interindividual variations in magnitude. WWOX, whose protein product interacts with and regulates p53 and p73, was found to be consistently suppressed by UVR. There was also a subgroup of neurite/axonal developmental genes that were altered in response to UVR, suggesting that melanocytic and neuronal arborization may share similar mechanisms. When compared to melanomas, the basal levels of many of these p53-responsive genes were greatly dysregulated. Three genes (CDKN1A, DDB2, and ADRB2) exhibited a trend towards loss of expression in melanomas, thereby raising the possibility of a linked role in tumorigenesis. These expression data provide a global view of UVR-induced changes in melanocytes and, more importantly, generate novel hypotheses regarding melanocyte physiology.