1.
J Rheumatol ; 51(8): 781-789, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-38879192

ABSTRACT

OBJECTIVE: Psoriatic disease remains underdiagnosed and undertreated. We developed and validated a suite of novel, sensor-based smartphone assessments (Psorcast app) that can be self-administered to measure cutaneous and musculoskeletal signs and symptoms of psoriatic disease. METHODS: Participants with psoriasis (PsO) or psoriatic arthritis (PsA) and healthy controls were recruited between June 5, 2019, and November 10, 2021, at 2 academic medical centers. Concordance and accuracy of digital measures and image-based machine learning models were compared to their analogous clinical measures from trained rheumatologists and dermatologists. RESULTS: Of 104 study participants, 51 (49%) were female and 53 (51%) were male, with a mean age of 42.3 years (SD 12.6). Seventy-nine (76%) participants had PsA, 16 (15.4%) had PsO, and 9 (8.7%) were healthy controls. Digital patient assessment of percent body surface area (BSA) affected with PsO demonstrated very strong concordance (Lin concordance correlation coefficient [CCC] 0.94 [95% CI 0.91-0.96]) with physician-assessed BSA. The in-clinic and remote target plaque physician global assessments showed fair-to-moderate concordance (erythema CCC 0.72 [0.59-0.85]; induration CCC 0.72 [0.62-0.82]; scaling CCC 0.60 [0.48-0.72]). Machine learning models of hand photos taken by patients identified clinically diagnosed nail PsO with an accuracy of 0.76. The Digital Jar Open assessment categorized physician-assessed upper extremity involvement, considering joint tenderness or enthesitis (AUROC 0.68 [0.47-0.85]). CONCLUSION: The Psorcast digital assessments achieved significant clinical validity, although they require further validation in larger cohorts before use in evidence-based medicine or clinical trial settings. The smartphone software and analysis pipelines from the Psorcast suite are open source and freely available.
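The abstract above reports agreement as Lin's concordance correlation coefficient. As a reading aid, here is a minimal sketch of the standard point estimate, assuming paired measurements in two equal-length lists; the function name `lin_ccc` and the toy values are illustrative and do not come from the study or the Psorcast pipelines, and confidence intervals are not computed.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Point estimate of Lin's concordance correlation coefficient.

    Uses population (1/n) variances and covariance, as in Lin's definition:
    CCC = 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    s_xx = sum((a - mx) ** 2 for a in x) / n
    s_yy = sum((b - my) ** 2 for b in y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    denom = s_xx + s_yy + (mx - my) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both series are constant and equal")
    return 2 * s_xy / denom


# Hypothetical example: a constant offset of 1 lowers concordance
# even though the Pearson correlation of the two series is perfect.
physician = [1.0, 2.0, 3.0]
patient = [2.0, 3.0, 4.0]
print(round(lin_ccc(physician, patient), 4))
```

Unlike a plain correlation, the extra `(mx - my) ** 2` term in the denominator penalizes systematic bias between the two raters, which is why CCC suits comparisons of patient-reported against physician-assessed values.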


Subjects
Arthritis, Psoriatic , Machine Learning , Psoriasis , Smartphone , Humans , Arthritis, Psoriatic/diagnosis , Female , Male , Psoriasis/diagnosis , Adult , Middle Aged , Proof of Concept Study , Mobile Applications , Reproducibility of Results
2.
PLoS Genet ; 12(12): e1006466, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27935966

ABSTRACT

Human genome-wide association studies (GWAS) have shown that genetic variation at >130 gene loci is associated with type 2 diabetes (T2D). We asked if the expression of the candidate T2D-associated genes within these loci is regulated by a common locus in pancreatic islets. Using an obese F2 mouse intercross segregating for T2D, we show that the expression of ~40% of the T2D-associated genes is linked to a broad region on mouse chromosome (Chr) 2. As all but 9 of these genes are not physically located on Chr 2, linkage to Chr 2 suggests a genomic factor(s) located on Chr 2 regulates their expression in trans. The transcription factor Nfatc2 is physically located on Chr 2 and its expression demonstrates cis linkage; i.e., its expression maps to itself. When conditioned on the expression of Nfatc2, linkage for the T2D-associated genes was greatly diminished, supporting Nfatc2 as a driver of their expression. Plasma insulin also showed linkage to the same broad region on Chr 2. Overexpression of a constitutively active (ca) form of Nfatc2 induced β-cell proliferation in mouse and human islets, and transcriptionally regulated more than half of the T2D-associated genes. Overexpression of either ca-Nfatc2 or ca-Nfatc1 in mouse islets enhanced insulin secretion, whereas only ca-Nfatc2 was able to promote β-cell proliferation, suggesting distinct molecular pathways mediating insulin secretion vs. β-cell proliferation are regulated by NFAT. Our results suggest that many of the T2D-associated genes are downstream transcriptional targets of NFAT, and may act coordinately in a pathway through which NFAT regulates β-cell proliferation in both mouse and human islets.


Subjects
Diabetes Mellitus, Type 2/genetics , Insulin/genetics , NFATC Transcription Factors/genetics , Animals , Cell Proliferation/genetics , Chromosome Mapping , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Gene Expression Regulation , Genetic Linkage , Genome , Genome-Wide Association Study , Humans , Insulin-Secreting Cells/metabolism , Insulin-Secreting Cells/pathology , Islets of Langerhans/metabolism , Islets of Langerhans/pathology , Mice , Mice, Obese , NFATC Transcription Factors/biosynthesis , Promoter Regions, Genetic
3.
Lancet Oncol ; 18(1): 132-142, 2017 01.
Article in English | MEDLINE | ID: mdl-27864015

ABSTRACT

BACKGROUND: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease. METHODS: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone. FINDINGS: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0·791; Bayes factor >5) and surpassed the reference model (iAUC 0·743; Bayes factor >20). Both the ePCR and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3·32, 95% CI 2·39-4·62, p<0·0001; reference model: 2·56, 1·85-3·53, p<0·0001). The new model was validated further on the ENTHUSE M1 cohort with similarly high performance (iAUC 0·768). Meta-analysis across all methods confirmed previously identified predictive clinical variables and revealed aspartate aminotransferase as an important, albeit previously under-reported, prognostic biomarker. INTERPRETATION: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. FUNDING: Sanofi US Services, Project Data Sphere.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Models, Statistical , Nomograms , Prostatic Neoplasms, Castration-Resistant/mortality , Adolescent , Adult , Aged , Bayes Theorem , Crowdsourcing , Docetaxel , Follow-Up Studies , Humans , Male , Middle Aged , Neoplasm Staging , Prednisone/administration & dosage , Prognosis , Prostatic Neoplasms, Castration-Resistant/drug therapy , Prostatic Neoplasms, Castration-Resistant/secondary , Survival Rate , Taxoids/administration & dosage , Young Adult
4.
Article in English | MEDLINE | ID: mdl-36103435

ABSTRACT

We propose a counterfactual approach to train "causality-aware" predictive models that are able to leverage causal information in static anticausal machine learning tasks (i.e., prediction tasks where the outcome influences the inputs). In applications plagued by confounding, the approach can be used to generate predictions that are free from the influence of observed confounders. In applications involving observed mediators, the approach can be used to generate predictions that only capture the direct or the indirect causal influences. Mechanistically, we train supervised learners on (counterfactually) simulated inputs that retain only the associations generated by the causal relations of interest. We focus on linear models, where analytical results connecting covariances, causal effects, and prediction mean square errors are readily available. Quite importantly, we show that our approach does not require knowledge of the full causal graph. It suffices to know which variables represent potential confounders and/or mediators. We investigate the stability of the method with respect to dataset shifts generated by selection biases and also relax the linearity assumption by extending the approach to additive models better able to account for nonlinearities in the data. We validate our approach in a series of synthetic data experiments and illustrate its application to a real dataset.

5.
PLoS Genet ; 4(3): e1000034, 2008 Mar 14.
Article in English | MEDLINE | ID: mdl-18369453

ABSTRACT

Although numerous quantitative trait loci (QTL) influencing disease-related phenotypes have been detected through gene mapping and positional cloning, identification of the individual gene(s) and molecular pathways leading to those phenotypes is often elusive. One way to improve understanding of genetic architecture is to classify phenotypes in greater depth by including transcriptional and metabolic profiling. In the current study, we have generated and analyzed mRNA expression and metabolic profiles in liver samples obtained in an F2 intercross between the diabetes-resistant C57BL/6 leptin(ob/ob) and the diabetes-susceptible BTBR leptin(ob/ob) mouse strains. This cross, which segregates for genotype and physiological traits, was previously used to identify several diabetes-related QTL. Our current investigation includes microarray analysis of over 40,000 probe sets, plus quantitative mass spectrometry-based measurements of sixty-seven intermediary metabolites in three different classes (amino acids, organic acids, and acyl-carnitines). We show that liver metabolites map to distinct genetic regions, thereby indicating that tissue metabolites are heritable. We also demonstrate that genomic analysis can be integrated with liver mRNA expression and metabolite profiling data to construct causal networks for control of specific metabolic processes in liver. As a proof of principle of the practical significance of this integrative approach, we illustrate the construction of a specific causal network that links gene expression and metabolic changes in the context of glutamate metabolism, and demonstrate its validity by showing that genes in the network respond to changes in glutamine and glutamate availability. Thus, the methods described here have the potential to reveal regulatory networks that contribute to chronic, complex, and highly prevalent diseases and conditions such as obesity and diabetes.


Subjects
Liver/metabolism , Animals , Crosses, Genetic , Female , Gene Expression Profiling , Hepatocytes/metabolism , Leptin/genetics , Male , Metabolic Networks and Pathways , Mice , Mice, Inbred C57BL , Mice, Mutant Strains , Models, Genetic , Phenotype , Quantitative Trait Loci , RNA, Messenger/genetics , RNA, Messenger/metabolism
6.
NPJ Digit Med ; 3: 21, 2020.
Article in English | MEDLINE | ID: mdl-32128451

ABSTRACT

Digital technologies such as smartphones are transforming the way scientists conduct biomedical research. Several remotely conducted studies have recruited thousands of participants over a span of a few months, allowing researchers to collect real-world data at scale and at a fraction of the cost of traditional research. Unfortunately, remote studies have been hampered by substantial participant attrition, calling into question the representativeness of the collected data, including the generalizability of outcomes. We report the findings regarding recruitment and retention from eight remote digital health studies conducted between 2014 and 2019 that provided individual-level study-app usage data from more than 100,000 participants completing nearly 3.5 million remote health evaluations over cumulative participation of 850,000 days. Median participant retention across eight studies varied widely from 2 to 26 days (median across all studies = 5.5 days). Survival analysis revealed several factors significantly associated with an increase in participant retention time, including (i) referral by a clinician to the study (increase of 40 days in median retention time); (ii) compensation for participation (increase of 22 days, 1 study); (iii) having the clinical condition of interest in the study (increase of 7 days compared with controls); and (iv) older age (increase of 4 days). Additionally, four distinct patterns of daily app usage behavior were identified by unsupervised clustering, which were also associated with participant demographics. Most studies were not able to recruit a sample that was representative of the race/ethnicity or geographical diversity of the US. Together these findings can help inform recruitment and retention strategies to enable equitable participation of populations in future digital health research.

7.
Front Cardiovasc Med ; 7: 120, 2020.
Article in English | MEDLINE | ID: mdl-32850982

ABSTRACT

There are many approaches to maintaining wellness, ranging from taking a simple vacation to attending highly structured wellness retreats, which typically regulate the attendee's personal time and activities. In a healthy English-speaking cohort of 112 women and men (aged 30-80 years), this study examined the effects of participating in either a 6-day intensive wellness retreat based on Ayurvedic medicine principles or an unstructured 6-day vacation at the same wellness center setting. Heart rate variability (HRV) was monitored continuously using a wearable ECG sensor patch for up to 7 days prior to, during, and 1 month following participation in the interventions. Additionally, salivary cortisol levels were assessed for all participants at multiple times during the day. Continual HRV monitoring data in the real-world setting was seen to be associated with demographic [HRV_ALF: β_Age = 0.98 (95% CI = 0.96-0.98), false discovery rate (FDR) < 0.001] and physiological characteristics [HRV_PLF: β = 0.98 (95% CI = 0.98-1), FDR = 0.005] of participants. HRV features were also able to quantify known diurnal variations [HRV_LF/HF: β_ACT:night vs. early-morning = 2.69 (SE = 1.26), FDR < 0.001] along with notable inter- and intraperson heterogeneity in response to intervention. A statistically significant increase in HRV_ALF [β = 1.48 (SE = 1.1), FDR < 0.001] was observed for all participants during the resort visit. Personalized HRV analysis at an individual level showed a distinct individualized response to intervention, further supporting the utility of using continuous real-world tracking of HRV at an individual level to objectively measure responses to potentially stressful or relaxing settings.

8.
Mamm Genome ; 20(8): 476-85, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19727952

ABSTRACT

Type 2 diabetes results from severe insulin resistance coupled with a failure of β cells to compensate by secreting sufficient insulin. Multiple genetic loci are involved in the development of diabetes, although the effect of each gene on diabetes susceptibility is thought to be small. MicroRNAs (miRNAs) are noncoding 19-22-nucleotide RNA molecules that potentially regulate the expression of thousands of genes. To understand the relationship between miRNA regulation and obesity-induced diabetes, we quantitatively profiled approximately 220 miRNAs in pancreatic islets, adipose tissue, and liver from diabetes-resistant (B6) and diabetes-susceptible (BTBR) mice. More than half of the miRNAs profiled were expressed in all three tissues, with many miRNAs in each tissue showing significant changes in response to genetic obesity. Furthermore, several miRNAs in each tissue were differentially responsive to obesity in B6 versus BTBR mice, suggesting that they may be involved in the pathogenesis of diabetes. In liver there were approximately 40 miRNAs that were downregulated in response to obesity in B6 but not BTBR mice, indicating that genetic differences between the mouse strains play a critical role in miRNA regulation. In order to elucidate the genetic architecture of hepatic miRNA expression, we measured the expression of miRNAs in genetically obese F2 mice. Approximately 10% of the miRNAs measured showed significant linkage (miR-eQTLs), identifying loci that control miRNA abundance. Understanding the influence that obesity and genetics exert on the regulation of miRNA expression will reveal the role miRNAs play in the context of obesity-induced type 2 diabetes.


Subjects
Adipose Tissue/metabolism , Gene Expression Regulation , Islets of Langerhans/metabolism , Liver/metabolism , MicroRNAs/genetics , Obesity/genetics , Animals , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Disease Models, Animal , Female , Gene Dosage , Gene Expression Profiling , Humans , Male , Mice , Mice, Obese , MicroRNAs/metabolism , Obesity/metabolism
9.
Nat Commun ; 10(1): 2674, 2019 06 17.
Article in English | MEDLINE | ID: mdl-31209238

ABSTRACT

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. In total, 160 teams participated, providing comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationales for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition, in contrast to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.


Subjects
Antineoplastic Combined Chemotherapy Protocols/pharmacology , Computational Biology/methods , Neoplasms/drug therapy , Pharmacogenetics/methods , ADAM17 Protein/antagonists & inhibitors , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Benchmarking , Biomarkers, Tumor/genetics , Cell Line, Tumor , Computational Biology/standards , Datasets as Topic , Drug Antagonism , Drug Resistance, Neoplasm/drug effects , Drug Resistance, Neoplasm/genetics , Drug Synergism , Genomics/methods , Humans , Molecular Targeted Therapy/methods , Mutation , Neoplasms/genetics , Pharmacogenetics/standards , Phosphatidylinositol 3-Kinases/genetics , Phosphoinositide-3 Kinase Inhibitors , Treatment Outcome
10.
JCO Clin Cancer Inform ; 1: 1-15, 2017 11.
Article in English | MEDLINE | ID: mdl-30657384

ABSTRACT

PURPOSE: Docetaxel has a demonstrated survival benefit for patients with metastatic castration-resistant prostate cancer (mCRPC); however, 10% to 20% of patients discontinue docetaxel prematurely because of toxicity-induced adverse events, and the management of risk factors for toxicity remains a challenge. PATIENTS AND METHODS: The comparator arms of four phase III clinical trials in first-line mCRPC were collected, annotated, and compiled, with a total of 2,070 patients. Early discontinuation was defined as treatment stoppage within 3 months as a result of adverse treatment effects; 10% of patients discontinued treatment. We designed an open-data, crowd-sourced DREAM Challenge for developing models with which to predict early discontinuation of docetaxel treatment. Clinical features for all four trials and outcomes for three of the four trials were made publicly available, with the outcomes of the fourth trial held back for unbiased model evaluation. Challenge participants from around the world trained models and submitted their predictions. Area under the precision-recall curve was the primary metric used for performance assessment. RESULTS: In total, 34 separate teams submitted predictions. Seven models with statistically similar area under precision-recall curves (Bayes factor ≤ 3) outperformed all other models. A postchallenge analysis of risk prediction using these seven models revealed three patient subgroups: high risk, low risk, or discordant risk. Early discontinuation events were two times higher in the high-risk subgroup compared with the low-risk subgroup. Simulation studies demonstrated that use of patient discontinuation prediction models could reduce patient enrollment in clinical trials without the loss of statistical power. CONCLUSION: This work represents a successful collaboration between 34 international teams that leveraged open clinical trial data. Our results demonstrate that routinely collected clinical features can be used to identify patients with mCRPC who are likely to discontinue treatment because of adverse events, and they establish a robust benchmark with implications for clinical trial design.
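The abstract above names area under the precision-recall curve as the primary metric but does not say how it was computed. One common estimator is average precision, sketched below under that assumption; the function name `average_precision` and the toy labels and scores are hypothetical and are not taken from the challenge.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    labels: 1 for an event (e.g. early discontinuation), 0 otherwise.
    scores: predicted risk, higher means more likely to be an event.
    Ties in scores are broken by input order (stable sort), which is a
    simplification of this sketch.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical toy data: two events among four patients.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(round(average_precision(labels, scores), 4))
```

Precision-recall metrics are often preferred over ROC-based ones when events are rare, as here, where about 10% of patients discontinued; a random ranking would be expected to score roughly the event prevalence.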


Subjects
Antineoplastic Agents/therapeutic use , Docetaxel/therapeutic use , Models, Theoretical , Prostatic Neoplasms, Castration-Resistant/drug therapy , Prostatic Neoplasms, Castration-Resistant/pathology , Adolescent , Adult , Aged , Aged, 80 and over , Antineoplastic Agents/administration & dosage , Antineoplastic Combined Chemotherapy Protocols/adverse effects , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Clinical Trials as Topic , Docetaxel/administration & dosage , Humans , Male , Meta-Analysis as Topic , Middle Aged , Prednisone , Prognosis , Prostatic Neoplasms, Castration-Resistant/mortality , Time Factors , Treatment Outcome , Young Adult
11.
Sci Data ; 3: 160011, 2016 Mar 03.
Article in English | MEDLINE | ID: mdl-26938265

ABSTRACT

Current measures of health and disease are often insensitive, episodic, and subjective. Further, these measures generally are not designed to provide meaningful feedback to individuals. The impact of high-resolution activity data collected from mobile phones is only beginning to be explored. Here we present data from mPower, a clinical observational study about Parkinson disease conducted purely through an iPhone app interface. The study interrogated aspects of this movement disorder through surveys and frequent sensor-based recordings from participants with and without Parkinson disease. Benefitting from large enrollment and repeated measurements on many individuals, these data may help establish baseline variability of real-world activity measurement collected via mobile phones, and ultimately may lead to quantification of the ebbs-and-flows of Parkinson symptoms. App source code for these data collection modules is available through an open source license for use in studies of other conditions. We hope that releasing data contributed by engaged research participants will seed a new community of analysts working collaboratively on understanding mobile health data to advance human health.


Subjects
Data Collection , Datasets as Topic , Parkinson Disease , Cell Phone , Humans , Telemedicine
12.
Nat Commun ; 7: 12460, 2016 08 23.
Article in English | MEDLINE | ID: mdl-27549343

ABSTRACT

Rheumatoid arthritis (RA) affects millions worldwide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in approximately one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by 73 research groups using the most comprehensive available data and covering a wide range of state-of-the-art modelling methodologies. Despite a significant genetic heritability estimate for the treatment non-response trait (h² = 0.18, P = 0.02), no significant genetic contribution to prediction accuracy is observed. Results formally confirm the expectations of the rheumatology community that SNP information does not significantly improve predictive performance relative to standard clinical traits, thereby justifying a refocusing of future efforts on collection of other data.


Subjects
Antibodies, Monoclonal, Humanized/therapeutic use , Arthritis, Rheumatoid/drug therapy , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide , Tumor Necrosis Factor-alpha/antagonists & inhibitors , Adult , Aged , Antibodies, Monoclonal/therapeutic use , Antirheumatic Agents/therapeutic use , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/pathology , Certolizumab Pegol/therapeutic use , Cohort Studies , Crowdsourcing , Female , Humans , Male , Middle Aged , Prognosis , Treatment Outcome , Tumor Necrosis Factor-alpha/immunology
13.
Pac Symp Biocomput ; : 27-38, 2014.
Article in English | MEDLINE | ID: mdl-24297531

ABSTRACT

Computational efficiency is important for learning algorithms operating in the "large p, small n" setting. In computational biology, the analysis of data sets containing tens of thousands of features ("large p"), but only a few hundred samples ("small n"), is nowadays routine, and regularized regression approaches such as ridge-regression, lasso, and elastic-net are popular choices. In this paper we propose a novel and highly efficient Bayesian inference method for fitting ridge-regression. Our method is fully analytical, and bypasses the need for expensive tuning parameter optimization, via cross-validation, by employing Bayesian model averaging over the grid of tuning parameters. Additional computational efficiency is achieved by adopting the singular value decomposition reparametrization of the ridge-regression model, replacing computationally expensive inversions of large p × p matrices by efficient inversions of small and diagonal n × n matrices. We show in simulation studies and in the analysis of two large cancer cell line data panels that our algorithm achieves slightly better predictive performance than cross-validated ridge-regression while requiring only a fraction of the computation time. Furthermore, in comparisons based on the cell line data sets, our algorithm systematically out-performs the lasso in both predictive performance and computation time, and shows equivalent predictive performance, but considerably smaller computation time, than the elastic-net.
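The singular value decomposition reparametrization mentioned in the abstract above rests on a standard linear-algebra identity, written out here as a reading aid. The notation is an assumption of this note, not the paper's: X is the n × p feature matrix with n ≤ p and thin SVD X = U D Vᵀ, y is the response vector, and λ > 0 is the ridge tuning parameter. The Bayesian model averaging over the grid of λ values is not shown.

```latex
\begin{align}
X &= U D V^{\top}, \qquad
U \in \mathbb{R}^{n \times n},\quad
D = \operatorname{diag}(d_1, \dots, d_n),\quad
V \in \mathbb{R}^{p \times n}, \\
\hat{\beta}_{\lambda}
  &= \left(X^{\top} X + \lambda I_p\right)^{-1} X^{\top} y
   = V \left(D^{2} + \lambda I_n\right)^{-1} D \, U^{\top} y .
\end{align}
```

Because D² + λIₙ is diagonal, its inverse costs only n scalar divisions, and the decomposition is computed once and reused for every λ on the grid.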


Subjects
Algorithms , Pharmacogenetics/statistics & numerical data , Antineoplastic Agents/pharmacology , Artificial Intelligence , Bayes Theorem , Cell Line, Tumor , Computational Biology , Drug Resistance, Neoplasm/genetics , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Regression Analysis
14.
Pac Symp Biocomput ; : 63-74, 2014.
Article in English | MEDLINE | ID: mdl-24297534

ABSTRACT

Large-scale pharmacogenomic screens of cancer cell lines have emerged as an attractive pre-clinical system for identifying tumor genetic subtypes with selective sensitivity to targeted therapeutic strategies. Application of modern machine learning approaches to pharmacogenomic datasets has demonstrated the ability to infer genomic predictors of compound sensitivity. Such modeling approaches entail many analytical design choices; however, a systematic study evaluating the relative performance attributable to each design choice is not yet available. In this work, we evaluated over 110,000 different models, based on a multifactorial experimental design testing systematic combinations of modeling factors within several categories of modeling choices, including: type of algorithm, type of molecular feature data, compound being predicted, method of summarizing compound sensitivity values, and whether predictions are based on discretized or continuous response values. Our results suggest that model input data (type of molecular features and choice of compound) are the primary factors explaining model performance, followed by choice of algorithm. Our results also provide a statistically principled set of recommended modeling guidelines, including: using elastic net or ridge regression with input features from all genomic profiling platforms, most importantly, gene expression features, to predict continuous-valued sensitivity scores summarized using the area under the dose response curve, with pathway targeted compounds most likely to yield the most accurate predictors. In addition, our study provides a publicly available resource of all modeling results, an open source code base, and experimental design for researchers throughout the community to build on our results and assess novel methodologies or applications in related predictive modeling problems.


Subjects
Neoplasms/drug therapy , Neoplasms/genetics , Pharmacogenetics/statistics & numerical data , Algorithms , Artificial Intelligence , Cell Line, Tumor , Computational Biology , Databases, Genetic/statistics & numerical data , Drug Resistance, Neoplasm/genetics , Gene Expression Profiling/statistics & numerical data , Humans , Models, Genetic , Regression Analysis
15.
Genetics ; 193(3): 1003-13, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23288936

ABSTRACT

Current efforts in systems genetics have focused on the development of statistical approaches that aim to disentangle causal relationships among molecular phenotypes in segregating populations. Reverse engineering of transcriptional networks plays a key role in the understanding of gene regulation. However, transcriptional regulation is only one possible mechanism, as methylation, phosphorylation, direct protein-protein interaction, transcription factor binding, etc., can also contribute to gene regulation. These additional modes of regulation can be interpreted as unobserved variables in the transcriptional gene network and can potentially affect its reconstruction accuracy. We develop tests of causal direction for a pair of phenotypes that may be embedded in a more complicated but unobserved network by extending Vuong's selection tests for misspecified models. Our tests provide a significance level, which is unavailable for the widely used AIC and BIC criteria. We evaluate the performance of our tests against the AIC, BIC, and a recently published causality inference test in simulation studies. We compare the precision of causal calls using biologically validated causal relationships extracted from a database of 247 knockout experiments in yeast. Our model selection tests are more precise, showing greatly reduced false-positive rates compared to the alternative approaches. In practice, this is a useful feature since follow-up studies tend to be time consuming and expensive and, hence, it is important for the experimentalist to have causal predictions with low false-positive rates.
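For orientation, the classical Vuong test that the abstract above builds on compares two non-nested models through their per-observation log-likelihoods. The sketch below shows only that textbook statistic, without any penalty correction and without the extensions the authors develop; the function name `vuong_z` and the toy inputs are hypothetical.

```python
import math
from statistics import pstdev


def vuong_z(loglik_a, loglik_b):
    """Classical Vuong z statistic and two-sided p-value.

    loglik_a, loglik_b: per-observation log-likelihoods of models A and B,
    evaluated on the same observations. Positive z favors model A.
    Under the null that both models are equally close to the truth,
    z is asymptotically standard normal.
    """
    if len(loglik_a) != len(loglik_b) or len(loglik_a) < 2:
        raise ValueError("need paired log-likelihoods for at least 2 observations")
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    omega = pstdev(diffs)  # population SD of the pointwise log-likelihood ratios
    if omega == 0:
        raise ValueError("statistic is undefined when all differences are equal")
    z = sum(diffs) / (math.sqrt(n) * omega)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical toy values, only to show the call shape.
z, p = vuong_z([-1.0, -0.5, -0.2], [-2.0, -2.5, -3.2])
print(round(z, 3), round(p, 5))
```

The practical difference from AIC or BIC, as the abstract notes, is that the test yields a significance level, so a "no call" is possible when neither model is clearly closer to the data.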


Subjects
Genetic Models , Phenotype , Computer Simulation , False Positive Reactions , Gene Deletion , Gene Expression Regulation , Gene Regulatory Networks , Probability , Quantitative Trait Loci/genetics , Yeasts/genetics
16.
Genetics ; 191(4): 1355-65, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22661325

ABSTRACT

Quantitative trait loci (QTL) hotspots (genomic locations affecting many traits) are a common feature in genetical genomics studies and are biologically interesting since they may harbor critical regulators. Therefore, statistical procedures to assess the significance of hotspots are of key importance. One approach, randomly allocating observed QTL across the genomic locations separately by trait, implicitly assumes all traits are uncorrelated. Recently, an empirical test for QTL hotspots was proposed on the basis of the number of traits that exceed a predetermined LOD value, such as the standard permutation LOD threshold. The permutation null distribution of the maximum number of traits across all genomic locations preserves the correlation structure among the phenotypes, avoiding the detection of spurious hotspots due to nongenetic correlation induced by uncontrolled environmental factors and unmeasured variables. However, by considering only the number of traits above a threshold, without accounting for the magnitude of the LOD scores, relevant information is lost. In particular, biologically interesting hotspots composed of a moderate to small number of traits with strong LOD scores may be neglected as nonsignificant. In this article we propose a quantile-based permutation approach that simultaneously accounts for the number and the LOD scores of traits within the hotspots. By considering a sliding scale of mapping thresholds, our method can assess the statistical significance of both small and large hotspots. Although the proposed approach can be applied to any type of heritable high-volume "omic" data set, we restrict our attention to expression (e)QTL analysis. We assess and compare the performances of these three methods in simulations and we illustrate how our approach can effectively assess the significance of moderate and small hotspots with strong LOD scores in a yeast expression data set.
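The count-based permutation test described above (the earlier empirical test, not the quantile-based method the article proposes) can be sketched as follows. This is a simplified illustration under stated assumptions: LOD scores come from single-marker regression via LOD = -(n/2) log10(1 - r^2), markers are simulated as unlinked, and all function names and data are hypothetical.

```python
import numpy as np

def lod_matrix(geno, pheno):
    """LOD scores (markers x traits) from single-marker regression: -(n/2) * log10(1 - r^2)."""
    n = geno.shape[0]
    g = (geno - geno.mean(axis=0)) / geno.std(axis=0)
    p = (pheno - pheno.mean(axis=0)) / pheno.std(axis=0)
    r = g.T @ p / n  # marker-by-trait Pearson correlations
    return -(n / 2) * np.log10(1 - r ** 2)

def max_hotspot_size(geno, pheno, lod_threshold):
    """Largest number of traits exceeding the LOD threshold at any single marker."""
    return int((lod_matrix(geno, pheno) > lod_threshold).sum(axis=1).max())

def hotspot_size_threshold(geno, pheno, lod_threshold, n_perm=200, alpha=0.05, seed=0):
    """Permutation threshold for hotspot size.

    Rows of the phenotype matrix are permuted together, so the correlation
    among traits is preserved while the link to genotype is broken.
    """
    rng = np.random.default_rng(seed)
    null = [
        max_hotspot_size(geno, pheno[rng.permutation(len(pheno))], lod_threshold)
        for _ in range(n_perm)
    ]
    return float(np.quantile(null, 1 - alpha))

# Hypothetical data: 200 individuals, 50 unlinked markers, 30 traits,
# with the first 10 traits driven by marker 5.
rng = np.random.default_rng(1)
geno = rng.integers(0, 2, size=(200, 50)).astype(float)
pheno = rng.normal(size=(200, 30))
pheno[:, :10] += 2.0 * geno[:, [5]]

observed = max_hotspot_size(geno, pheno, 3.0)
threshold = hotspot_size_threshold(geno, pheno, 3.0, n_perm=100)
print(observed, threshold)
```

Because this version uses one fixed LOD threshold, it ignores how far each score lies above that threshold, which is the information loss the abstract points out.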


Subjects
Lod Score , Genetic Models , Mutation , Quantitative Trait Loci , Algorithms , Chromosome Mapping , Computer Simulation , Reproducibility of Results
18.
Ann Appl Stat ; 4(1): 320-339, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-21218138

ABSTRACT

Causal inference approaches in systems genetics exploit quantitative trait loci (QTL) genotypes to infer causal relationships among phenotypes. The genetic architecture of each phenotype may be complex, and poorly estimated genetic architectures may compromise the inference of causal relationships among phenotypes. Existing methods assume QTLs are known or inferred without regard to the phenotype network structure. In this paper we develop a QTL-driven phenotype network method (QTLnet) to jointly infer a causal phenotype network and associated genetic architecture for sets of correlated phenotypes. Randomization of alleles during meiosis and the unidirectional influence of genotype on phenotype allow the inference of QTLs causal to phenotypes. Causal relationships among phenotypes can be inferred using these QTL nodes, enabling us to distinguish among phenotype networks that would otherwise be distribution equivalent. We jointly model phenotypes and QTLs using homogeneous conditional Gaussian regression models, and we derive a graphical criterion for distribution equivalence. We validate the QTLnet approach in a simulation study. Finally, we illustrate with simulated data and a real example how QTLnet can be used to infer both direct and indirect effects of QTLs and phenotypes that co-map to a genomic region.
