Results 1 - 20 of 39
1.
PLoS One ; 11(3): e0151174, 2016.
Article in English | MEDLINE | ID: mdl-27028297

ABSTRACT

Conventional research methodologies and data analytic approaches in psychiatric research are unable to reliably infer causal relations without experimental designs, or to make inferences about the functional properties of the complex systems in which psychiatric disorders are embedded. This article describes a series of studies to validate a novel hybrid computational approach, the Complex Systems-Causal Network (CS-CN) method, designed to integrate causal discovery within a complex systems framework for psychiatric research. The CS-CN method was first applied to an existing dataset on psychopathology in 163 children hospitalized with injuries (validation study). Next, it was applied to a much larger dataset of traumatized children (replication study). Finally, the CS-CN method was applied in a controlled experiment using a 'gold standard' dataset for causal discovery and compared with other methods for accurately detecting causal variables (resimulation controlled experiment). The CS-CN method successfully detected a causal network of 111 variables and 167 bivariate relations in the initial validation study. This causal network had well-defined adaptive properties and a set of variables was found that disproportionately contributed to these properties. Modeling the removal of these variables resulted in significant loss of adaptive properties. The CS-CN method was successfully applied in the replication study and performed better than traditional statistical methods, and similarly to state-of-the-art causal discovery algorithms in the causal detection experiment. The CS-CN method was validated, replicated, and yielded both novel and previously validated findings related to risk factors and potential treatments of psychiatric disorders. The novel approach yields both fine-grained (micro) and high-level (macro) insights and thus represents a promising approach for complex systems-oriented research in psychiatry.


Subject(s)
Psychiatry/methods , Adolescent , Child , Cluster Analysis , Humans , Models, Psychological , Systems Analysis , Wounds and Injuries/psychology
2.
Sci Rep ; 6: 22558, 2016 Mar 04.
Article in English | MEDLINE | ID: mdl-26939894

ABSTRACT

Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods' performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost.


Subject(s)
Gene Expression Regulation , Models, Biological , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/physiology , Transcription Factors/metabolism , Algorithms , Biological Ontologies , Computer Simulation , Feasibility Studies , Gene Expression Profiling , Gene Regulatory Networks , High-Throughput Screening Assays , Humans , Problem-Based Learning , Research Design , Saccharomyces cerevisiae Proteins/genetics , Transcription Factors/genetics
3.
Arthritis Rheumatol ; 67(11): 2905-15, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26195278

ABSTRACT

OBJECTIVE: Inflammatory mediators, such as prostaglandin E2 (PGE2) and interleukin-1β (IL-1β), are produced by osteoarthritic (OA) joint tissue, where they may contribute to disease pathogenesis. We undertook the present study to examine whether inflammation, evidenced in plasma and peripheral blood leukocytes (PBLs), reflects the presence, progression, or specific symptoms of symptomatic knee OA. METHODS: Patients with symptomatic knee OA were enrolled in a 24-month prospective study of radiographic progression. Standardized knee radiographs were obtained at baseline and 24 months. At baseline, levels of the plasma lipids PGE2 and 15-hydroxyeicosatetraenoic acid (15-HETE) were measured, and transcriptome analysis of PBLs was performed by microarray and quantitative polymerase chain reaction. RESULTS: Baseline PGE2 synthase (PGES) levels determined by PBL microarray gene expression and plasma PGE2 levels distinguished patients with symptomatic knee OA from non-OA controls (area under the receiver operating characteristic curve [AUC] 0.87 and 0.89, respectively, P < 0.0001). Baseline plasma 15-HETE levels were significantly elevated in patients with symptomatic knee OA versus non-OA controls (P < 0.0195). In the 146 patients who completed the 24-month study, elevated baseline expression of IL-1β, tumor necrosis factor α, and cyclooxygenase 2 (COX-2) messenger RNA in PBLs predicted higher risk of radiographic progression as evidenced by joint space narrowing (JSN). In a multivariate model, AUC point estimates of models containing COX-2 in combination with demographic traits overlapped the confidence interval of the base model in 2 of the 3 JSN outcome measures (JSN >0.0 mm, JSN >0.2 mm, and JSN >0.5 mm; AUC 0.62-0.67). CONCLUSION: The inflammatory plasma lipid biomarkers PGE2 and 15-HETE identify patients with symptomatic knee OA, and the PBL inflammatory transcriptome identifies a subset of patients with symptomatic knee OA who are at increased risk of radiographic progression. These findings may reflect low-grade inflammation in OA and may be useful as diagnostic and prognostic biomarkers in clinical development of disease-modifying OA drugs.


Subject(s)
Dinoprostone/blood , Hydroxyeicosatetraenoic Acids/blood , Inflammation/pathology , Knee Joint/pathology , Osteoarthritis, Knee/pathology , Aged , Biomarkers/blood , Disease Progression , Female , Humans , Inflammation/blood , Knee Joint/diagnostic imaging , Male , Middle Aged , Osteoarthritis, Knee/blood , Osteoarthritis, Knee/diagnostic imaging , Prognosis , Prospective Studies , Radiography
4.
J Affect Disord ; 184: 170-5, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-26093830

ABSTRACT

BACKGROUND: Pre-deployment identification of soldiers at risk for long-term posttraumatic stress psychopathology after homecoming is important to guide decisions about deployment. Early post-deployment identification can direct early interventions to those in need and thereby prevent the development of chronic psychopathology. Both hold significant public health benefits given the large numbers of deployed soldiers, but neither has so far been achieved. Here, we aim to assess the potential for pre- and early post-deployment prediction of resilience or posttraumatic stress development in soldiers by application of machine learning (ML) methods. METHODS: ML feature selection and prediction algorithms were applied to a prospective cohort of 561 Danish soldiers deployed to Afghanistan in 2009 to identify unique risk indicators and forecast long-term posttraumatic stress responses. RESULTS: Robust pre- and early post-deployment risk indicators were identified, and included individual PTSD symptoms as well as total level of PTSD symptoms, previous trauma and treatment, negative emotions, and thought suppression. The predictive performance of these risk indicators combined was assessed by cross-validation. Together, these indicators forecasted long-term posttraumatic stress responses with high accuracy (pre-deployment: AUC = 0.84 (95% CI = 0.81-0.87), post-deployment: AUC = 0.88 (95% CI = 0.85-0.91)). LIMITATIONS: This study utilized a previously collected data set and was therefore not designed to exhaust the potential of ML methods. Further, the study relied solely on self-reported measures. CONCLUSIONS: Pre-deployment and early post-deployment identification of risk for long-term posttraumatic psychopathology are feasible and could greatly reduce the public health costs of war.


Subject(s)
Machine Learning , Military Personnel/psychology , Stress Disorders, Post-Traumatic/psychology , Adult , Afghan Campaign 2001- , Algorithms , Cohort Studies , Denmark , Emotions , Female , Humans , Longitudinal Studies , Male , Predictive Value of Tests , Prospective Studies , Resilience, Psychological , Risk Assessment , Support Vector Machine
5.
BMC Psychiatry ; 15: 30, 2015 Mar 16.
Article in English | MEDLINE | ID: mdl-25886446

ABSTRACT

BACKGROUND: Predicting Posttraumatic Stress Disorder (PTSD) is a pre-requisite for targeted prevention. Current research has identified group-level risk-indicators, many of which (e.g., head trauma, receiving opiates) concern but a subset of survivors. Identifying interchangeable sets of risk indicators may increase the efficiency of early risk assessment. The study goal is to use supervised machine learning (ML) to uncover interchangeable, maximally predictive combinations of early risk indicators. METHODS: Data variables (features) reflecting event characteristics, emergency department (ED) records and early symptoms were collected in 957 trauma survivors within ten days of ED admission, and used to predict PTSD symptom trajectories during the following fifteen months. A Target Information Equivalence Algorithm (TIE*) identified all minimal sets of features (Markov Boundaries; MBs) that maximized the prediction of a non-remitting PTSD symptom trajectory when integrated in a support vector machine (SVM). The predictive accuracy of each set of predictors was evaluated in a repeated 10-fold cross-validation and expressed as average area under the Receiver Operating Characteristics curve (AUC) for all validation trials. RESULTS: The average number of MBs per cross validation was 800. MBs' mean AUC was 0.75 (95% range: 0.67-0.80). The average number of features per MB was 18 (range: 12-32) with 13 features present in over 75% of the sets. CONCLUSIONS: Our findings support the hypothesized existence of multiple and interchangeable sets of risk indicators that equally and exhaustively predict non-remitting PTSD. ML's ability to increase prediction versatility is a promising step towards developing algorithmic, knowledge-based, personalized prediction of post-traumatic psychopathology.


Subject(s)
Adaptation, Psychological/physiology , Artificial Intelligence , Stress Disorders, Post-Traumatic , Wounds and Injuries , Adult , Algorithms , Early Diagnosis , Female , Humans , Male , Middle Aged , Prognosis , ROC Curve , Risk Assessment , Risk Factors , Stress Disorders, Post-Traumatic/diagnosis , Stress Disorders, Post-Traumatic/etiology , Stress Disorders, Post-Traumatic/physiopathology , Stress Disorders, Post-Traumatic/prevention & control , Translational Research, Biomedical , Wounds and Injuries/complications , Wounds and Injuries/psychology
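
Note on the metric: the abstract for entry 5 reports predictive accuracy as the average area under the ROC curve (AUC) over all validation trials. The sketch below shows one way such an average could be computed. It is a minimal illustration only, not the authors' pipeline: the function names `auc` and `mean_auc` and the example scores are hypothetical, and the AUC is computed in its rank form (the probability that a randomly chosen positive case is scored above a randomly chosen negative case).

```python
from statistics import mean

def auc(scores, labels):
    """Probability that a randomly chosen positive case is scored above a
    randomly chosen negative case; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(trials):
    """Average AUC over validation trials; each trial is (scores, labels)."""
    return mean(auc(s, y) for s, y in trials)

# Hypothetical held-out scores from two validation trials (made-up numbers).
trials = [
    ([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]),  # AUC 0.75
    ([0.2, 0.9, 0.6, 0.3], [0, 1, 1, 0]),   # AUC 1.0
]
print(mean_auc(trials))  # prints 0.875
```

In a repeated 10-fold design, `trials` would hold one (scores, labels) pair per held-out fold per repetition.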
6.
PLoS One ; 10(2): e0118132, 2015.
Article in English | MEDLINE | ID: mdl-25705890

ABSTRACT

Field of cancerization in the airway epithelium has been increasingly examined to understand early pathogenesis of non-small cell lung cancer. However, the extent of field of cancerization throughout the lung airways is unclear. Here we sought to determine the differential gene and microRNA expressions associated with field of cancerization in the peripheral airway epithelial cells of patients with lung adenocarcinoma. We obtained peripheral airway brushings from smoker controls (n=13) and from the lung contralateral to the tumor in cancer patients (n=17). We performed gene and microRNA expression profiling on these peripheral airway epithelial cells using Affymetrix GeneChip and TaqMan Array. Integrated gene and microRNA analysis was performed to identify significant molecular pathways. We identified 26 mRNAs and 5 miRNAs that were significantly (FDR <0.1) up-regulated and 38 mRNAs and 12 miRNAs that were significantly down-regulated in the cancer patients when compared to smoker controls. Functional analysis identified differential transcriptomic expressions related to tumorigenesis. Integration of miRNA-mRNA data into interaction network analysis showed modulation of the extracellular signal-regulated kinase/mitogen-activated protein kinase (ERK/MAPK) pathway in the contralateral lung field of cancerization. In conclusion, patients with lung adenocarcinoma have tumor related molecules and pathways in histologically normal appearing peripheral airway epithelial cells, a substantial distance from the tumor itself. This finding can potentially provide new biomarkers for early detection of lung cancer and novel therapeutic targets.


Subject(s)
Adenocarcinoma/genetics , Gene Expression Profiling , Lung Neoplasms/genetics , MicroRNAs/genetics , RNA, Messenger/genetics , Respiratory System/metabolism , Adenocarcinoma/metabolism , Aged , Cell Transformation, Neoplastic/genetics , Cell Transformation, Neoplastic/metabolism , Epithelial Cells/metabolism , Extracellular Signal-Regulated MAP Kinases/metabolism , Female , Humans , Lung Neoplasms/metabolism , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Reverse Transcriptase Polymerase Chain Reaction , Smoking
7.
AMIA Annu Symp Proc ; 2015: 2043-52, 2015.
Article in English | MEDLINE | ID: mdl-26958304

ABSTRACT

Brain science is a frontier research area with great promise for understanding, preventing, and treating multiple diseases affecting millions of patients. Its key task of reconstructing neuronal brain connectivity poses unique Big Data Analysis challenges distinct from those in clinical or "-omics" domains. Our goal is to understand the strengths and limitations of reconstruction algorithms, measure performance and its determinants, and ultimately enhance performance and applicability. We devised a set of experiments in a well-controlled setting using an established gold-standard based on calcium fluorescence time series recordings of thousands of neurons sampled from a previously validated neuronal model of complex time-varying causal neuronal connections. Following empirical testing of several state-of-the-art reconstruction algorithms, and using the best-performing algorithms, we constructed features of a classifier and predicted the presence or absence of connections using meta-learning. This approach combines information-theoretic, feature construction, and pattern recognition meta-learning methods to considerably improve the Area under ROC curve performance. Our data are very promising toward the feasibility of reliably reconstructing complex neuronal connectivity.


Subject(s)
Algorithms , Brain/physiology , Neurons , Humans , Learning , Statistics as Topic
8.
PLoS One ; 9(9): e106479, 2014.
Article in English | MEDLINE | ID: mdl-25215507

ABSTRACT

De-novo reverse-engineering of genome-scale regulatory networks is a fundamental problem of biological and translational research. One of the major obstacles in developing and evaluating approaches for de-novo gene network reconstruction is the absence of high-quality genome-scale gold-standard networks of direct regulatory interactions. To establish a foundation for assessing the accuracy of de-novo gene network reverse-engineering, we constructed high-quality genome-scale gold-standard networks of direct regulatory interactions in Saccharomyces cerevisiae that incorporate binding and gene knockout data. Then we used 7 performance metrics to assess accuracy of 18 statistical association-based approaches for de-novo network reverse-engineering in 13 different datasets spanning over 4 data types. We found that most reconstructed networks had statistically significant accuracies. We also determined which statistical approaches and datasets/data types lead to networks with better reconstruction accuracies. While we found that de-novo reverse-engineering of the entire network is a challenging problem, it is possible to reconstruct sub-networks around some transcription factors with good accuracy. The latter transcription factors can be identified by assessing their connectivity in the inferred networks. Overall, this study provides the gene network reverse-engineering community with a rigorous assessment of the accuracy of S. cerevisiae gene network reconstruction and variability in performance of various approaches for learning both the entire network and sub-networks around transcription factors.


Subject(s)
Gene Regulatory Networks/genetics , Genome, Fungal/genetics , Saccharomyces cerevisiae/genetics , Algorithms , Databases, Genetic , Mutation/genetics , Predictive Value of Tests , Protein Binding/genetics , ROC Curve , Reference Standards , Reverse Genetics , Transcription Factors/metabolism
9.
J Psychiatr Res ; 59: 68-76, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25260752

ABSTRACT

There is broad interest in predicting the clinical course of mental disorders from early, multimodal clinical and biological information. Current computational models, however, constitute a significant barrier to realizing this goal. The early identification of trauma survivors at risk of post-traumatic stress disorder (PTSD) is plausible given the disorder's salient onset and the abundance of putative biological and clinical risk indicators. This work evaluates the ability of Machine Learning (ML) forecasting approaches to identify and integrate a panel of unique predictive characteristics and determine their accuracy in forecasting non-remitting PTSD from information collected within 10 days of a traumatic event. Data on event characteristics, emergency department observations, and early symptoms were collected in 957 trauma survivors, followed for fifteen months. An ML feature selection algorithm identified a set of predictors that rendered all others redundant. Support Vector Machines (SVMs) as well as other ML classification algorithms were used to evaluate the forecasting accuracy of i) ML selected features, ii) all available features without selection, and iii) Acute Stress Disorder (ASD) symptoms alone. SVM also compared the prediction of a) PTSD diagnostic status at 15 months to b) posterior probability of membership in an empirically derived non-remitting PTSD symptom trajectory. Results are expressed as mean Area Under Receiver Operating Characteristics Curve (AUC). The feature selection algorithm identified 16 predictors, present in ≥ 95% cross-validation trials. The accuracy of predicting non-remitting PTSD from that set (AUC = .77) did not differ from predicting from all available information (AUC = .78). Predicting from ASD symptoms was not better than chance (AUC = .60). The prediction of PTSD status was less accurate than that of membership in a non-remitting trajectory (AUC = .71). ML methods may fill a critical gap in forecasting PTSD. The ability to identify and integrate unique risk indicators makes this a promising approach for developing algorithms that infer probabilistic risk of chronic posttraumatic stress psychopathology based on complex sources of biological, psychological, and social information.


Subject(s)
Artificial Intelligence , Stress Disorders, Post-Traumatic/diagnosis , Stress Disorders, Traumatic, Acute/psychology , Adolescent , Adult , Aged , Algorithms , Female , Follow-Up Studies , Humans , Male , Middle Aged , Predictive Value of Tests , Psychiatric Status Rating Scales , ROC Curve , Reproducibility of Results , Risk Factors , Stress Disorders, Post-Traumatic/prevention & control , Young Adult
10.
Sci Rep ; 4: 4411, 2014 Mar 21.
Article in English | MEDLINE | ID: mdl-24651673

ABSTRACT

The spectrum of modern molecular high-throughput assaying includes diverse technologies such as microarray gene expression, miRNA expression, proteomics, DNA methylation, among many others. Now that these technologies have matured and become increasingly accessible, the next frontier is to collect "multi-modal" data for the same set of subjects and conduct integrative, multi-level analyses. While multi-modal data does contain distinct biological information that can be useful for answering complex biology questions, its value for predicting clinical phenotypes and contributions of each type of input remain unknown. We obtained 47 datasets/predictive tasks that in total span over 9 data modalities and executed analytic experiments for predicting various clinical phenotypes and outcomes. First, we analyzed each modality separately using uni-modal approaches based on several state-of-the-art supervised classification and feature selection methods. Then, we applied integrative multi-modal classification techniques. We have found that gene expression is the most predictively informative modality. Other modalities such as protein expression, miRNA expression, and DNA methylation also provide highly predictive results, which are often statistically comparable but not superior to gene expression data. Integrative multi-modal analyses generally do not increase predictive signal compared to gene expression data.


Subject(s)
Computational Biology/statistics & numerical data , DNA, Neoplasm/genetics , MicroRNAs/genetics , Neoplasm Proteins/genetics , Neoplasms/diagnosis , RNA, Neoplasm/genetics , DNA Methylation , DNA, Neoplasm/metabolism , Datasets as Topic , Diagnostic Imaging , Female , Gene Dosage , Gene Expression , Humans , Male , MicroRNAs/metabolism , Neoplasm Proteins/metabolism , Neoplasms/genetics , Neoplasms/mortality , Neoplasms/pathology , Prognosis , RNA, Neoplasm/metabolism , Survival Analysis
11.
PLoS One ; 9(2): e89987, 2014.
Article in English | MEDLINE | ID: mdl-24587168

ABSTRACT

The extreme diversity of HIV-1 strains presents a formidable challenge for HIV-1 vaccine design. Although antibodies (Abs) can neutralize HIV-1 and potentially protect against infection, antibodies that target the immunogenic viral surface protein gp120 have widely variable and poorly predictable cross-strain reactivity. Here, we developed a novel computational approach, the Method of Dynamic Epitopes, for identification of neutralization epitopes targeted by anti-HIV-1 monoclonal antibodies (mAbs). Our data demonstrate that this approach, based purely on calculated energetics and 3D structural information, accurately predicts the presence of neutralization epitopes targeted by V3-specific mAbs 2219 and 447-52D in any HIV-1 strain. The method was used to calculate the range of conservation of these specific epitopes across all circulating HIV-1 viruses. Accurately identifying an Ab-targeted neutralization epitope in a virus by computational means enables easy prediction of the breadth of reactivity of specific mAbs across the diversity of thousands of different circulating HIV-1 variants and facilitates rational design and selection of immunogens mimicking specific mAb-targeted epitopes in a multivalent HIV-1 vaccine. The defined epitopes can also be used for the purpose of epitope-specific analyses of breakthrough sequences recorded in vaccine clinical trials. Thus, our study is a prototype for a valuable tool for rational HIV-1 vaccine design.


Subject(s)
Antibodies, Monoclonal/immunology , Antibodies, Neutralizing/immunology , Computational Biology , Epitopes/immunology , HIV Antibodies/immunology , HIV Envelope Protein gp120/immunology , HIV-1/immunology , Peptide Fragments/immunology , AIDS Vaccines/immunology , Amino Acid Sequence , Conserved Sequence , HIV Envelope Protein gp120/chemistry , HIV-1/genetics , Models, Molecular , Molecular Sequence Data , Peptide Fragments/chemistry , Protein Conformation , Species Specificity , Thermodynamics
12.
Sci Rep ; 3: 2620, 2013.
Article in English | MEDLINE | ID: mdl-24018484

ABSTRACT

Psoriasis is a common chronic inflammatory disease of the skin. We sought to use bacterial community abundance data to assess the feasibility of developing multivariate molecular signatures for differentiation of cutaneous psoriatic lesions, clinically unaffected contralateral skin from psoriatic patients, and similar cutaneous loci in matched healthy control subjects. Using 16S rRNA high-throughput DNA sequencing, we assayed the cutaneous microbiome for 51 such matched specimen triplets including subjects of both genders, different age groups, ethnicities and multiple body sites. None of the subjects had recently received relevant treatments or antibiotics. We found that molecular signatures for the diagnosis of psoriasis result in significant accuracy ranging from 0.75 to 0.89 AUC, depending on the classification task. We also found a significant effect of DNA sequencing and downstream analysis protocols on the accuracy of molecular signatures. Our results demonstrate that it is feasible to develop accurate molecular signatures for the diagnosis of psoriasis from microbiomic data.


Subject(s)
Bacteria/classification , Bacteria/genetics , Metagenome , Microbiota , Psoriasis/microbiology , Case-Control Studies , Humans , Psoriasis/diagnosis , RNA, Ribosomal, 16S , Reproducibility of Results , Sequence Analysis, DNA
13.
AAPS J ; 15(2): 427-37, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23319288

ABSTRACT

Gene expression is useful for identifying the molecular signature of a disease and for correlating a pharmacodynamic marker with the dose-dependent cellular responses to exposure of a drug. Gene expression offers utility to guide drug discovery by illustrating engagement of the desired cellular pathways/networks, as well as avoidance of acting on the toxicological pathways. Successful employment of gene-expression signatures in the later stages of drug development depends on their linkage to clinically meaningful phenotypic characteristics and requires a biologically meaningful mechanism combined with a stringent statistical rigor. Much of the success in clinical drug development is hinged on predefining the signature genes for their fitness for purposes of application. Specific examples are highlighted to illustrate the breadth and depth of the potential utility of gene-expression signatures in drug discovery and clinical development to targeted therapeutics at the bedside.


Subject(s)
Drug Discovery/methods , Gene Expression Profiling , Gene Expression Regulation/drug effects , Genetic Testing , Translational Research, Biomedical/methods , Animals , Drug-Related Side Effects and Adverse Reactions , Gene Regulatory Networks , Genetic Markers , Genetic Predisposition to Disease , Humans , Molecular Targeted Therapy , Patient Safety , Patient Selection , Phenotype , Precision Medicine , Predictive Value of Tests
14.
BMC Syst Biol ; 7 Suppl 5: S1, 2013.
Article in English | MEDLINE | ID: mdl-24564859

ABSTRACT

BACKGROUND: Oncogenic mechanisms in small-cell lung cancer remain poorly understood, leaving this tumor with the worst prognosis among all lung cancers. Unlike other cancer types, sequencing genomic approaches have been of limited success in small-cell lung cancer, i.e., no mutated oncogenes with potential driver characteristics have emerged, as is the case for activating mutations of epidermal growth factor receptor in non-small-cell lung cancer. Differential gene expression analysis has also produced SCLC signatures with limited application, since they are generally not robust across datasets. Nonetheless, additional genomic approaches are warranted, due to the increasing availability of suitable small-cell lung cancer datasets. Gene co-expression network approaches are a recent and promising avenue, since they have been successful in identifying gene modules that drive phenotypic traits in several biological systems, including other cancer types. RESULTS: We derived an SCLC-specific classifier from weighted gene co-expression network analysis (WGCNA) of a lung cancer dataset. The classifier, termed SCLC-specific hub network (SSHN), robustly separates SCLC from other lung cancer types across multiple datasets and multiple platforms, including RNA-seq and shotgun proteomics. The classifier was also conserved in SCLC cell lines. SSHN is enriched for co-expressed signaling network hubs strongly associated with the SCLC phenotype. Twenty of these hubs are actionable kinases with oncogenic potential, among which spleen tyrosine kinase (SYK) exhibits one of the highest overall statistical associations to SCLC. In patient tissue microarrays and cell lines, SCLC can be separated into SYK-positive and -negative. SYK siRNA decreases proliferation rate and increases cell death of SYK-positive SCLC cell lines, suggesting a role for SYK as an oncogenic driver in a subset of SCLC. CONCLUSIONS: SCLC treatment has thus far been limited to chemotherapy and radiation. Our WGCNA analysis identifies SYK both as a candidate biomarker to stratify SCLC patients and as a potential therapeutic target. In summary, WGCNA represents an alternative strategy to large scale sequencing for the identification of potential oncogenic drivers, based on a systems view of signaling networks. This strategy is especially useful in cancer types where no actionable mutations have emerged.


Subject(s)
Gene Expression Profiling , Gene Regulatory Networks , Intracellular Signaling Peptides and Proteins/metabolism , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Oncogene Proteins/metabolism , Protein-Tyrosine Kinases/metabolism , Small Cell Lung Carcinoma/metabolism , Small Cell Lung Carcinoma/pathology , Cell Line, Tumor , Cell Proliferation , Cell Survival , Gene Knockdown Techniques , Humans , Intracellular Signaling Peptides and Proteins/deficiency , Intracellular Signaling Peptides and Proteins/genetics , Molecular Targeted Therapy , Oncogene Proteins/deficiency , Oncogene Proteins/genetics , Protein-Tyrosine Kinases/deficiency , Protein-Tyrosine Kinases/genetics , Proteomics , Syk Kinase
15.
Microbiome ; 1(1): 11, 2013 Apr 05.
Article in English | MEDLINE | ID: mdl-24456583

ABSTRACT

BACKGROUND: Recent advances in next-generation DNA sequencing enable rapid high-throughput quantitation of microbial community composition in human samples, opening up a new field of microbiomics. One of the promises of this field is linking abundances of microbial taxa to phenotypic and physiological states, which can inform development of new diagnostic, personalized medicine, and forensic modalities. Prior research has demonstrated the feasibility of applying machine learning methods to perform body site and subject classification with microbiomic data. However, it is currently unknown which classifiers perform best among the many available alternatives for classification with microbiomic data. RESULTS: In this work, we performed a systematic comparison of 18 major classification methods, 5 feature selection methods, and 2 accuracy metrics using 8 datasets spanning 1,802 human samples and various classification tasks: body site and subject classification and diagnosis. CONCLUSIONS: We found that random forests, support vector machines, kernel ridge regression, and Bayesian logistic regression with Laplace priors are the most effective machine learning techniques for performing accurate classification from these microbiomic data.

16.
J Mach Learn Res ; 14: 499-566, 2013 Feb.
Article in English | MEDLINE | ID: mdl-25285052

ABSTRACT

Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains.
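
The point about multiple Markov boundaries can be illustrated on a toy distribution. The sketch below is not TIE* and is not taken from the paper; it only assumes a hypothetical four-variable distribution in which `X2` is an exact copy of `X1`, the response `T` equals `X1`, and `X3` is independent noise. In that setting both `{X1}` and `{X2}` shield `T` from every other variable, so the response has two Markov boundaries. The helper names (`marginal`, `cond_mutual_info`) are made up for this example.

```python
from collections import defaultdict
from math import log2


def marginal(joint, idxs):
    """Marginalize a joint distribution onto the variables at positions idxs."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idxs)] += p
    return out


def cond_mutual_info(joint, a, b, cond=()):
    """Conditional mutual information I(A; B | C) in bits for index tuples a, b, cond."""
    a, b, cond = tuple(a), tuple(b), tuple(cond)
    p_abc = marginal(joint, a + b + cond)
    p_ac = marginal(joint, a + cond)
    p_bc = marginal(joint, b + cond)
    p_c = marginal(joint, cond)
    na, nb = len(a), len(b)
    total = 0.0
    for key, p in p_abc.items():
        if p <= 0.0:
            continue
        av, bv, cv = key[:na], key[na:na + nb], key[na + nb:]
        total += p * log2(p * p_c[cv] / (p_ac[av + cv] * p_bc[bv + cv]))
    return total


# Hypothetical toy distribution over (T, X1, X2, X3):
# X1 and X3 are fair coins, X2 copies X1, and T equals X1.
T, X1, X2, X3 = 0, 1, 2, 3
joint = {(x1, x1, x1, x3): 0.25 for x1 in (0, 1) for x3 in (0, 1)}

print("I(T; X1)           =", cond_mutual_info(joint, [T], [X1]))
print("I(T; X2, X3 | X1)  =", cond_mutual_info(joint, [T], [X2, X3], [X1]))
print("I(T; X1, X3 | X2)  =", cond_mutual_info(joint, [T], [X1, X3], [X2]))
```

Both conditional quantities vanish while `I(T; X1)` does not, so either single variable is a minimal shielding set. Distributions with this kind of information equivalence violate the intersection property mentioned in the abstract.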

17.
PLoS One ; 7(6): e39790, 2012.
Article in English | MEDLINE | ID: mdl-22761902

ABSTRACT

We have developed a mouse model of atherosclerotic plaque regression in which an atherosclerotic aortic arch from a hyperlipidemic donor is transplanted into a normolipidemic recipient, resulting in rapid elimination of cholesterol and monocyte-derived macrophage cells (CD68+) from transplanted vessel walls. To gain a comprehensive view of the differences in gene expression patterns in macrophages associated with regressing compared with progressing atherosclerotic plaque, we compared mRNA expression patterns in CD68+ macrophages extracted from plaque in aortic arches transplanted into normolipidemic or into hyperlipidemic recipients. In CD68+ cells from regressing plaque we observed that genes associated with the contractile apparatus responsible for cellular movement (e.g., actin and myosin) were up-regulated whereas genes related to cell adhesion (e.g., cadherins, vinculin) were down-regulated. In addition, CD68+ cells from regressing plaque were characterized by enhanced expression of genes associated with an anti-inflammatory M2 macrophage phenotype, including arginase I, CD163 and the C-lectin receptor. Our analysis suggests that in regressing plaque CD68+ cells preferentially express genes that reduce cellular adhesion, enhance cellular motility, and overall act to suppress inflammation.


Subject(s)
Atherosclerosis/pathology; Macrophages/metabolism; Transcriptome; Animals; Antigens, CD/genetics; Antigens, CD/immunology; Antigens, Differentiation, Myelomonocytic/genetics; Antigens, Differentiation, Myelomonocytic/immunology; Apolipoproteins E/genetics; Atherosclerosis/genetics; Macrophages/immunology; Mice; Mice, Inbred C57BL; Mice, Knockout; Real-Time Polymerase Chain Reaction
18.
BMC Genomics ; 13 Suppl 8: S22, 2012.
Article in English | MEDLINE | ID: mdl-23282373

ABSTRACT

BACKGROUND: The discovery of molecular pathways is a challenging problem and its solution relies on the identification of causal molecular interactions in genomics data. Causal molecular interactions can be discovered using randomized experiments; however, such experiments are often costly, infeasible, or unethical. Fortunately, algorithms that infer causal interactions from observational data have been in development for decades, predominantly in the quantitative sciences, and many of them have recently been applied to genomics data. While these algorithms can infer unoriented causal interactions between involved molecular variables (i.e., without specifying which one is the cause and which one is the effect), causally orienting all inferred molecular interactions was assumed to be an unsolvable problem until recently. In this work, we use transcription factor-target gene regulatory interactions in three different organisms to evaluate a new family of methods that, given observational data for just two causally related variables, can determine which one is the cause and which one is the effect. RESULTS: We have found that a particular family of causal orientation methods (IGCI Gaussian) is often able to accurately infer directionality of causal interactions, and that these methods usually outperform other causal orientation techniques. We also introduced a novel ensemble technique for causal orientation that combines decisions of individual causal orientation methods. The ensemble method was found to be more accurate than the best individual causal orientation method in the tested data. CONCLUSIONS: This work represents a first step towards establishing context for practical use of causal orientation methods in the genomics domain. We have found that some causal orientation methodologies yield accurate predictions of causal orientation in genomics data, and we have improved on this capability with a novel ensemble method. Our results suggest that these methods have the potential to facilitate reconstruction of molecular pathways by minimizing the number of required randomized experiments to find causal directionality and by avoiding experiments that are infeasible and/or unethical.
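
As a rough illustration of how a two-variable orientation method of the IGCI family can work, the sketch below uses the commonly described slope-based IGCI estimator with a Gaussian reference measure (both variables are standardized first). It is an assumption-laden toy, not the implementation evaluated in the paper: the function names and the synthetic data `y = x**3` are hypothetical, and the estimator presumes a nearly noise-free, invertible relationship, which real expression data may not satisfy.

```python
from math import log
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit variance (Gaussian reference measure)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def slope_score(cause, effect):
    """Mean log absolute slope between neighbouring points, sorted by the putative cause."""
    pairs = sorted(zip(cause, effect))
    logs = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, where the log slope is undefined
            logs.append(log(abs(dy / dx)))
    return mean(logs)


def igci_direction(x, y):
    """Return "X->Y" if the slope score favours x as the cause, else "Y->X"."""
    xs, ys = standardize(x), standardize(y)
    c_xy = slope_score(xs, ys)
    c_yx = slope_score(ys, xs)
    return "X->Y" if c_xy < c_yx else "Y->X"


# Hypothetical noise-free example: x on an even grid, y a nonlinear function of x.
n = 200
x = [(i + 0.5) / n for i in range(n)]
y = [v ** 3 for v in x]
print(igci_direction(x, y))
```

The smaller score marks the direction in which the function's slope is, on average, least aligned with the input density, which is the heuristic the IGCI family relies on. An ensemble such as the one the abstract mentions would combine several such decisions; that step is not sketched here.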


Subject(s)
Algorithms; Genomics; Area Under Curve; Databases, Factual; Escherichia coli/genetics; Escherichia coli/metabolism; Escherichia coli Proteins/genetics; Escherichia coli Proteins/metabolism; Gene Regulatory Networks; Humans; Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics; Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/metabolism; ROC Curve; Receptor, Notch1/genetics; Receptor, Notch1/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Transcription Factor RelA/genetics; Transcription Factor RelA/metabolism
19.
PLoS One ; 6(6): e20662, 2011.
Article in English | MEDLINE | ID: mdl-21673802

ABSTRACT

BACKGROUND: The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease, on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them. METHODOLOGY AND PRINCIPAL FINDINGS: Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of the data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures. CONCLUSIONS AND SIGNIFICANCE: Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.


Subject(s)
Computational Biology/methods; Data Interpretation, Statistical; Respiratory Tract Infections/diagnosis; Respiratory Tract Infections/virology; Acute Disease; Bias; Biomarkers/metabolism; Humans; Phenotype; Respiratory Tract Infections/genetics; Respiratory Tract Infections/metabolism
20.
Biol Direct ; 6: 25, 2011 May 18.
Article in English | MEDLINE | ID: mdl-21592391

ABSTRACT

BACKGROUND: GWAS owe their popularity to the expectation that they will make a major impact on diagnosis, prognosis and management of disease by uncovering genetics underlying clinical phenotypes. The dominant paradigm in GWAS data analysis so far consists of extensive reliance on methods that emphasize contribution of individual SNPs to statistical association with phenotypes. Multivariate methods, however, can extract more information by considering associations of multiple SNPs simultaneously. Recent advances in other genomics domains pinpoint multivariate causal graph-based inference as a promising principled analysis framework for high-throughput data. Designed to discover biomarkers in the local causal pathway of the phenotype, these methods lead to accurate and highly parsimonious multivariate predictive models. In this paper, we investigate the applicability of the causal graph-based method TIE* to the analysis of GWAS data. To test the utility of TIE*, we focus on anti-CCP positive rheumatoid arthritis (RA) GWAS datasets, where there is a general consensus in the community about the major genetic determinants of the disease. RESULTS: Application of TIE* to the North American Rheumatoid Arthritis Cohort (NARAC) GWAS data results in six SNPs, mostly from the MHC locus. Using these SNPs we develop two predictive models that can classify cases and disease-free controls with an accuracy of 0.81 area under the ROC curve, as verified in independent testing data from the same cohort. The predictive performance of these models generalizes reasonably well to Swedish subjects from the closely related but not identical Epidemiological Investigation of Rheumatoid Arthritis (EIRA) cohort with 0.71-0.78 area under the ROC curve. Moreover, the SNPs identified by the TIE* method render many other previously known SNP associations conditionally independent of the phenotype. CONCLUSIONS: Our experiments demonstrate that application of TIE* captures the maximum amount of genetic information about RA in the data and recapitulates the major consensus findings about the genetic factors of this disease. In addition, TIE* yields reproducible markers and signatures of RA. This suggests that a principled multivariate causal and predictive framework for GWAS analysis empowers the community with a new tool for high-quality and more efficient discovery. REVIEWERS: This article was reviewed by Prof. Anthony Almudevar, Dr. Eugene V. Koonin, and Prof. Marianthi Markatou.
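
For readers less familiar with the "area under the ROC curve" figures quoted above, the following minimal sketch computes AUC as the fraction of case/control pairs that a model ranks correctly, with ties counted as half. The labels and scores are invented for illustration and have nothing to do with the NARAC or EIRA data; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control."""
    cases = [s for lab, s in zip(labels, scores) if lab == 1]
    controls = [s for lab, s in zip(labels, scores) if lab == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


# Made-up example: two controls (0) and two cases (1) with model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of 4 case/control pairs are ranked correctly
```

Under this reading, an AUC of 0.81 means that a randomly chosen case receives a higher score than a randomly chosen control about 81% of the time, while 0.5 corresponds to chance-level ranking.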


Subject(s)
Arthritis, Rheumatoid/genetics; Computational Biology/methods; Genome-Wide Association Study; Algorithms; Canada; Gene Expression Profiling; Humans; Major Histocompatibility Complex; Models, Biological; Polymorphism, Single Nucleotide; Sweden; United States