ABSTRACT
Background/aim: The COVID-19 pandemic originated in Wuhan, China, in December 2019 and became one of the worst global health crises ever. While struggling with the unknown nature of this novel coronavirus, many researchers and groups attempted to project the progress of the pandemic using empirical or mechanistic models, each one having its drawbacks. The first confirmed cases were announced early in March, and since then, serious containment measures have taken place in Turkey. Materials and methods: Here, we present a different approach, a Bayesian negative binomial multilevel model with mixed effects, for the projection of the COVID-19 pandemic and we apply this model to the Turkish case. The model source code is available at https://github.com/kansil/covid-19. We predicted the confirmed daily cases and cumulative numbers from June 6th to June 26th with 80%, 95%, and 99% prediction intervals (PI). Results: Our projections showed that if we continued to comply with the measures and no drastic changes were seen in diagnosis or management protocols, the epidemic curve would tend to decrease in this time interval. Also, the predictive validity analysis suggests that the proposed model projections should have a PI around 95% for the first 12 days of the projections. Conclusion: We expect that drastic changes in the course of COVID-19 in Turkey will cause the model to suffer in predictive validity, and this can be used to monitor the epidemic. We hope that the discussion on these projections and the limitations of the epidemiological forecasting will be beneficial to the medical community and policy makers.
Subject(s)
COVID-19/epidemiology , Pandemics/statistics & numerical data , Bayes Theorem , Epidemiologic Methods , Forecasting , Humans , Models, Statistical , Probability , Turkey/epidemiology
ABSTRACT
BACKGROUND: Non-linear relationships at the genotype level are essential in understanding the genetic interactions of complex disease traits. Genome-wide association Studies (GWAS) have revealed statistical association of the SNPs in many complex diseases. As GWAS results could not thoroughly reveal the genetic background of these disorders, Genome-Wide Interaction Studies have started to gain importance. In recent years, various statistical approaches, such as entropy-based methods, have been suggested for revealing these non-additive interactions between variants. This study presents a novel prioritization workflow integrating two-step Random Forest (RF) modeling and entropy analysis after PLINK filtering. PLINK-RF-RF workflow is followed by an entropy-based 3-way interaction information (3WII) method to capture the hidden patterns resulting from non-linear relationships between genotypes in Late-Onset Alzheimer Disease to discover early and differential diagnosis markers. RESULTS: Three models from different datasets are developed by integrating PLINK-RF-RF analysis and entropy-based three-way interaction information (3WII) calculation method, which enables the detection of the third-order interactions, which are not primarily considered in epistatic interaction studies. A reduced SNP set is selected for all three datasets by 3WII analysis by PLINK filtering and prioritization of SNP with RF-RF modeling, promising as a model minimization approach. Among SNPs revealed by 3WII, 4 SNPs out of 19 from GenADA, 1 SNP out of 27 from ADNI, and 4 SNPs out of 106 from NCRAD are mapped to genes directly associated with Alzheimer Disease. Additionally, several SNPs are associated with other neurological disorders. Also, the genes the variants mapped to in all datasets are significantly enriched in calcium ion binding, extracellular matrix, external encapsulating structure, and RUNX1 regulates estrogen receptor-mediated transcription pathways. 
Therefore, these functional pathways are proposed for further examination for a possible LOAD association. Besides, all 3WII variants are proposed as candidate biomarkers for the genotyping-based LOAD diagnosis. CONCLUSION: The entropy approach performed in this study reveals the complex genetic interactions that significantly contribute to LOAD risk. We benefited from the entropy-based 3WII as a model minimization step and determined the significant 3-way interactions between the prioritized SNPs by PLINK-RF-RF. This framework is a promising approach for disease association studies, which can also be modified by integrating other machine learning and entropy-based interaction methods.
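The entropy-based three-way interaction information mentioned in this abstract can be illustrated with a minimal sketch. The sign convention used here, I(A;B;C) = I(A;B|C) - I(A;B), is one common choice and is an assumption, as are the function names; the abstract does not state which convention or implementation the authors used.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Empirical joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def interaction_information(a, b, c):
    """Three-way interaction information I(A;B;C) = I(A;B|C) - I(A;B).

    Expanded into joint entropies. Positive values indicate synergy
    under this (assumed) sign convention, negative values redundancy.
    """
    return (entropy(a, b) + entropy(a, c) + entropy(b, c)
            - entropy(a) - entropy(b) - entropy(c)
            - entropy(a, b, c))


# Hypothetical toy data: two binary "genotype" columns and a phenotype
# that is their XOR, a purely synergistic relationship.
snp1 = [0, 0, 1, 1]
snp2 = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
print(interaction_information(snp1, snp2, phenotype))
```

In the XOR toy case neither column alone carries information about the phenotype, yet the pair determines it fully, which is the kind of non-additive pattern a third-order measure is meant to expose.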
ABSTRACT
Through technological innovations, patient cohorts can be examined from multiple views with high-dimensional, multiscale biomedical data to classify clinical phenotypes and predict outcomes. Here, we aim to present our approach for analyzing multimodal data using unsupervised and supervised sparse linear methods in a COVID-19 patient cohort. This prospective cohort study of 149 adult patients was conducted in a tertiary care academic center. First, we used sparse canonical correlation analysis (CCA) to identify and quantify relationships across different data modalities, including viral genome sequencing, imaging, clinical data, and laboratory results. Then, we used cooperative learning to predict the clinical outcome of COVID-19 patients: Intensive care unit admission. We show that serum biomarkers representing severe disease and acute phase response correlate with original and wavelet radiomics features in the LLL frequency channel (cor(Xu1, Zv1) = 0.596, p value < 0.001). Among radiomics features, histogram-based first-order features reporting the skewness, kurtosis, and uniformity have the lowest negative, whereas entropy-related features have the highest positive coefficients. Moreover, unsupervised analysis of clinical data and laboratory results gives insights into distinct clinical phenotypes. Leveraging the availability of global viral genome databases, we demonstrate that the Word2Vec natural language processing model can be used for viral genome encoding. It not only separates major SARS-CoV-2 variants but also allows the preservation of phylogenetic relationships among them. Our quadruple model using Word2Vec encoding achieves better prediction results in the supervised task. The model yields area under the curve (AUC) and accuracy values of 0.87 and 0.77, respectively. 
Our study illustrates that sparse CCA analysis and cooperative learning are powerful techniques for handling high-dimensional, multimodal data to investigate multivariate associations in unsupervised and supervised tasks.
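For readers less familiar with the area under the curve (AUC) figure reported in this abstract, the following is a small sketch of how an AUC can be computed from predicted scores using the pairwise-ranking definition. The function name and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = admitted to intensive care, 0 = not admitted.
labels = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, predicted))
```

The quadratic pairwise loop is chosen for clarity; for large cohorts a rank-based formulation gives the same value more efficiently.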
ABSTRACT
Introduction: Despite the significant progress in understanding cancer biology, the deduction of metastasis is still a challenge in the clinic. Transcriptional regulation is one of the critical mechanisms underlying cancer development. Even though mRNA, microRNA, and DNA methylation mechanisms have a crucial impact on the metastatic outcome, there are no comprehensive data mining models that combine all transcriptional regulation aspects for metastasis prediction. This study focused on identifying the regulatory impact of genetic biomarkers for monitoring metastatic molecular signatures of melanoma by investigating the consolidated effect of miRNA, mRNA, and DNA methylation. Method: We developed multiple machine learning models to distinguish the metastasis by integrating miRNA, mRNA, and DNA methylation markers. We used the TCGA melanoma dataset to differentiate between metastatic melanoma samples by assessing a set of predictive models. For this purpose, machine learning models using a support vector machine with different kernels, artificial neural networks, random forests, AdaBoost, and Naïve Bayes are compared. An iterative combination of differentially expressed miRNA, mRNA, and methylation signatures is used as a candidate marker to reveal each new biomarker category's impact. In each iteration, the performances of the combined models are calculated. During all comparisons, the choice of the feature selection method and under and oversampling approaches are analyzed. Selected biomarkers of the highest performing models are further analyzed for the biological interpretation of functional enrichment. Results: In the initial model, miRNA biomarkers can identify metastatic melanoma with an 81% F-score. The addition of mRNA markers upon miRNA increased the F-score to 92%. In the final integrated model, the addition of the methylation data resulted in a similar F-score of 92% but produced a stable model with low variance across multiple trials. 
Conclusion: Our results support the role of miRNA regulation in metastatic melanoma as miRNA markers model metastasis outcomes with high accuracy. Moreover, the integrated evaluation of miRNA with mRNA and methylation biomarkers increases the model's power. It populates selected biomarkers on the metastasis-associated pathways of melanoma, such as the "osteoclast", "Rap1 signaling", and "chemokine signaling" pathways. Source Code: https://github.com/aysegul-kt/MelonomaMetastasisPrediction/.
ABSTRACT
Advances in genetic/genomic research and translational studies drive the progress on molecular diagnosis, personalised treatment, and monitoring. Healthcare professionals and governments are encouraged to set administrative regulations and implement structured and interoperable representation to utilise the genetic/genomic data, which will support precision medicine approaches through Health Information Systems (HIS). Clear regulations and careful legislation are also crucial for the security and privacy of genetic/genomic test data. In this article, we present a review of the National Health Information System of Turkey (NHIS-T) about interoperable health data representation for genetic tests. We discuss the content of rules and regulations related to genetic/genomic testing and structured data representation in Turkey. A brief comparison of the Turkish "Law on the Protection of Personal Data" (LPPD) in genetic/genomic data privacy with its counterparts is presented. The final discussion about the shortcomings of Turkey is transferable to health information systems worldwide. Constructing a national reference database and IT infrastructure to enable data integration and exchange between genomic data, metadata, and health records will improve genetics studies' utility and outcomes. The critical success factors behind integration are establishing broadly accepted terminologies and government guidance. Governments should concurrently set a clear and transparent policy defining the legal and ethical framework, workforce training, clinical decision-support tools, public engagement, and education.
Subject(s)
Health Information Systems , Genetic Testing , Genomics , Humans , Privacy , Turkey
ABSTRACT
BACKGROUND: Boron is a prominent part of the human diet and one of the essential trace elements for humans. Dietary boron is mostly transformed into boric acid within the body and has been associated with desirable health outcomes. Non-dietary resources of boron, such as boron-based drugs and occupational exposure, might lead to excessive boron levels in the blood and provoke health adversities. The liver might be particularly sensitive to boron intake with ample evidence suggesting a relation between boron and liver function, although the underlying molecular processes remain largely unknown. METHODS: In order to better understand boron-related metabolism and molecular mechanisms associated with a cytotoxic level of boric acid, the half-maximal inhibitory concentration (IC50) of boric acid for the hepatoma cell line (HepG2) was determined using the XTT assay. Cellular responses following boric acid treatment at this concentration were investigated using genotoxicity assays and microarray hybridizations. Enrichment analyses were carried out to find over-represented biological processes using the list of differentially expressed genes identified within the gene expression analysis. RESULTS: DNA breaks were detected in HepG2 cells treated with 24 mM boric acid, the estimated IC50-level of boric acid. On the other hand, pleiotropic transcriptomic effects, including cell cycle arrest, DNA repair, and apoptosis as well as altered expression of Phase I and Phase II enzymes, amino acid metabolism, and lipid metabolism were discerned in microarray analyses. CONCLUSION: HepG2 cells treated with a growth-inhibitory concentration of boric acid for 24 h exhibited a senescence-like transcriptomic profile along with DNA damage. Further studies might help in understanding the concentration-dependent effects and mechanisms of boric acid.
Subject(s)
Boric Acids/administration & dosage , Boric Acids/pharmacology , Gene Expression Regulation/drug effects , Cell Proliferation/drug effects , DNA Damage , Dose-Response Relationship, Drug , Gene Ontology , Hep G2 Cells , Humans , Mutagenicity Tests
ABSTRACT
Intraflagellar transport (IFT) proteins are essential for cilia assembly and have recently been associated with a number of developmental processes, such as left-right axis specification and limb and neural tube patterning. Genetic studies indicate that IFT proteins are required for Sonic hedgehog (Shh) signaling downstream of the Smoothened and Patched membrane proteins but upstream of the Glioma (Gli) transcription factors. However, the role that IFT proteins play in transduction of Shh signaling and the importance of cilia in this process remain unknown. Here we provide insights into the mechanism by which defects in an IFT protein, Tg737/Polaris, affect Shh signaling in the murine limb bud. Our data show that loss of Tg737 results in altered Gli3 processing that abrogates Gli3-mediated repression of Gli1 transcriptional activity. In contrast to the conclusions drawn from genetic analysis, the activity of Gli1 and truncated forms of Gli3 (Gli3R) are unaffected in Tg737 mutants at the molecular level, indicating that Tg737/Polaris is differentially involved in specific activities of the Gli proteins. Most important, a negative regulator of Shh signaling, Suppressor of fused, and the three full-length Gli transcription factors localize to the distal tip of cilia in addition to the nucleus. Thus, our data support a model where cilia have a direct role in Gli processing and Shh signal transduction.
Subject(s)
Kruppel-Like Transcription Factors/physiology , Nerve Tissue Proteins/physiology , Tumor Suppressor Proteins/metabolism , Animals , Extremities/embryology , Flagella/metabolism , Hedgehog Proteins , Kruppel-Like Transcription Factors/genetics , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL , Models, Biological , Nerve Tissue Proteins/genetics , Trans-Activators/metabolism , Transcription Factors/metabolism , Zinc Finger Protein Gli2 , Zinc Finger Protein Gli3
ABSTRACT
BACKGROUND: Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphisms that are the underlying molecular basis of the bipolar disorders. METHODS: This study was performed on Whole-Genome Association Study of Bipolar Disorder (dbGaP [database of Genotypes and Phenotypes] study accession number: phs000017.v3.p1) data. After preprocessing of the genotyping data, three classification-based data mining methods (ie, random forest, naïve Bayes, and k-nearest neighbor) were performed. Additionally, as a nonparametric, model-free approach, the MDR method was used to evaluate the SNP profiles. The validity of these methods was evaluated using true classification rate, recall (sensitivity), precision (positive predictive value), and F-measure. RESULTS: Random forests, naïve Bayes, and k-nearest neighbors identified 16, 13, and ten candidate SNPs, respectively. Surprisingly, the top six SNPs were reported by all three methods. Random forests and k-nearest neighbors were more successful than naïve Bayes, with recall values >0.95. On the other hand, MDR generated a model with comparable predictive performance based on five SNPs. Although different SNP profiles were identified in MDR compared to the classification-based models, all models mapped SNPs to the DOCK10 gene. CONCLUSION: Three classification-based data mining approaches, random forests, naïve Bayes, and k-nearest neighbors, have prioritized similar SNP profiles as predictors of bipolar disorders, in contrast to MDR, which has found different SNPs through analysis of two-way and three-way interactions. 
The reduced number of associated SNPs discovered by MDR, without loss in the classification performance, would facilitate validation studies and decision support models, and would reduce the cost to develop predictive and diagnostic tests. Nevertheless, we need to emphasize that translation of genomic models to the clinical setting requires models with higher classification performance.
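The validity measures named in this abstract (true classification rate, recall, precision, and F-measure) can be made concrete with a short sketch. The function name and the toy labels below are hypothetical and only illustrate the standard definitions; they do not reproduce the study's data or code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall, precision, and F-measure for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)                 # true classification rate
    recall = tp / (tp + fn) if tp + fn else 0.0       # sensitivity
    precision = tp / (tp + fp) if tp + fp else 0.0    # positive predictive value
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)      # harmonic mean of the two
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f_measure": f_measure}


# Hypothetical example: 1 = case, 0 = control.
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
print(classification_metrics(truth, predicted))
```

Returning zero when a denominator is empty is a convenience choice for this sketch; some libraries raise a warning or return an undefined value instead.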
ABSTRACT
Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism either by activating or inhibiting the host proteins. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on reverse engineering principles. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic dataset. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network conserves the information on temporality and direction of interactions while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells, such as CDC42, RHOA, 14-3-3δ, the Syntaxin family, and Oxysterol-binding proteins, were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. We believe that integrated approaches, such as the one presented here, have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections.
ABSTRACT
Genome-wide association studies (GWAS) determine susceptibility profiles for complex diseases. In this study, GWAS was performed in 26 patients with oligo and rheumatoid factor negative polyarticular juvenile idiopathic arthritis (JIA) and their healthy parents by Affymetrix 250K SNP arrays. Biological function and pathway enrichment analyses were performed. This is the first GWAS reported for JIA families from the eastern Mediterranean population. Enrichment of the FcγR-mediated phagocytosis pathway and response to various stimuli were the leading discoveries, along with the strong interaction of JIA-associated genes with the HLA cluster in the co-expression network. The co-expression network also presented the direct interaction of a gene in the FcγR-mediated phagocytosis pathway, namely GAB2, with BLK, CDH13, IL4R and MICA. The systems biology approach helped us to investigate the interactions between the identified genes and biological pathways and molecular functions, expanding our understanding of JIA pathogenesis at the molecular level.
Subject(s)
Arthritis, Juvenile/genetics , Adolescent , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Pilot Projects
ABSTRACT
BACKGROUND: A personalized medicine approach provides opportunities for predictive and preventive medicine. Using genomic, clinical, environmental, and behavioral data, the tracking and management of individual wellness is possible. A prolific way to carry this personalized approach into routine practices can be accomplished by integrating clinical interpretations of genomic variations into electronic medical records (EMRs)/electronic health records (EHRs). Today, various central EHR infrastructures have been constituted in many countries of the world, including Turkey. OBJECTIVE: As an initial attempt to develop a sophisticated infrastructure, we have concentrated on incorporating the personal single nucleotide polymorphism (SNP) data into the National Health Information System of Turkey (NHIS-T) for disease risk assessment, and evaluated the performance of various predictive models for prostate cancer cases. We present our work as a miniseries containing three parts: (1) an overview of requirements, (2) the incorporation of SNP into the NHIS-T, and (3) an evaluation of SNP data incorporated into the NHIS-T for prostate cancer. METHODS: For the second article of this miniseries, we have analyzed the existing NHIS-T and proposed the possible extensional architectures. In light of the literature survey and characteristics of NHIS-T, we have proposed and argued opportunities and obstacles for a SNP incorporated NHIS-T. A prototype with complementary capabilities (knowledge base and end-user applications) for these architectures has been designed and developed. RESULTS: In the proposed architectures, the clinically relevant personal SNP (CR-SNP) and clinicogenomic associations are shared between central repositories and end-users via the NHIS-T infrastructure. To produce these files, we need to develop a national level clinicogenomic knowledge base. 
Regarding clinicogenomic decision support, we planned to complete interpretation of these associations on the end-user applications. This approach gives us the flexibility to add/update envirobehavioral parameters and family health history that will be monitored or collected by end users. CONCLUSIONS: Our results emphasized that even though the existing NHIS-T messaging infrastructure supports the integration of SNP data and clinicogenomic association, it is critical to develop a national level, accredited knowledge base and better end-user systems for the interpretation of genomic, clinical, and envirobehavioral parameters.
ABSTRACT
BACKGROUND: A personalized medicine approach provides opportunities for predictive and preventive medicine. Using genomic, clinical, environmental, and behavioral data, the tracking and management of individual wellness is possible. A prolific way to carry this personalized approach into routine practices can be accomplished by integrating clinical interpretations of genomic variations into electronic medical records (EMRs)/electronic health records (EHRs). Today, various central EHR infrastructures have been constituted in many countries of the world, including Turkey. OBJECTIVE: As an initial attempt to develop a sophisticated infrastructure, we have concentrated on incorporating the personal single nucleotide polymorphism (SNP) data into the National Health Information System of Turkey (NHIS-T) for disease risk assessment, and evaluated the performance of various predictive models for prostate cancer cases. We present our work as a three part miniseries: (1) an overview of requirements, (2) the incorporation of SNP data into the NHIS-T, and (3) an evaluation of SNP data incorporated into the NHIS-T for prostate cancer. METHODS: In the third article of this miniseries, we have evaluated the proposed complementary capabilities (ie, knowledge base and end-user application) with real data. Before the evaluation phase, clinicogenomic associations about increased prostate cancer risk were extracted from knowledge sources, and published predictive genomic models assessing individual prostate cancer risk were collected. To evaluate complementary capabilities, we also gathered personal SNP data of four prostate cancer cases and fifteen controls. Using these data files, we compared various independent and model-based, prostate cancer risk assessment approaches. RESULTS: Through the extraction and selection processes of SNP-prostate cancer risk associations, we collected 209 independent associations for increased risk of prostate cancer from the studied knowledge sources. 
Also, we gathered six cumulative models and two probabilistic models. Cumulative models and assessment of independent associations did not have impressive results. One of the probabilistic, model-based interpretations was successful compared to the others. In envirobehavioral and clinical evaluations, we found that some comorbidities, in particular, would be useful for evaluating disease risk. Even though we had a very limited dataset, a comparison of performances of different disease models and their implementation with real data as use case scenarios helped us to gain deeper insight into the proposed architecture. CONCLUSIONS: In order to benefit from genomic variation data, existing EHR/EMR systems must be constructed with the capability of tracking and monitoring all aspects of personal health status (genomic, clinical, environmental, etc) in 24/7 situations, and also with the capability of suggesting evidence-based recommendations. A national-level, accredited knowledge base is a top requirement for improved end-user systems interpreting these parameters. Finally, categorization using similar, individual characteristics (SNP patterns, exposure history, etc) may be an effective way to predict disease risks, but this approach needs to be concretized and supported with new studies.
ABSTRACT
Through Genome-Wide Association Studies (GWAS), many Single Nucleotide Polymorphism (SNP)-complex disease relations can be investigated. The output of GWAS can be large in volume and high dimensional, and relations between SNPs, phenotypes, and diseases are most likely to be nonlinear. In order to handle high-volume, high-dimensional data and to be able to find the nonlinear relations, we have utilized data mining approaches and designed a hybrid feature selection model of support vector machine and decision tree. The designed model is tested on prostate cancer data, and for the first time combined genotype and phenotype information is used to increase the diagnostic performance. We were able to select phenotypic features such as ethnicity and body mass index, and SNPs that map to specific genes such as CRR9 and TERT. The performance results of the proposed hybrid model on the prostate cancer dataset, with 90.92% sensitivity and 0.91 area under the ROC curve, show the potential of the approach for prediction and early detection of prostate cancer.
Subject(s)
Databases, Genetic , Models, Biological , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Support Vector Machine , Alleles , Genotype , Humans , Male , Phenotype , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Single Nucleotide Polymorphisms (SNPs) are the most common genomic variations where only a single nucleotide differs between individuals. Individual SNPs and SNP profiles associated with diseases can be utilized as biological markers. However, there is a need to determine the SNP subsets and patients' clinical data that are informative for the diagnosis. Data mining approaches have the highest potential for extracting knowledge from genomic datasets and selecting the representative SNPs as well as the most effective and informative clinical features for the clinical diagnosis of diseases. In this study, we have applied one of the widely used data mining classification methodologies, the "decision tree", for associating the SNP biomarkers and significant clinical data with Alzheimer's disease (AD), which is the most common form of "dementia". Different tree construction parameters have been compared for the optimization, and the most accurate tree for predicting AD is presented.
Subject(s)
Alzheimer Disease/diagnosis , Alzheimer Disease/genetics , Data Mining/methods , Databases, Genetic , Decision Support Systems, Clinical , Diagnosis, Computer-Assisted/methods , Polymorphism, Single Nucleotide/genetics , Genetic Markers/genetics , Genetic Predisposition to Disease/genetics , Humans , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
BACKGROUND: Personalized medicine approaches provide opportunities for predictive and preventive medicine. Using genomic, clinical, environmental, and behavioral data, tracking and management of individual wellness is possible. A prolific way to carry this personalized approach into routine practices can be accomplished by integrating clinical interpretations of genomic variations into electronic medical records (EMRs)/electronic health records (EHRs). Today, various central EHR infrastructures have been constituted in many countries of the world including Turkey. OBJECTIVE: The objective of this study was to concentrate on incorporating the personal single nucleotide polymorphism (SNP) data into the National Health Information System of Turkey (NHIS-T) for disease risk assessment, and evaluate the performance of various predictive models for prostate cancer cases. We present our work as a miniseries containing three parts: (1) an overview of requirements, (2) the incorporation of SNP into the NHIS-T, and (3) an evaluation of SNP incorporated NHIS-T for prostate cancer. METHODS: For the first article of this miniseries, the scientific literature is reviewed and the requirements of SNP data integration into EMRs/EHRs are extracted and presented. RESULTS: In the literature, basic requirements of genomic-enabled EMRs/EHRs are listed as incorporating genotype data and its clinical interpretation into EMRs/EHRs, developing accurate and accessible clinicogenomic interpretation resources (knowledge bases), interpreting and reinterpreting of variant data, and immersing of clinicogenomic information into the medical decision processes. In this section, we have analyzed these requirements under the subtitles of terminology standards, interoperability standards, clinicogenomic knowledge bases, defining clinical significance, and clinicogenomic decision support. 
CONCLUSIONS: In order to integrate structured genotype and phenotype data into any system, there is a need to determine data components, terminology standards, and identifiers of clinicogenomic information. Also, we need to determine interoperability standards to share information between different information systems of stakeholders, and develop decision support capability to interpret genomic variations based on the knowledge bases via different assessment approaches.
ABSTRACT
BACKGROUND/AIM: Despite the rise in type 2 diabetes prevalence worldwide, we do not have a method for early risk prediction. The predictive ability of genetic models has been found to be little or negligible so far. In this study, we aimed to develop a better early risk prediction method for type 2 diabetes. MATERIALS AND METHODS: We used phenotypic and genotypic data from the Nurses' Health Study and Health Professionals' Follow-up Study cohorts and analyzed them by using binary logistic regression. RESULTS: Phenotypic variables yielded 70.7% overall correctness and an area under the curve (AUC) of 0.77. With regard to genotype, 798 single nucleotide polymorphisms with P-values of lower than 1.0E-3 yielded 90.0% correctness and an AUC of 0.965. This is the highest score in the literature, even including the scores obtained with phenotypic variables. The additive contributions of phenotype and genotype increased the overall correctness to 92.9% and the AUC to 0.980. CONCLUSION: Our results showed that genotype could be used to obtain a higher score, which could enable early risk prediction. These findings present new possibilities for genome-wide association study analysis in terms of discovering missing heritability. These results should be confirmed by follow-up studies.
Subject(s)
Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Aged , Cohort Studies , Female , Humans , Logistic Models , Male , Middle Aged , Polymorphism, Single Nucleotide , Predictive Value of Tests , ROC Curve , Risk Assessment
ABSTRACT
Recently, there has been increasing research to discover genomic biomarkers, haplotypes, and potentially other variables that together contribute to the development of diseases. Single Nucleotide Polymorphisms (SNPs) are the most common form of genomic variations and they can represent an individual’s genetic variability in greatest detail. Genome-wide association studies (GWAS) of SNPs, high-dimensional case-control studies, are among the most promising approaches for identifying disease causing variants. METU-SNP software is a Java based integrated desktop application specifically designed for the prioritization of SNP biomarkers and the discovery of genes and pathways related to diseases via analysis of the GWAS case-control data. Outputs of METU-SNP can easily be utilized for the downstream biomarkers research to allow the prediction and the diagnosis of diseases and other personalized medical approaches. Here, we introduce and describe the system functionality and architecture of the METU-SNP. We believe that the METU-SNP will help researchers with the reliable identification of SNPs that are involved in the etiology of complex diseases, ultimately supporting the development of personalized medicine approaches and targeted drug discoveries.
Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Software , Genome-Wide Association Study/instrumentation , Humans , Linkage Disequilibrium
ABSTRACT
The Oak Ridge Polycystic Kidney (ORPK) mouse was described nearly 14 years ago as a model for human recessive polycystic kidney disease. The ORPK mouse arose through integration of a transgene into an intron of the Ift88 gene resulting in a hypomorphic allele (Ift88Tg737Rpw). The Ift88Tg737Rpw mutation impairs intraflagellar transport (IFT), a process required for assembly of motile and immotile cilia. Historically, the primary immotile cilium was thought to have minimal importance for human health; however, a rapidly expanding number of human disorders have now been attributed to ciliary defects. Importantly, many of these phenotypes are present and can be analyzed using the ORPK mouse. In this review, we highlight the research conducted using the ORPK mouse and the phenotypes shared with human cilia disorders. Furthermore, we describe an additional follicular dysplasia phenotype in the ORPK mouse, which alongside the ectodermal dysplasias seen in human Ellis-van Creveld and Sensenbrenner's syndromes, suggests an unappreciated role for primary cilia in the skin and hair follicle.