Results 1 - 20 of 216
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653490

ABSTRACT

Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. Individually, these variants have weak marginal effects and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, the choice of kernel function in KBT can significantly influence type I error control and power, and selecting the optimal kernel remains a statistically challenging task. The few existing methods suffer from inflated type I errors, limited scalability, inferior power or ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.


Subject(s)
Bayes Theorem , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Algorithms , Software , Computational Biology/methods , Genetic Association Studies/methods
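
The composite-kernel idea in the abstract above can be illustrated with a minimal sketch. It assumes a composite kernel built as a convex combination of a linear and a Gaussian kernel, and a generic quadratic-form score statistic on mean-centred phenotypes. The weights, bandwidth, toy data and function names are illustrative assumptions; this is not BayesKAT's implementation, which selects the composite kernel adaptively from the data.

```python
import math

def linear_kernel(G):
    """Linear kernel: K[i][j] = <g_i, g_j> / p for genotype rows g_i."""
    p = len(G[0])
    return [[sum(a * b for a, b in zip(gi, gj)) / p for gj in G] for gi in G]

def gaussian_kernel(G, bandwidth):
    """Gaussian kernel: K[i][j] = exp(-||g_i - g_j||^2 / bandwidth)."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(gi, gj)) / bandwidth)
             for gj in G] for gi in G]

def composite_kernel(kernels, weights):
    """Convex combination of kernel matrices; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def score_statistic(y, K):
    """Quadratic form Q = r^T K r, with r the mean-centred phenotype vector."""
    mean_y = sum(y) / len(y)
    r = [v - mean_y for v in y]
    n = len(y)
    return sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))

# Hypothetical toy data: 4 individuals, 3 SNPs coded as minor-allele counts.
G = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
y = [1.2, 0.7, 1.9, 0.1]

# Equal weights are an arbitrary choice made only for illustration.
K = composite_kernel([linear_kernel(G), gaussian_kernel(G, bandwidth=3.0)],
                     [0.5, 0.5])
Q = score_statistic(y, K)
```

A larger Q indicates that individuals with similar genotypes (as measured by K) tend to have similar phenotype residuals; a real test would compare Q against its null distribution.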
2.
Am J Hum Genet ; 109(5): 802-811, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35421325

ABSTRACT

Heritability is a fundamental concept in genetic studies, measuring the genetic contribution to complex traits and bringing insights about disease mechanisms. The advance of high-throughput technologies has provided many resources for heritability estimation. Linkage disequilibrium (LD) score regression (LDSC) estimates both heritability and confounding biases, such as cryptic relatedness and population stratification, among single-nucleotide polymorphisms (SNPs) by using only summary statistics released from genome-wide association studies. However, only partial information in the LD matrix is utilized in LDSC, leading to loss in precision. In this study, we propose LD eigenvalue regression (LDER), an extension of LDSC, by making full use of the LD information. Compared to state-of-the-art heritability estimating methods, LDER provides more accurate estimates of SNP heritability and better distinguishes the inflation caused by polygenicity and confounding effects. We demonstrate the advantages of LDER both theoretically and with extensive simulations. We applied LDER to 814 complex traits from UK Biobank, and LDER identified 363 significantly heritable phenotypes, among which 97 were not identified by LDSC.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Models, Genetic , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics
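
For orientation, the LDSC baseline that LDER extends can be sketched as a regression of per-SNP chi-square statistics on LD scores, using the standard LDSC relation E[chi2_j] = (N * h2 / M) * l_j + intercept. The sketch below is a simplified, unweighted least-squares version on noise-free toy data. The function names and numbers are illustrative assumptions, and neither the weighting used by the LDSC software nor the eigenvalue-based approach of LDER is reproduced here.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ldsc_estimate(ld_scores, chi2, n_samples, n_snps):
    """Return (h2, intercept) from an unweighted LD score regression.

    The slope equals N * h2 / M, so h2 = slope * M / N. An intercept
    above 1 suggests confounding rather than polygenicity.
    """
    slope, intercept = ols_fit(ld_scores, chi2)
    return slope * n_snps / n_samples, intercept

# Hypothetical noise-free data generated with h2 = 0.3 and intercept = 1.05.
N, M = 50_000, 1_000_000
ld_scores = [10.0, 50.0, 100.0, 200.0, 400.0]
chi2 = [N * 0.3 / M * l + 1.05 for l in ld_scores]

h2_hat, intercept_hat = ldsc_estimate(ld_scores, chi2, N, M)
```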
3.
Am J Hum Genet ; 109(10): 1742-1760, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36152628

ABSTRACT

Complex traits are influenced by genetic risk factors, lifestyle, and environmental variables, so-called exposures. Some exposures, e.g., smoking or lipid levels, have common genetic modifiers identified in genome-wide association studies. Because direct measurement is often infeasible, exposure polygenic risk scores (ExPRSs) offer an alternative way to study the influence of exposures on various phenotypes. Here, we collected publicly available summary statistics for 28 exposures and applied four common PRS methods to generate ExPRSs in two large biobanks: the Michigan Genomics Initiative and the UK Biobank. We established ExPRSs for 27 exposures and demonstrated their applicability in phenome-wide association studies and as predictors for common chronic conditions. In particular, for several chronic conditions, adding multiple ExPRSs improved prediction compared with models that included only traditional, disease-focused PRSs. To facilitate follow-up studies, we share all ExPRS constructs and generated results via an online repository called ExPRSweb.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Lipids , Multifactorial Inheritance/genetics , Risk Factors
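
At its core, a polygenic risk score is a weighted sum of allele dosages, with weights taken from GWAS summary statistics. The sketch below shows only that basic sum. The SNP identifiers, weights and dosages are hypothetical, and the four PRS methods used in the study (which the abstract does not name) differ in how they derive the weights.

```python
def polygenic_score(dosages, weights):
    """Sum of weight * dosage over SNPs present in both inputs.

    dosages: SNP id -> effect-allele count for one individual (0, 1 or 2,
             or a fractional imputed dosage).
    weights: SNP id -> per-allele effect size from summary statistics.
    SNPs without a weight are ignored.
    """
    return sum(weights[snp] * d for snp, d in dosages.items() if snp in weights)

# Hypothetical weights and one individual's dosages.
weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}
dosages = {"snpA": 2, "snpB": 1, "snpC": 0, "snpD": 1}

score = polygenic_score(dosages, weights)  # 2*0.10 + 1*(-0.05) + 0*0.20
```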
4.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36151749

ABSTRACT

Currently, there exist no generally accepted strategies for evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross validations and case studies seem to be must-have procedures, the value of K, the evaluation metrics, and the choice of query diseases, as well as the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports), are all determined on a case-by-case basis, depending on the researchers' choices. In the current review, we include a comprehensive analysis of how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.


Subject(s)
MicroRNAs , MicroRNAs/genetics , Computational Biology/methods , Algorithms , Computer Simulation
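
The K-fold cross validation mentioned above can be sketched as follows: the items (for example, known miRNA-disease pairs) are shuffled once and split into K folds, and each fold serves as the test set exactly once. This is a generic illustration; the function name, the seed and the choice of K = 5 are assumptions, not the workflow recommended in the review.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Yield (train, test) index lists for K-fold cross validation."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Hypothetical example: 10 labelled associations, 5 folds.
splits = list(k_fold_indices(10, 5))
```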
5.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36094095

ABSTRACT

MicroRNAs (miRNAs) are gene regulators involved in the pathogenesis of complex diseases such as cancers, and thus serve as potential diagnostic markers and therapeutic targets. The prerequisite for designing effective miRNA therapies is accurate discovery of miRNA-disease associations (MDAs), which has attracted substantial research interest during the last 15 years, as reflected by more than 55 000 related entries available on PubMed. Abundant experimental data gathered from the wealth of literature could effectively support the development of computational models for predicting novel associations. In 2017, Chen et al. published the first-ever comprehensive review on MDA prediction, presenting various relevant databases, 20 representative computational models, and suggestions for building more powerful ones. In the current review, as the continuation of the previous study, we revisit miRNA biogenesis, detection techniques and functions; summarize recent experimental findings related to common miRNA-associated diseases; introduce recent updates of miRNA-relevant databases and novel database releases since 2017; present mainstream webservers and new webserver releases since 2017; and finally elaborate on how fusion of diverse data sources has contributed to accurate MDA prediction.


Subject(s)
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , Databases, Genetic , Neoplasms/genetics , PubMed , Computational Biology/methods , Genetic Predisposition to Disease , Algorithms
6.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36056743

ABSTRACT

Since the problem was proposed in the late 2000s, microRNA-disease association (MDA) predictions have been implemented based on the data fusion paradigm. Integrating diverse data sources offers a more comprehensive research perspective but poses a challenge to algorithm design for generating accurate, concise and consistent representations of the fused data. After more than a decade of research progress, a relatively simple algorithm like the score function or a single computation layer may no longer be sufficient for further improving predictive performance. Advanced model design has become more frequent in recent years, particularly in the form of reasonably combining multiple algorithms, a process known as model fusion. In the current review, we present 29 state-of-the-art models and introduce the taxonomy of computational models for MDA prediction based on model fusion and non-fusion. The new taxonomy exhibits notable changes in the algorithmic architecture of models, compared with that of earlier ones in the 2017 review by Chen et al. Moreover, we discuss the progress that has been made towards overcoming the obstacles to effective MDA prediction since 2017 and elaborate on how future models can be designed according to a set of new schemas. Lastly, we analyse the strengths and weaknesses of each model category in the proposed taxonomy and propose future research directions from diverse perspectives for enhancing model performance.


Subject(s)
MicroRNAs , Algorithms , Computational Biology , Computer Simulation , MicroRNAs/genetics
7.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34676391

ABSTRACT

Circular RNAs (circRNAs) are a category of newly discovered competing endogenous non-coding RNAs that have been shown to be implicated in many complex human diseases. A large number of circRNAs have been confirmed to be involved in cancer progression and are expected to become promising biomarkers for tumor diagnosis and targeted therapy. Deciphering the underlying relationships between circRNAs and diseases may provide new insights into the pathogenesis of complex diseases and further characterize the biological functions of circRNAs. As traditional experimental methods are usually time-consuming and laborious, computational models have made significant progress in systematically exploring potential circRNA-disease associations, which not only creates new opportunities for investigating pathogenic mechanisms at the level of circRNAs, but also helps to significantly improve the efficiency of clinical trials. In this review, we first summarize the functions and characteristics of circRNAs and introduce some representative circRNAs related to tumorigenesis. Then, we investigate the available databases and tools dedicated to circRNA and disease studies. Next, we present a comprehensive review of computational methods for predicting circRNA-disease associations and classify them into five categories: network propagation-based, path-based, matrix factorization-based, deep learning-based and other machine learning methods. Finally, we discuss the challenges and future research directions in this field.


Subject(s)
Neoplasms , RNA, Circular , Algorithms , Computational Biology/methods , Humans , Machine Learning , Neoplasms/genetics
8.
Ann Rheum Dis ; 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39079893

ABSTRACT

OBJECTIVES: Hypocomplementaemia is common in patients with IgG4-related disease (IgG4-RD). We aimed to determine the IgG4-RD features associated with hypocomplementaemia and investigate mechanisms of complement activation in this disease. METHODS: We performed a single-centre cross-sectional study of 279 patients who fulfilled the IgG4-RD classification criteria, using unadjusted and multivariable-adjusted logistic regression to identify factors associated with hypocomplementaemia. RESULTS: Hypocomplementaemia was observed in 90 (32%) patients. In the unadjusted model, the number of organs involved (OR 1.42, 95% CI 1.23 to 1.63) and involvement of the lymph nodes (OR 3.87, 95% CI 2.19 to 6.86), lungs (OR 3.81, 95% CI 2.10 to 6.89), pancreas (OR 1.66, 95% CI 1.001 to 2.76), liver (OR 2.73, 95% CI 1.17 to 6.36) and kidneys (OR 2.48, 95% CI 1.47 to 4.18) were each associated with hypocomplementaemia. After adjusting for age, sex and number of organs involved, only lymph node (OR 2.59, 95% CI 1.36 to 4.91) and lung (OR 2.56, 95% CI 1.35 to 4.89) involvement remained associated with hypocomplementaemia while the association with renal involvement was attenuated (OR 1.6, 95% CI 0.92 to 2.98). Fibrotic disease manifestations (OR 0.43, 95% CI 0.21 to 0.87) and lacrimal gland involvement (OR 0.53, 95% CI 0.28 to 0.999) were inversely associated with hypocomplementaemia in the adjusted analysis. Hypocomplementaemia was associated with higher concentrations of all IgG subclasses and IgE (all p<0.05). After adjusting for serum IgG1 and IgG3, only IgG1 but not IgG4 remained strongly associated with hypocomplementaemia. CONCLUSIONS: Hypocomplementaemia in IgG4-RD is not unique to patients with renal involvement and may reflect the extent of disease. IgG1 independently correlates with hypocomplementaemia in IgG4-RD, but IgG4 does not. Complement activation is likely involved in IgG4-RD pathophysiology.
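
The unadjusted odds ratios and 95% confidence intervals reported above are of the kind obtained from a 2x2 table. A minimal sketch, assuming the standard Wald interval on the log odds ratio: the counts below are hypothetical and are not taken from the study, and the adjusted estimates in the abstract come from multivariable logistic regression, which this sketch does not reproduce.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: feature present, outcome present   b: feature present, outcome absent
    c: feature absent, outcome present    d: feature absent, outcome absent
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts chosen only to illustrate the arithmetic.
or_value, ci_lower, ci_upper = odds_ratio_wald_ci(20, 10, 10, 20)
```

An interval that excludes 1 corresponds to an association that is significant at the matching level, which is how the intervals in the abstract are read.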

9.
Ann Rheum Dis ; 83(5): 550-555, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38413169

ABSTRACT

A hallmark of rheumatoid arthritis (RA) is the increased levels of autoantibodies preceding the onset and contributing to the classification of the disease. These autoantibodies, mainly anti-citrullinated protein antibody (ACPA) and rheumatoid factor, have been assumed to be pathogenic and many attempts have been made to link them to the development of bone erosion, pain and arthritis. We and others have recently discovered that most cloned ACPA protect against experimental arthritis in the mouse. In addition, we have identified suppressor B cells in healthy individuals, selected in response to collagen type II, and these cells decrease in numbers in RA. These findings provide a new angle on how to explain the development of RA and maybe also other complex autoimmune diseases preceded by an increased autoimmune response.


Subject(s)
Arthritis, Rheumatoid , Autoimmune Diseases , Animals , Mice , Autoimmunity , Autoantibodies , Anti-Citrullinated Protein Antibodies
10.
J Transl Med ; 22(1): 599, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937846

ABSTRACT

BACKGROUND: Patient heterogeneity poses significant challenges for managing individuals and designing clinical trials, especially in complex diseases. Existing classifications rely on outcome-predicting scores, potentially overlooking crucial elements contributing to heterogeneity without necessarily impacting prognosis. METHODS: To address patient heterogeneity, we developed ClustALL, a computational pipeline that simultaneously faces diverse clinical data challenges like mixed types, missing values, and collinearity. ClustALL enables the unsupervised identification of patient stratifications while filtering for stratifications that are robust against minor variations in the population (population-based) and against limited adjustments in the algorithm's parameters (parameter-based). RESULTS: Applied to a European cohort of patients with acutely decompensated cirrhosis (n = 766), ClustALL identified five robust stratifications, using only data at hospital admission. All stratifications included markers of impaired liver function and number of organ dysfunction or failure, and most included precipitating events. When focusing on one of these stratifications, patients were categorized into three clusters characterized by typical clinical features; notably, the 3-cluster stratification showed a prognostic value. Re-assessment of patient stratification during follow-up delineated patients' outcomes, with further improvement of the prognostic value of the stratification. We validated these findings in an independent prospective multicentre cohort of patients from Latin America (n = 580). CONCLUSIONS: By applying ClustALL to patients with acutely decompensated cirrhosis, we identified three patient clusters. Following these clusters over time offers insights that could guide future clinical trial design. ClustALL is a novel and robust stratification method capable of addressing the multiple challenges of patient stratification in most complex diseases.


Subject(s)
Liver Cirrhosis , Humans , Male , Female , Cluster Analysis , Middle Aged , Prognosis , Acute Disease , Algorithms , Aged , Cohort Studies
11.
Trends Genet ; 36(12): 951-966, 2020 12.
Article in English | MEDLINE | ID: mdl-32868128

ABSTRACT

Single-cell multimodal omics (scMulti-omics) technologies have made it possible to trace cellular lineages during differentiation and to identify new cell types in heterogeneous cell populations. The derived information is especially promising for computing cell-type-specific biological networks encoded in complex diseases and improving our understanding of the underlying gene regulatory mechanisms. The integration of these networks could, therefore, give rise to a heterogeneous regulatory landscape (HRL) in support of disease diagnosis and drug therapeutics. In this review, we provide an overview of this field and pay particular attention to how diverse biological networks can be inferred in a specific cell type based on integrative methods. Then, we discuss how HRL can advance our understanding of regulatory mechanisms underlying complex diseases and aid in the prediction of prognosis and therapeutic responses. Finally, we outline challenges and future trends that will be central to bringing the field of HRL in complex diseases forward.


Subject(s)
Computational Biology/methods , Disease/genetics , Gene Regulatory Networks , Single-Cell Analysis/methods , Animals , Humans
12.
J Intern Med ; 294(4): 378-396, 2023 10.
Article in English | MEDLINE | ID: mdl-37093654

ABSTRACT

Complex diseases are caused by a combination of genetic, lifestyle, and environmental factors and comprise common noncommunicable diseases, including allergies, cardiovascular disease, and psychiatric and metabolic disorders. More than 25% of Europeans suffer from a complex disease, and together these diseases account for 70% of all deaths. The use of genomic, molecular, or imaging data to develop accurate diagnostic tools for treatment recommendations and preventive strategies, and for disease prognosis and prediction, is an important step toward precision medicine. However, for complex diseases, precision medicine is associated with several challenges. There is significant heterogeneity between patients with a specific disease (both with regard to symptoms and underlying causal mechanisms), and the number of underlying genetic and nongenetic risk factors is often high. Here, we summarize precision medicine approaches for complex diseases and highlight the current breakthroughs as well as the challenges. We conclude that genomic-based precision medicine has been used mainly for patients with highly penetrant monogenic disease forms, such as cardiomyopathies. For most complex diseases, including psychiatric disorders and allergies, available polygenic risk scores are more probabilistic than deterministic and have not yet been validated for clinical utility. Nevertheless, subclassifying patients with a specific disease into discrete homogeneous subtypes based on molecular or phenotypic data is a promising strategy for improving diagnosis, prediction, treatment, prevention, and prognosis. The availability of high-throughput molecular technologies, together with large collections of health data and novel data-driven approaches, offers promise toward improved individual health through precision medicine.


Subject(s)
Mental Disorders , Precision Medicine , Humans , Precision Medicine/methods , Genomics/methods , Risk Factors
13.
Ann Rheum Dis ; 82(5): 585-593, 2023 05.
Article in English | MEDLINE | ID: mdl-36535746

ABSTRACT

Immune deposits/complexes are detected in a multitude of tissues in autoimmune disorders, but no organ has attracted as much attention as the kidney. Several kidney diseases are characterised by the presence of specific configurations of such deposits, and many of them are under a 'shared care' between rheumatologists and nephrologists. This review focuses on five different diseases commonly encountered in rheumatological and nephrological practice, namely IgA vasculitis, lupus nephritis, cryoglobulinaemia, anti-glomerular basement membrane disease and anti-neutrophil cytoplasm-antibody glomerulonephritis. They differ in disease aetiopathogenesis, but also the potential speed of kidney function decline, the responsiveness to immunosuppression/immunomodulation and the deposition of immune deposits/complexes. To date, it remains unclear if deposits are causing a specific disease or aim to abrogate inflammatory cascades responsible for tissue damage, such as neutrophil extracellular traps or the complement system. In principle, immunosuppressive therapies have not been developed to tackle immune deposits/complexes, and repeated kidney biopsy studies found persistence of deposits despite reduction of active inflammation, again highlighting the uncertainty about their involvement in tissue damage. In these studies, a progression of active lesions to chronic changes such as glomerulosclerosis was frequently reported. Novel therapeutic approaches aim to mitigate these changes more efficiently and rapidly. Several new agents, such as avacopan, an oral C5aR1 inhibitor, or imlifidase, that dissolves IgG within minutes, are more specifically reducing inflammatory cascades in the kidney and repeat tissue sampling might help to understand their impact on immune cell deposition and finally kidney function recovery and potential impact of immune complexes/deposits.


Subject(s)
Glomerulonephritis , Kidney Diseases , Lupus Nephritis , Humans , Kidney/pathology , Kidney Diseases/diagnosis , Kidney Diseases/etiology , Lupus Nephritis/pathology , Glomerulonephritis/pathology , Antigen-Antibody Complex
14.
Ann Rheum Dis ; 82(10): 1248-1257, 2023 10.
Article in English | MEDLINE | ID: mdl-37495237

ABSTRACT

OBJECTIVE: Calcium pyrophosphate deposition (CPPD) disease is prevalent and has diverse presentations, but there are no validated classification criteria for this symptomatic arthritis. The American College of Rheumatology (ACR) and EULAR have developed the first-ever validated classification criteria for symptomatic CPPD disease. METHODS: Supported by the ACR and EULAR, a multinational group of investigators followed established methodology to develop these disease classification criteria. The group generated lists of candidate items and refined their definitions, collected de-identified patient profiles, evaluated strengths of associations between candidate items and CPPD disease, developed a classification criteria framework, and used multi-criterion decision analysis to define criteria weights and a classification threshold score. The criteria were validated in an independent cohort. RESULTS: Among patients with joint pain, swelling, or tenderness (entry criterion) whose symptoms are not fully explained by an alternative disease (exclusion criterion), the presence of crowned dens syndrome or calcium pyrophosphate crystals in synovial fluid is sufficient to classify a patient as having CPPD disease. In the absence of these findings, a score >56 points using weighted criteria, comprising clinical features, associated metabolic disorders, and results of laboratory and imaging investigations, can be used to classify a patient as having CPPD disease. These criteria had a sensitivity of 92.2% and specificity of 87.9% in the derivation cohort (190 CPPD cases, 148 mimickers), whereas sensitivity was 99.2% and specificity was 92.5% in the validation cohort (251 CPPD cases, 162 mimickers). CONCLUSION: The 2023 ACR/EULAR CPPD disease classification criteria have excellent performance characteristics and will facilitate research in this field.


Subject(s)
Calcinosis , Chondrocalcinosis , Rheumatology , Humans , United States , Chondrocalcinosis/diagnostic imaging , Calcium Pyrophosphate , Syndrome
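
The decision logic described in the RESULTS above (entry criterion, exclusion criterion, sufficient criteria, then a weighted score above 56) can be written as a short rule. This is a sketch of the logic as stated in the abstract only. The function and parameter names are assumptions, and the individual item weights are not given in the abstract, so the weighted score is taken as an input.

```python
def classify_cppd(entry_criterion_met, explained_by_other_disease,
                  crowned_dens_syndrome, cpp_crystals_in_synovial_fluid,
                  weighted_score):
    """Return True if a patient would be classified as having CPPD disease."""
    # Entry criterion: joint pain, swelling, or tenderness.
    # Exclusion criterion: symptoms fully explained by an alternative disease.
    if not entry_criterion_met or explained_by_other_disease:
        return False
    # Either finding alone is sufficient for classification.
    if crowned_dens_syndrome or cpp_crystals_in_synovial_fluid:
        return True
    # Otherwise the weighted criteria score must exceed 56 points.
    return weighted_score > 56

# Hypothetical patients.
by_crystals = classify_cppd(True, False, False, True, weighted_score=0)
by_score = classify_cppd(True, False, False, False, weighted_score=60)
at_threshold = classify_cppd(True, False, False, False, weighted_score=56)
excluded = classify_cppd(True, True, True, True, weighted_score=90)
```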
15.
Ann Rheum Dis ; 82(10): 1315-1327, 2023 10.
Article in English | MEDLINE | ID: mdl-37365013

ABSTRACT

OBJECTIVE: Whereas genetic susceptibility for systemic lupus erythematosus (SLE) has been well explored, the triggers for clinical disease flares remain elusive. To investigate relationships between microbiota community resilience and disease activity, we performed the first longitudinal analyses of lupus gut-microbiota communities. METHODS: In an observational study, taxonomic analyses, including multivariate analysis of β-diversity, assessed time-dependent alterations in faecal communities from patients and healthy controls. From gut blooms, strains were isolated, with genomes and associated glycans analysed. RESULTS: Multivariate analyses documented that, unlike healthy controls, significant temporal community-wide ecological microbiota instability was common in SLE patients, and transient intestinal growth spikes of several pathogenic species were documented. Expansions of only the anaerobic commensal, Ruminococcus (Blautia) gnavus (RG) occurred at times of high-disease activity, and were detected in almost half of patients during lupus nephritis (LN) disease flares. Whole genome sequence analysis of RG strains isolated during these flares documented 34 genes postulated to aid adaptation and expansion within a host with an inflammatory condition. Yet, the most specific feature of strains found during lupus flares was the common expression of a novel type of cell membrane-associated lipoglycan. These lipoglycans share conserved structural features documented by mass spectroscopy, and highly immunogenic repetitive antigenic-determinants, recognised by high-level serum IgG2 antibodies, that spontaneously arose, concurrent with RG blooms and lupus flares. CONCLUSIONS: Our findings rationalise how blooms of the RG pathobiont may be common drivers of clinical flares of often remitting-relapsing lupus disease, and highlight the potential pathogenic properties of specific strains isolated from active LN patients.


Subject(s)
Gastrointestinal Microbiome , Lupus Erythematosus, Systemic , Lupus Nephritis , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Symptom Flare Up , Feces , Lupus Nephritis/genetics
16.
Methods ; 198: 56-64, 2022 02.
Article in English | MEDLINE | ID: mdl-34364986

ABSTRACT

Complex diseases are caused by a variety of factors, and their diagnosis, treatment and prognosis are usually difficult. Proteins play an indispensable role in living organisms and perform specific biological functions by interacting with other proteins or biomolecules. Because their dysfunction may lead to diseases, mining disease-related biomarkers from protein-protein interaction networks is a natural approach. AUC, the area under the receiver operating characteristic (ROC) curve, is regarded as a gold standard for evaluating the effectiveness of a binary classifier, as it measures the classification ability of an algorithm under arbitrary distribution or any misclassification cost. In this study, we have proposed a network-based multi-biomarker identification method by AUC optimization (NetAUC), which integrates gene expression and network information to identify biomarkers for complex disease analysis. The main purpose is to optimize two objectives simultaneously: maximizing AUC and minimizing the number of selected features. We have applied NetAUC to two types of disease analysis: 1) prognosis of breast cancer, 2) classification of similar diseases. The results show that NetAUC can identify a small panel of disease-related biomarkers with strong classification ability and functional interpretability.


Subject(s)
Algorithms , Breast Neoplasms , Area Under Curve , Biomarkers , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Female , Humans , ROC Curve
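
The AUC used as an objective above has a simple rank interpretation: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores and labels are hypothetical, and the network-based feature selection of NetAUC is not reproduced.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5).

    scores: classifier outputs, higher meaning more likely positive.
    labels: 1 for positive samples, 0 for negative samples.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical scores: 5 of the 6 positive-negative pairs are ranked correctly.
scores = [0.9, 0.4, 0.35, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
value = auc(scores, labels)
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; sorting-based computations are preferable for large data sets.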
17.
Sensors (Basel) ; 23(9)2023 May 01.
Article in English | MEDLINE | ID: mdl-37177642

ABSTRACT

Genome-wide association studies have proven their ability to improve human health outcomes by identifying genotypes associated with phenotypes. Various works have attempted to predict the risk of diseases for individuals based on genotype data. This prediction can either be considered as an analysis model that can lead to a better understanding of gene functions that underlie human disease or as a black box in order to be used in decision support systems and in early disease detection. Deep learning techniques have gained more popularity recently. In this work, we propose a deep-learning framework for disease risk prediction. The proposed framework employs a multilayer perceptron (MLP) in order to predict individuals' disease status. The proposed framework was applied to the Wellcome Trust Case-Control Consortium (WTCCC), the UK National Blood Service (NBS) Control Group, and the 1958 British Birth Cohort (58C) datasets. The performance comparison of the proposed framework showed that the proposed approach outperformed the other methods in predicting disease risk, achieving an area under the curve (AUC) up to 0.94.


Subject(s)
Deep Learning , Humans , Genome-Wide Association Study , Neural Networks, Computer , Genotype , Genomics
18.
Genet Epidemiol ; 45(2): 222-234, 2021 03.
Article in English | MEDLINE | ID: mdl-33231893

ABSTRACT

Though additive forms of heritability are primarily studied in genetics, nonlinear, non-additive gene-gene interactions, that is, epistasis, could explain a portion of the missing heritability in complex human diseases including cancer. In recent years, powerful computational methods have been introduced to understand multivariable genetic factors of these complex human diseases in extremely high-dimensional genome-wide data. In this study, we investigated the performance of three powerful methods, BOolean Operation-based Screening and Testing (BOOST), FastEpistasis, and Tree-based Epistasis Association Mapping (TEAM) to identify interacting genetic risk factors of colorectal cancer (CRC) for genome-wide association studies (GWAS). After quality-control based data preprocessing, we applied these three algorithms to a CRC GWAS data set, and selected the top-ranked 100 single-nucleotide polymorphism (SNP) pairs identified by each method (251 SNPs in total), among which 74 pairs were common between FastEpistasis and BOOST. The identified SNPs by BOOST, FastEpistasis, and TEAM mapped to 58, 57, and 62 genes, respectively. Some genes highlighted by our study, including MACF1, USP49, SMAD2, SMAD3, TGFBR1, and RHOA, have been detected in previous CRC-related research. We also identified some new genes with potential biological relevance to CRC such as CCDC32. Furthermore, we constructed the network of these top SNP pairs for three methods, and the patterns identified in the networks show that some SNPs including rs2412531, rs349699, and rs17142011 play a crucial role in the classification of disease status in our study.


Subject(s)
Colorectal Neoplasms , Epistasis, Genetic , Algorithms , Colorectal Neoplasms/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Microfilament Proteins , Polymorphism, Single Nucleotide , Ubiquitin Thiolesterase
19.
Genet Epidemiol ; 45(5): 455-470, 2021 07.
Article in English | MEDLINE | ID: mdl-33645812

ABSTRACT

Genetic studies of two related survival outcomes of a pleiotropic gene are commonly encountered but statistical models to analyze them are rarely developed. To analyze sequencing data, we propose mixed effect Cox proportional hazard models by functional regressions to perform gene-based joint association analysis of two survival traits motivated by our ongoing real studies. These models extend fixed effect Cox models of univariate survival traits by incorporating variations and correlation of multivariate survival traits into the models. The associations between genetic variants and two survival traits are tested by likelihood ratio test statistics. Extensive simulation studies suggest that type I error rates are well controlled and power performances are stable. The proposed models are applied to analyze bivariate survival traits of left and right eyes in the age-related macular degeneration progression.


Subject(s)
Eye Diseases , Genetic Variation , Eye Diseases/genetics , Genetic Association Studies , Humans , Models, Genetic , Phenotype
20.
Brief Bioinform ; 21(2): 429-440, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30698665

ABSTRACT

Biological complex systems are composed of numerous components that interact within and across different scales. The ever-increasing generation of high-throughput biomedical data has given us an opportunity to develop a quantitative model of nonlinear biological systems having implications in health and diseases. Multidimensional molecular data can be modeled using various statistical methods at different scales of biological organization, such as genome, transcriptome and proteome. I will discuss recent advances in the application of computational medicine in complex diseases such as network-based studies, genome-scale metabolic modeling, kinetic modeling and support vector machines with specific examples in the field of cancer, psychiatric disorders and type 2 diabetes. The recent advances in translating these computational models in diagnosis and identification of drug targets of complex diseases are discussed, as well as the challenges researchers and clinicians are facing in taking computational medicine from the bench to bedside.


Subject(s)
Computational Biology/methods , Diabetes Mellitus, Type 2/genetics , Mental Disorders/genetics , Neoplasms/genetics , Algorithms , Genomics , Humans , Medicine/methods