Results 1 - 20 of 252
1.
Expert Rev Med Devices ; 20(10): 851-864, 2023.
Article in English | MEDLINE | ID: mdl-37522639

ABSTRACT

BACKGROUND: Proper maintenance of electro-medical devices is crucial for the quality of care to patients and the economic performance of healthcare organizations. This research aims to identify the interaction between ultrasound scanner (US) maintenance variables as a function of maintenance indicators: US in service or decommissioned, excessive number of failures, and failure rate. Knowing those interactions, specific maintenance measures will be developed to improve the reliability of the US. RESEARCH DESIGN AND METHODS: The Multifactor Dimensionality Reduction (MDR) method was employed to analyze data from 222 US and their four-year maintenance history. Models were developed based on the variables with the greatest influence on maintenance indicators, and US were classified according to the associated risk. RESULTS: US with more than one major failure or at least one major component replacement had up to 496.4% more failures than the average. Failure rate increased by up to 188.7% over the average for those US with more than three moderate failures, three replacements, or both. CONCLUSIONS: This study identifies and quantifies the causes of risk to establish a specific maintenance plan for US. It helps to better understand the degradation of US to optimize their operation and maintenance.


Subject(s)
Models, Genetic , Multifactor Dimensionality Reduction , Humans , Multifactor Dimensionality Reduction/methods , Reproducibility of Results , Ultrasonography
2.
Mol Neurobiol ; 60(8): 4731-4737, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37148523

ABSTRACT

Dementia is a multifactorial disease in which environmental, lifestyle, and genetic factors intervene. Population studies have been used to look for the susceptibility genes for this disease. Since the activity of dopamine β-hydroxylase (DβH) is reduced in the hippocampus and neocortex of the brain, changes in the physiological status of dopamine induced by this enzyme have been reported in Alzheimer's disease (AD). DBH polymorphisms have therefore been associated with susceptibility to some neurological diseases such as AD, but few studies have investigated the relationship between these polymorphisms and other types of dementia, especially in Mexican populations. The aim of this study was to evaluate the association between a single-nucleotide polymorphism (SNP) in the dopamine β-hydroxylase (DBH) gene (rs1611115), its interactions with environmental factors, and dementia risk. We examined the genotype of the DBH (rs1611115) polymorphism in patients with dementia and healthy controls. The interaction and the impact of the DBH (rs1611115) polymorphism on dementia were examined through multifactor dimensionality reduction (MDR) analysis, and the results were verified by the Chi-square test. Hardy-Weinberg equilibrium (HWE) was also checked by the Chi-square test. The relative risk was expressed as an odds ratio (OR) with a 95% confidence interval (CI). A total of 221 dementia patients and 534 controls met the inclusion criteria for the MDR analyses. The results of the MDR analysis showed that the development of dementia was positively correlated with the interaction between the TT genotype of the DBH1 locus rs1611115 and diabetes, hypertension, and alcohol consumption (OR = 6.5; 95% CI = 4.5-9.5), originating further cognitive damage. These findings provide insight into the positive correlation between metabolic and cardiovascular disorders and the presence of the T allele, under a recessive model of the DBH rs1611115 polymorphism, with susceptibility to dementia.
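As a rough illustration of how an odds ratio and its 95% confidence interval are obtained from a 2x2 exposure table, the following Python sketch uses the standard log-odds (Woolf) approximation. The function name and the counts in the usage note are hypothetical and are not taken from the study, whose exact procedure is not described in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald-type confidence interval on the log scale.

    a: exposed cases,    b: unexposed cases,
    c: exposed controls, d: unexposed controls.
    All four counts must be positive for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, odds_ratio_ci(20, 10, 10, 20) gives an odds ratio of 4.0 with an interval that excludes 1, which is how an association would be read as significant at the 5% level.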


Subject(s)
Dementia , Dopamine beta-Hydroxylase , Humans , Dopamine beta-Hydroxylase/genetics , Dopamine , Multifactor Dimensionality Reduction , Polymorphism, Single Nucleotide/genetics , Genotype , Dementia/genetics , Genetic Predisposition to Disease
3.
Asian Pac J Cancer Prev ; 24(4): 1231-1237, 2023 Apr 01.
Article in English | MEDLINE | ID: mdl-37116145

ABSTRACT

BACKGROUND: The present study investigated the association of interactions between gene polymorphisms in metabolic 'caretaker' genes (Phase I: CYP1A1, CYP2E1; Phase II: GSTM1, GSTT1), the cell cycle regulatory gene, p53, along with its negative controller, MDM-2, and the environment variable (tobacco). A nonparametric model, multifactor dimensionality reduction (MDR), was applied to analyse these interactions. MATERIALS AND METHODS: This case-control study was carried out on 242 subjects. Genomic DNA was extracted from peripheral blood lymphocytes. Eleven gene variants together with an exposure variable (tobacco use) were analysed using MDR to identify the best locus model for gene-gene and gene-environment interactions. Statistical significance was evaluated with a 1000-fold permutation test using MDR permutation testing software (version 1.0 beta 2). A value of p < 0.05 was considered statistically significant. RESULTS: The best three-locus model for gene-gene interaction included two of the p53 gene polymorphisms, rs17878362 (intron 3) and rs1042522 (exon 4), and rs6413432 in the Phase I gene, CYP2E1 (DraI). The three-locus model to evaluate the gene-environment interaction included two intronic polymorphisms of the p53 gene, that is, rs17878362 (intron 3) and rs1625895 (intron 6), and rs4646903 in the Phase I gene CYP1A1*2C. The interaction graphs revealed independent main effects of tobacco and the p53 polymorphism rs1042522 (exon 4), and a significant additive interaction effect between rs17878362 (intron 3) and rs1042522 (exon 4). CONCLUSIONS: The nonparametric approach highlighted the potential role of tobacco use and variations in the p53 gene as significant contributors to oral cancer risk. The findings of the present study will help implement preventive strategies in both tobacco use and screening using a molecular pathology approach.
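The idea behind a permutation test of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the MDR permutation testing software named in the abstract: it shuffles case/control labels and re-scores a fixed prediction, whereas a full MDR permutation test would rerun the whole model search on each shuffled dataset. The function names are assumptions made for this example.

```python
import random


def balanced_accuracy(labels, predicted):
    """Mean of sensitivity and specificity for 0/1 labels and predictions."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def permutation_p_value(labels, predicted, statistic, n_perm=1000, seed=0):
    """Empirical p-value: share of label shuffles scoring at least as well."""
    rng = random.Random(seed)
    observed = statistic(labels, predicted)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any label/prediction association
        if statistic(shuffled, predicted) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

With 1000 permutations the smallest attainable p-value is 1/1001, so a threshold of p < 0.05 is comfortably resolvable.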


Subject(s)
Cytochrome P-450 CYP1A1 , Mouth Neoplasms , Humans , Cytochrome P-450 CYP1A1/genetics , Cytochrome P-450 CYP2E1/genetics , Genes, p53 , Genetic Predisposition to Disease , Multifactor Dimensionality Reduction , Genotype , Risk Factors , Case-Control Studies , Tumor Suppressor Protein p53/genetics , Tobacco Use/adverse effects , Mouth Neoplasms/etiology , Mouth Neoplasms/genetics , Glutathione Transferase/genetics , Proto-Oncogene Proteins c-mdm2/genetics
4.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36458451

ABSTRACT

In epistasis analysis, single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) among genes may, alongside other environmental factors, influence the risk of multifactorial diseases. To identify SSI between cases and controls (i.e. binary traits), the score for model quality is affected by different objective functions (i.e. measurements) because of potential disease model preferences and disease complexities. Our previous study proposed a multiobjective approach-based multifactor dimensionality reduction (MOMDR), with the results indicating that two objective functions could enhance SSI identification with weak marginal effects. However, SSI identification using MOMDR remains a challenge because the optimal measure combination of objective functions has yet to be investigated. This study extended MOMDR to the many-objective version (i.e. many-objective MDR, MaODR) by integrating various disease probability measures based on a two-way contingency table to improve the identification of SSI between cases and controls. We introduced an objective function selection approach to determine the optimal measure combination in MaODR among 10 well-known measures. In total, 6 disease models with and 40 disease models without marginal effects were used to evaluate the general algorithms, namely those based on multifactor dimensionality reduction, MOMDR and MaODR. Our results revealed that, through the application of an objective function selection approach, the MaODR-based three-objective-function model combining correct classification rate, likelihood ratio and normalized mutual information (MaODR-CLN) exhibited 6.47% higher detection success rates (accuracy) than MOMDR and 17.23% higher detection success rates than MDR. In a Wellcome Trust Case Control Consortium dataset, MaODR-CLN successfully identified the significant SSIs (P < 0.001) associated with coronary artery disease.
We performed a systematic analysis to identify the optimal measure combination in MaODR among 10 objective functions. Our combination detected SSIs for binary traits with weak marginal effects and thus reduced spurious variables in the score model. MOAI is freely available at https://sites.google.com/view/maodr/home.
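To make the three named measures concrete, here is a minimal Python sketch that computes a correct classification rate, a likelihood-ratio (G) statistic and a normalized mutual information from a 2x2 table of risk group against case/control status. The cell naming and the choice of normalization (mutual information divided by the geometric mean of the two marginal entropies) are assumptions made for this illustration; the paper's exact definitions may differ.

```python
import math


def table_measures(tp, fp, fn, tn):
    """Three quality measures for a 2x2 risk-group x case/control table.

    tp: cases in high-risk cells,  fp: controls in high-risk cells,
    fn: cases in low-risk cells,   tn: controls in low-risk cells.
    """
    n = tp + fp + fn + tn
    cells = [[tp, fp], [fn, tn]]
    rows = [tp + fp, fn + tn]  # high-risk total, low-risk total
    cols = [tp + fn, fp + tn]  # case total, control total

    ccr = (tp + tn) / n

    # Likelihood-ratio statistic G = 2 * sum(O * ln(O / E)), E = row*col/n.
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = cells[i][j]
            if observed > 0:
                g += 2.0 * observed * math.log(observed * n / (rows[i] * cols[j]))

    def entropy(counts):
        return -sum(c / n * math.log(c / n) for c in counts if c > 0)

    mutual_info = g / (2.0 * n)  # identity: G = 2 * n * I (in nats)
    hx, hy = entropy(rows), entropy(cols)
    nmi = mutual_info / math.sqrt(hx * hy) if hx > 0 and hy > 0 else 0.0
    return ccr, g, nmi
```

A many-objective search would keep candidate SNP combinations that are not dominated on all three values at once, rather than ranking by any single one.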


Subject(s)
Epistasis, Genetic , Models, Genetic , Algorithms , Phenotype , Multifactor Dimensionality Reduction/methods , Polymorphism, Single Nucleotide
5.
Article in English | MEDLINE | ID: mdl-35061588

ABSTRACT

Epistasis detection is vital for understanding disease susceptibility in genetics. Multiobjective multifactor dimensionality reduction (MOMDR) was previously proposed to detect epistasis. MOMDR uses binary classification to distinguish the high-risk (H) and low-risk (L) groups and thereby reduce multifactor dimensionality. However, the binary classification does not reflect the uncertainty of the H and L classification. In this study, we proposed an empirical fuzzy MOMDR (EFMOMDR) to address the limitations of binary classification using the degree of membership through an empirical fuzzy approach. The EFMOMDR can simultaneously consider two incorporated fuzzy-based measures, including correct classification rate and likelihood rate, and does not require parameter tuning. Simulation studies revealed that EFMOMDR has 7.14% higher detection success rates than MOMDR, indicating that the limitations of binary classification in MOMDR have been successfully mitigated by the empirical fuzzy approach. Moreover, EFMOMDR was used to analyze coronary artery disease in the Wellcome Trust Case Control Consortium dataset.


Subject(s)
Coronary Artery Disease , Epistasis, Genetic , Humans , Epistasis, Genetic/genetics , Multifactor Dimensionality Reduction , Models, Genetic , Computer Simulation , Coronary Artery Disease/genetics , Polymorphism, Single Nucleotide , Algorithms
6.
Oxid Med Cell Longev ; 2022: 2633127, 2022.
Article in English | MEDLINE | ID: mdl-35126809

ABSTRACT

Based on the "oxidative stress hypothesis" of major depressive disorder (MDD), cells regulate their structure through the Wnt pathway. Little is known regarding the interactions of dishevelled 3 (DVL3) and glycogen synthase kinase 3 beta (GSK3β) polymorphisms with MDD. The aim of the current study was to verify the relationship between DVL3 and GSK3β genetic variants in a Chinese Han population and further to evaluate whether these interactions exhibit gender-specificity. A total of 1136 participants, consisting of 541 MDD patients and 595 healthy subjects, were recruited. Five single-nucleotide polymorphisms (SNPs) of DVL3/GSK3β were selected to assess their interaction by use of a generalized multifactor dimensionality reduction method. The genotype and haplotype frequencies of DVL3/GSK3β polymorphisms were significantly different between patients and controls for DVL3 rs1709642 (P < 0.01) and GSK3β rs334558, rs6438552, and rs2199503 (P < 0.01). In addition, our results also showed that there were significant interaction effects between DVL3 and GSK3β polymorphisms and the risk of developing MDD, particularly in women. The interaction between DVL3 (rs1709642) and GSK3β (rs334558, rs6438552) showed a cross-validation (CV) consistency of 10/10, a P value of 0.001, and a testing accuracy of 59.22%, which was considered as the best generalized multifactor dimensionality reduction (GMDR) model. This study reveals the interaction between DVL3 and GSK3β polymorphisms on MDD susceptibility in a female Chinese Han population. The effect of gender should be taken into account in future studies that seek to explore the genetic predisposition to MDD relative to the DVL3 and GSK3β genes.
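The terms "cross-validation consistency" and "testing accuracy" can be illustrated with a small Python sketch of classic MDR under k-fold cross-validation. This is an assumption-laden simplification: it implements the original count-based MDR, not the score-based GMDR used in the study, and the function names and interleaved fold assignment are choices made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def high_risk_cells(rows, labels, snps):
    """Genotype cells whose case:control ratio exceeds the overall ratio."""
    cases, controls = Counter(), Counter()
    for g, y in zip(rows, labels):
        (cases if y == 1 else controls)[tuple(g[s] for s in snps)] += 1
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    return {c for c in set(cases) | set(controls)
            if cases[c] * n_controls > controls[c] * n_cases}


def balanced_accuracy(rows, labels, snps, high_risk):
    tp = tn = pos = neg = 0
    for g, y in zip(rows, labels):
        pred = 1 if tuple(g[s] for s in snps) in high_risk else 0
        if y == 1:
            pos += 1
            tp += pred
        else:
            neg += 1
            tn += 1 - pred
    return 0.5 * (tp / pos + tn / neg)


def mdr_cv(rows, labels, order=2, k=10):
    """Return (best SNP combination, CV consistency, mean testing accuracy)."""
    n = len(rows)
    winners, test_acc = Counter(), defaultdict(list)
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # simple interleaved folds
        tr = [i for i in range(n) if i not in test_idx]
        te = sorted(test_idx)
        tr_rows, tr_y = [rows[i] for i in tr], [labels[i] for i in tr]
        te_rows, te_y = [rows[i] for i in te], [labels[i] for i in te]
        best = None
        for snps in combinations(range(len(rows[0])), order):
            cells = high_risk_cells(tr_rows, tr_y, snps)
            acc = balanced_accuracy(tr_rows, tr_y, snps, cells)
            if best is None or acc > best[0]:
                best = (acc, snps, cells)
        _, snps, cells = best
        winners[snps] += 1  # consistency: folds in which this model wins
        test_acc[snps].append(balanced_accuracy(te_rows, te_y, snps, cells))
    model, consistency = winners.most_common(1)[0]
    return model, consistency, sum(test_acc[model]) / len(test_acc[model])
```

A consistency of 10/10 means the same SNP combination was selected in every training fold; the testing accuracy is then averaged over the held-out folds for that combination.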


Subject(s)
Asian People/genetics , Depressive Disorder, Major/epidemiology , Depressive Disorder, Major/genetics , Dishevelled Proteins/genetics , Genetic Predisposition to Disease/genetics , Glycogen Synthase Kinase 3 beta/genetics , Polymorphism, Single Nucleotide , Adult , Alleles , Case-Control Studies , China/epidemiology , Female , Gene Frequency , Genetic Loci , Haplotypes , Humans , Male , Middle Aged , Multifactor Dimensionality Reduction/methods , Sex Factors
7.
PLoS One ; 17(2): e0263390, 2022.
Article in English | MEDLINE | ID: mdl-35180244

ABSTRACT

BACKGROUND: Numerous approaches have been proposed for the detection of epistatic interactions within GWAS datasets in order to better understand the drivers of disease and genetics. METHODS: A selection of state-of-the-art approaches was assessed. These included the statistical tests fast-epistasis, BOOST, logistic regression and wtest; swarm intelligence methods, namely AntEpiSeeker, epiACO and CINOEDV; and data mining approaches, including MDR, GSS, SNPRuler and MPI3SNP. Data were simulated to provide randomly generated models with no individual main effects at different heritabilities (pure epistasis) as well as models based on penetrance tables with some main effects (impure epistasis). Detection of both two- and three-locus interactions was assessed across a total of 1,560 simulated datasets. The different methods were also applied to a section of the UK Biobank cohort for atrial fibrillation. RESULTS: For pure two-locus interactions, PLINK's implementation of BOOST recovered the highest number of correct interactions (53.9%), performing significantly better than the other methods (p = 4.52e-36). For impure two-locus interactions, MDR exhibited the best performance, recovering 62.2% of the most significant impure epistatic interactions (p = 6.31e-90 for all but one test). The assessment of three-locus interaction prediction revealed that wtest recovered the highest number (17.2%) of pure epistatic interactions (p = 8.49e-14). wtest also recovered the highest number of three-locus impure epistatic interactions (p = 6.76e-48), while AntEpiSeeker ranked as the most significant the highest number of such interactions (40.5%). Finally, the methods were applied to a real dataset for atrial fibrillation, most notably finding an interaction between SYNE2 and DTNB.


Subject(s)
Atrial Fibrillation/genetics , Epistasis, Genetic , Genetic Loci , Models, Genetic , Penetrance , Algorithms , Alleles , Data Mining/methods , Dystrophin-Associated Proteins/genetics , Gene Frequency , Genome-Wide Association Study/methods , Genotype , Humans , Linear Models , Microfilament Proteins/genetics , Multifactor Dimensionality Reduction , Nerve Tissue Proteins/genetics , Neuropeptides/genetics , Polymorphism, Single Nucleotide , ROC Curve
8.
BMC Bioinformatics ; 22(1): 480, 2021 Oct 04.
Article in English | MEDLINE | ID: mdl-34607566

ABSTRACT

BACKGROUND: Identifying interaction effects between genes is one of the main tasks of genome-wide association studies aiming to shed light on the biological mechanisms underlying complex diseases. Multifactor dimensionality reduction (MDR) is a popular approach for detecting gene-gene interactions that has been extended in various forms to handle binary and continuous phenotypes. However, only a few multivariate MDR methods are available for multiple related phenotypes. Current approaches use Hotelling's T2 statistic to evaluate interaction models, but it is well known that Hotelling's T2 statistic is highly sensitive to heavily skewed distributions and outliers. RESULTS: We propose a robust approach based on nonparametric statistics such as spatial signs and ranks. The new multivariate rank-based MDR (MR-MDR) is mainly suitable for analyzing multiple continuous phenotypes and is less sensitive to skewed distributions and outliers. MR-MDR utilizes fuzzy k-means clustering and classifies multi-locus genotypes into two groups. Then, MR-MDR calculates a spatial rank-sum statistic as an evaluation measure and selects the best interaction model with the largest statistic. Our novel idea lies in adopting nonparametric statistics as an evaluation measure for robust inference. We adopt tenfold cross-validation to avoid overfitting. Intensive simulation studies were conducted to compare the performance of MR-MDR with current methods. Application of MR-MDR to a real dataset from a Korean genome-wide association study demonstrated that it successfully identified genetic interactions associated with four phenotypes related to kidney function. The R code for conducting MR-MDR is available at https://github.com/statpark/MR-MDR . CONCLUSIONS: Intensive simulation studies comparing MR-MDR with several current methods showed that the performance of MR-MDR was outstanding for skewed distributions. Additionally, for symmetric distributions, MR-MDR showed comparable power.
Therefore, we conclude that MR-MDR is a useful multivariate non-parametric approach that can be used regardless of the phenotype distribution, the correlations between phenotypes, and sample size.
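To show why spatial signs and ranks are robust to outliers, the following Python sketch computes the spatial rank of each multivariate observation as the average unit vector pointing to it from every observation in the sample. This illustrates only the building block; the rank-sum statistic and fuzzy k-means step used by MR-MDR are not reproduced here, and the function names are assumptions for this example.

```python
import math


def spatial_sign(v):
    """Unit vector in the direction of v (zero vector maps to zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else [0.0] * len(v)


def spatial_ranks(points):
    """Spatial rank of each point: mean spatial sign of its differences
    from all points in the sample (the zero self-difference included)."""
    n = len(points)
    ranks = []
    for xi in points:
        total = [0.0] * len(xi)
        for xj in points:
            sign = spatial_sign([a - b for a, b in zip(xi, xj)])
            total = [t + s for t, s in zip(total, sign)]
        ranks.append([t / n for t in total])
    return ranks
```

Because every difference is reduced to a direction, an extreme observation contributes a rank vector of length below 1 no matter how far away it lies, which is the property that limits the influence of skewed distributions and outliers.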


Subject(s)
Genome-Wide Association Study , Multifactor Dimensionality Reduction , Algorithms , Computer Simulation , Epistasis, Genetic , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide
9.
Clin Nutr ; 40(10): 5355-5364, 2021 10.
Article in English | MEDLINE | ID: mdl-34560606

ABSTRACT

BACKGROUND & AIMS: Sarcopenia elevates metabolic disorders in the elderly, and genetic and environmental factors influence the risk of sarcopenia. The purpose of the study was to examine the hypothesis that polygenetic variants for sarcopenic risk had interactions with metabolic disorders and lifestyles associated with sarcopenia risk in adults >50 years in a large urban hospital cohort. METHODS: Sarcopenia was defined as an appendicular skeletal muscle mass/body weight (SMI) < 29.0% for men and < 22.8% for women, estimated from participants aged 18-39 years in the KNHANES 2009-2010. Genetic variants were selected using a genome-wide association study for sarcopenia (sarcopenia, n = 1368; control, n = 15,472). The best model showing the gene-gene interactions was selected using a generalized multifactor dimensionality reduction. The polygenic risk scores (PRS) were generated by summing the selected SNP risk alleles in the best model. RESULTS: SMI was much higher in the control subjects than the sarcopenia subjects in both genders, and the fat mass index showed the opposite pattern. The five-single nucleotide polymorphism (SNP) model included FADS2_rs97384, MYO10_rs31574, KCNQ5_rs6453647, DOCK5_rs11135857, and LRP1B_rs74659977. Sarcopenia risk was positively associated with the PRS of the five-SNP model (OR = 1.977, 95% CI = 1.634-2.393). The PRS interacted with age (P < 0.0001), metabolic syndrome (P = 0.01), grip strength (P = 0.007), and serum total cholesterol concentrations (P = 0.005) for the sarcopenia risk. There were no interactions of PRS with the lifestyle components except for exercise. CONCLUSION: The genetic impact may be offset in the elderly, having metabolic syndrome, high serum total cholesterol concentrations, and high grip strength, but only exercise in the lifestyle factors can overcome the genetic effect.
Middle-aged and elderly participants with a genetic risk for sarcopenia may require regular exercise to maintain high grip strength and prevent metabolic syndrome.
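An unweighted polygenic risk score of the kind described, the sum of risk alleles carried across the selected SNPs, can be sketched in Python as follows. The SNP identifiers, genotypes and risk alleles in the usage note are placeholders invented for this example; the abstract does not state which allele of each SNP is the risk allele.

```python
def polygenic_risk_score(genotype, risk_alleles):
    """Count risk alleles carried across the SNPs of a model.

    genotype:     dict mapping SNP id -> two-letter genotype, e.g. "AG".
    risk_alleles: dict mapping SNP id -> risk allele letter.
    Each SNP contributes 0, 1 or 2 to the score.
    """
    return sum(genotype[snp].count(allele)
               for snp, allele in risk_alleles.items())
```

For instance, with hypothetical risk alleles {"snp_a": "G", "snp_b": "T", "snp_c": "A"}, a person genotyped as {"snp_a": "AG", "snp_b": "TT", "snp_c": "CC"} scores 1 + 2 + 0 = 3. A five-SNP model therefore yields scores between 0 and 10.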


Subject(s)
Exercise , Hand Strength , Metabolic Syndrome/epidemiology , Polymorphism, Single Nucleotide , Sarcopenia/epidemiology , Sarcopenia/genetics , Aged , Cohort Studies , Energy Intake , Fatty Acid Desaturases/genetics , Female , Guanine Nucleotide Exchange Factors/genetics , Hospitals, Urban , Humans , KCNQ Potassium Channels/genetics , Male , Middle Aged , Multifactor Dimensionality Reduction , Muscle, Skeletal , Myosins/genetics , Receptors, LDL/genetics , Republic of Korea/epidemiology , Risk Factors
10.
Cuad. psicol. deporte ; 21(3): 258-268, septiembre 2021. tab
Article in English | IBECS | ID: ibc-219377

ABSTRACT

The main objective of this research was to culturally adapt and evaluate the psychometric properties of the Life Orientation Test Revised version for the Brazilian sports context (LOT-R). The sample consisted of 953 young Brazilian athletes of both sexes, with a mean age of 16 years. The results of the confirmatory factor analysis produced two correlated dimensions, reflecting optimism and pessimism, and provided support for a two-factor model. A satisfactory fit was found for LOT-R for sport (LOT-R-Sport) with six items (χ2=4.541, df=8; CFI=1; TLI=1; RMSEA=0 [90% CI = 0.000-0.024]; SRMR=0.016). Satisfactory tests of internal consistency were also generated through the analysis of factor loadings. The Composite Reliability indices (0.72/0.70) were suitable for Optimism and Pessimism, respectively. Motivation, Resilience and Satisfaction with Life showed positive correlations with Optimism and negative correlations with Pessimism, indicative of convergent validity. Configural, metric and scalar invariance was achieved, indicating that the LOT-R-Sport can measure athletes of different sexes, sports, ages and years of experience equally. Although more studies are needed to confirm the psychometric properties of the instrument, the adaptation of the LOT-R-Sport was a first step toward future work on the influence of Optimism on sports performance. (AU)




Subject(s)
Humans , Optimism , Pessimism , Sports , Multifactor Dimensionality Reduction
11.
Genes Genomics ; 43(8): 961-973, 2021 08.
Article in English | MEDLINE | ID: mdl-34129193

ABSTRACT

BACKGROUND: Recently, many researchers have focused on the best way to produce high-quality meat, as the trend in food consumption today is to focus on quality. In general, consumers' preferences in beef differ depending on taste and meatiness. Therefore, researchers are interested in how the marbling score affects the flavors of meat or the various factors that make up the meatiness to captivate the consumers' tastes. OBJECTIVE: This study identifies single nucleotide polymorphisms (SNPs) or gene combinations that affect the carcass traits of Korean cattle (Hanwoo) by using the multifactor dimensionality reduction (MDR) method. METHODS: We collected candidate SNPs related to marbling scores from whole-exome sequencing and bovine SNP genotyping data. Using 96 Hanwoo samples, we performed PCR amplification to investigate the polymorphism status. In addition, we investigated genetic relationships between carcass traits and SNPs using 612 Hanwoo samples. Furthermore, each candidate SNP genotype and the combinations of SNP genotypes were verified using the MDR method to improve the accuracy of the genetic relationships. RESULTS: Twenty-four candidate SNPs associated with carcass traits and marbling scores were identified from SNP genotyping and whole-exome sequencing. Among them, three SNP markers (c.459 T > C of the PLCB1 gene, c.271 A > C of the C/EBPα gene, and g.17257 A > G of the TDRKH gene) showed statistically significant differences in intramuscular fat between genotypes. In particular, two candidate SNPs, c.459 T > C located in the PLCB1 gene and c.271 A > C located in the C/EBPα gene, could be highly associated with the intramuscular fat of Hanwoo quality grade. In addition, the combination of SNP genotypes showed more significant differences in carcass weight, backfat thickness, and longissimus dorsi muscle area.
CONCLUSION: Three SNP genotypes and the combination of SNP genotypes in the PLCB1, C/EBPα, and TDRKH genes may be useful genetic markers for improving beef quality.


Subject(s)
CCAAT-Enhancer-Binding Protein-alpha/genetics , Food Analysis , Meat/analysis , Phospholipase C beta/genetics , RNA-Binding Proteins/genetics , Animals , Cattle , Genome-Wide Association Study , Genotype , Humans , Multifactor Dimensionality Reduction , Phenotype , Polymorphism, Single Nucleotide/genetics , Exome Sequencing
12.
PLoS One ; 16(5): e0251902, 2021.
Article in English | MEDLINE | ID: mdl-34019571

ABSTRACT

The volume of Amharic digital documents has grown rapidly in recent years. As a result, automatic document categorization is highly essential. In this paper, we present a novel dimension reduction approach for improving classification accuracy by combining feature selection and feature extraction. The new dimension reduction method utilizes Information Gain (IG), Chi-square test (CHI), and Document Frequency (DF) to select important features and Principal Component Analysis (PCA) to refine the features that have been selected. We evaluate the proposed dimension reduction method with a dataset containing 9 news categories. Our experimental results verified that the proposed dimension reduction method outperforms other methods. Classification accuracy with the new dimension reduction is 92.60%, which is 13.48%, 16.51% and 10.19% higher than with IG, CHI, and DF respectively. Further work is required since classification accuracy still decreases as we reduce the feature size to save computational time.
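Two of the feature-scoring criteria named above, document frequency and information gain, can be sketched in plain Python. This is an illustrative simplification with a toy corpus in the usage note; the function names are assumptions, and the paper's combination of the three filters and the subsequent PCA step are not reproduced.

```python
import math


def document_frequency(docs, term):
    """Number of documents containing the term.

    docs: list of (set_of_words, category) pairs.
    """
    return sum(1 for words, _ in docs if term in words)


def _entropy(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))


def information_gain(docs, term):
    """Reduction in category entropy from knowing whether the term occurs."""
    all_labels = [y for _, y in docs]
    present = [y for words, y in docs if term in words]
    absent = [y for words, y in docs if term not in words]
    n = len(docs)
    return (_entropy(all_labels)
            - len(present) / n * _entropy(present)
            - len(absent) / n * _entropy(absent))
```

Given a toy corpus such as [({"goal", "match"}, "sport"), ({"goal", "team"}, "sport"), ({"vote", "party"}, "politics"), ({"vote", "match"}, "politics")], the term "goal" separates the two categories perfectly and scores 1 bit, while "match" occurs equally in both and scores 0. Terms would be ranked by such scores and the top ones passed on for further reduction.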


Subject(s)
Data Mining/methods , Information Technology , Linguistics/statistics & numerical data , Multifactor Dimensionality Reduction/statistics & numerical data , Support Vector Machine , Datasets as Topic , Ethiopia , Humans , Language , Principal Component Analysis
13.
Med Biol Eng Comput ; 59(4): 733-758, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33839998

ABSTRACT

Genome-wide association studies (GWAS) provide clear insight into the genetic variations and environmental influences responsible for various human diseases. Cancer identification through genetic interactions (epistasis) is one of the significant ongoing research areas in GWAS. The growth of cancer cells emerges from multi-locus as well as complex genetic interactions. It is impractical for the physician to detect cancer via manual examination of SNP interactions. Due to its importance, several computational approaches have been modeled to infer epistasis effects. This article includes a comprehensive and multifaceted review of all relevant genetic studies published between 2001 and 2020. In this contemporary review, various computational methods, namely multifactor dimensionality reduction-based approaches, statistical strategies, machine learning, and optimization-based techniques, are carefully reviewed and presented with their evaluation results. Moreover, the strengths and limitations of these computational approaches are described. The issues behind the computational methods for identifying cancer through genetic interactions and the various evaluation parameters used by researchers have been analyzed. This review is highly beneficial for researchers and medical professionals who wish to learn techniques adapted to discover epistasis, and it aids the design of novel automatic epistasis detection systems with strong robustness and maximum efficiency to address the different research problems in finding practical solutions effectively.


Subject(s)
Epistasis, Genetic , Neoplasms , Computational Biology , Genome-Wide Association Study , Humans , Models, Genetic , Multifactor Dimensionality Reduction , Neoplasms/genetics , Polymorphism, Single Nucleotide
14.
Methods Mol Biol ; 2212: 181-190, 2021.
Article in English | MEDLINE | ID: mdl-33733357

ABSTRACT

If one uses data to identify the most likely epistatic interaction between two genetic units, and then tests if the identified interaction is associated with a phenotype, the nominal statistical evidence will be inflated. Corrections are available but computationally expensive for genome-wide studies. We provide a first-order correction that can be applied in practice with essentially no additional computational cost.


Subject(s)
Algorithms , Epistasis, Genetic , Genetic Association Studies , Models, Genetic , Polymorphism, Single Nucleotide , Computer Simulation , Genotype , Humans , Multifactor Dimensionality Reduction , Phenotype , Statistics as Topic
15.
Methods Mol Biol ; 2212: 307-323, 2021.
Article in English | MEDLINE | ID: mdl-33733364

ABSTRACT

Epistasis is a challenge in prediction, classification, and suspicion of human genetic diseases. Many technologies, methods, and tools have been developed for epistasis detection. Multifactor dimensionality reduction (MDR) is the method commonly used in epistasis detection. It uses two class groups, high risk and low risk, in human genetic disease and complex genetic traits. However, it cannot handle uncertainties from genetic information. This chapter describes the fuzzy sigmoid membership-based MDR (FSMDR) method of epistasis detection. The algorithmic steps in FSMDR are also elaborated with simulated data generated from GAMETES and a real coronary artery disease patient epistasis data set obtained from the Wellcome Trust Case Control Consortium (WTCCC). Moreover, a belief degree-associated fuzzy MDR framework is also proposed for epistasis detection, which can overcome the uncertainties of MDR-based methods. This framework improves the detection efficiency. It works like fuzzy set-based MDR methods. Simulated epistasis data sets are used to compare different MDR-based methods. Belief degree-associated fuzzy MDR was shown to give good results by taking into account the uncertainty of the high/low risk classification.
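The contrast between MDR's two-class labelling and a fuzzy membership can be illustrated with a short Python sketch. The sigmoid form, the 0.5 continuity correction and the steepness parameter below are assumptions chosen for illustration only; they are not the membership function defined in the FSMDR chapter.

```python
import math


def crisp_label(cases, controls, threshold=1.0):
    """Classic MDR: a genotype cell is either high risk (1.0) or low risk (0.0)."""
    return 1.0 if cases > threshold * controls else 0.0


def sigmoid_membership(cases, controls, threshold=1.0, steepness=4.0):
    """Degree (0..1) to which a genotype cell belongs to the high-risk group.

    Uses a sigmoid of the log case:control ratio relative to the threshold,
    with 0.5 added to both counts so that empty cells stay finite.
    """
    ratio = (cases + 0.5) / (controls + 0.5)
    x = math.log(ratio) - math.log(threshold)
    return 1.0 / (1.0 + math.exp(-steepness * x))
```

A cell with 11 cases and 10 controls is labelled fully high risk by the crisp rule, whereas its sigmoid membership stays close to 0.5, which reflects how little evidence separates it from a low-risk cell.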


Subject(s)
Coronary Artery Disease/genetics , Epistasis, Genetic , Fuzzy Logic , Multifactor Dimensionality Reduction , Multifactorial Inheritance , Software , Algorithms , Databases, Genetic , Datasets as Topic , Humans , Models, Genetic , Polymorphism, Single Nucleotide , Uncertainty
16.
Methods Mol Biol ; 2212: 337-345, 2021.
Article in English | MEDLINE | ID: mdl-33733366

ABSTRACT

Complex disease is different from Mendelian disorders. Its development usually involves the interaction of multiple genes or the interaction between genes and the environment (i.e. epistasis). Although the high-throughput sequencing technologies for complex diseases have produced a large amount of data, it is extremely difficult to analyze the data due to the high feature dimension and the combination in the epistasis analysis. In this work, we introduce machine learning methods to effectively reduce the gene dimensionality, retain the key epistatic effects, and effectively characterize the relationship between epistatic effects and complex diseases.


Subject(s)
Epistasis, Genetic , Machine Learning , Models, Genetic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Computational Biology/methods , Datasets as Topic , Humans , Multifactor Dimensionality Reduction , Software
17.
BMC Bioinformatics ; 22(1): 74, 2021 Feb 18.
Article in English | MEDLINE | ID: mdl-33602124

ABSTRACT

BACKGROUND: One component of precision medicine is to construct prediction models with predictive ability as high as possible, e.g. to enable individual risk prediction. In genetic epidemiology, complex diseases like coronary artery disease, rheumatoid arthritis, and type 2 diabetes have a polygenic basis, and a common assumption is that biological and genetic features affect the outcome under consideration via interactions. In the case of omics data, the use of standard approaches such as generalized linear models may be suboptimal and machine learning methods are appealing to make individual predictions. However, most of these algorithms focus on main or marginal effects of the single features in a dataset. On the other hand, the detection of interacting features is an active area of research in the realm of genetic epidemiology. One big class of algorithms to detect interacting features is based on the multifactor dimensionality reduction (MDR). Here, we further develop the model-based MDR (MB-MDR), a powerful extension of the original MDR algorithm, to enable interaction empowered individual prediction. RESULTS: Using a comprehensive simulation study we show that our new algorithm (median AUC: 0.66) can use information hidden in interactions and outperforms two other state-of-the-art algorithms, namely the Random Forest (median AUC: 0.54) and Elastic Net (median AUC: 0.50), if interactions are present in a scenario of two pairs of two features having small effects. The performance of these algorithms is comparable if no interactions are present. Further, we show that our new algorithm is applicable to real data by comparing the performance of the three algorithms on a dataset of rheumatoid arthritis cases and healthy controls. As our new algorithm is not only applicable to biological/genetic data but to all datasets with discrete features, it may have practical implications in other research fields where interactions between features have to be considered as well, and we made our method available as an R package ( https://github.com/imbs-hl/MBMDRClassifieR ). CONCLUSIONS: The explicit use of interactions between features can improve the prediction performance and thus should be included in further attempts to move precision medicine forward.


Subject(s)
Precision Medicine , Algorithms , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Humans , Machine Learning , Multifactor Dimensionality Reduction , Power, Psychological
18.
Nat Commun ; 12(1): 124, 2021 01 05.
Article in English | MEDLINE | ID: mdl-33402734

ABSTRACT

High-dimensional multi-omics data are now standard in biology. They can greatly enhance our understanding of biological systems when effectively integrated. To achieve proper integration, joint Dimensionality Reduction (jDR) methods are among the most efficient approaches. However, several jDR methods are available, urging the need for a comprehensive benchmark with practical guidelines. We perform a systematic evaluation of nine representative jDR methods using three complementary benchmarks. First, we evaluate their performances in retrieving ground-truth sample clustering from simulated multi-omics datasets. Second, we use TCGA cancer data to assess their strengths in predicting survival, clinical annotations and known pathways/biological processes. Finally, we assess their classification of multi-omics single-cell data. From these in-depth comparisons, we observe that intNMF performs best in clustering, while MCIA offers an effective behavior across many contexts. The code developed for this benchmark study is implemented in a Jupyter notebook, multi-omics mix (momix), to foster reproducibility, and support users and future developers.


Subject(s)
Algorithms , Computational Biology/methods , Gene Expression Regulation, Neoplastic , Neoplasm Proteins/genetics , Neoplasms/genetics , Benchmarking , Cell Line, Tumor , Datasets as Topic , Gene Ontology , Humans , Molecular Sequence Annotation , Multifactor Dimensionality Reduction , Neoplasm Proteins/metabolism , Neoplasms/diagnosis , Neoplasms/mortality , Neoplasms/pathology , Reproducibility of Results , Single-Cell Analysis , Survival Analysis
19.
Nat Biomed Eng ; 5(6): 624-635, 2021 06.
Article in English | MEDLINE | ID: mdl-33139824

ABSTRACT

Dimensionality reduction is widely used in the visualization, compression, exploration and classification of data. Yet a generally applicable solution remains unavailable. Here, we report an accurate and broadly applicable data-driven algorithm for dimensionality reduction. The algorithm, which we named 'feature-augmented embedding machine' (FEM), first learns the structure of the data and the inherent characteristics of the data components (such as central tendency and dispersion), denoises the data, increases the separation of the components, and then projects the data onto a lower number of dimensions. We show that the technique is effective at revealing the underlying dominant trends in datasets of protein expression and single-cell RNA sequencing, computed tomography, electroencephalography and wearable physiological sensors.


Subject(s)
Algorithms , Biomedical Research/statistics & numerical data , Datasets as Topic , Multifactor Dimensionality Reduction/statistics & numerical data , Electroencephalography/statistics & numerical data , Humans , Protein Biosynthesis , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Tomography, X-Ray Computed/statistics & numerical data