Results 1 - 3 of 3
1.
BMC Bioinformatics ; 23(1): 325, 2022 Aug 07.
Article in English | MEDLINE | ID: mdl-35934714

ABSTRACT

BACKGROUND: Malaria risk prediction is currently limited to advanced statistical methods, such as time series and cluster analysis, applied to epidemiological data. Machine learning models have nevertheless been explored to study the complexity of malaria through blood smear images and environmental data. However, to the best of our knowledge, no study has analysed the contribution of Single Nucleotide Polymorphisms (SNPs) to malaria using a machine learning model. More specifically, this study aims to quantify an individual's susceptibility to the development of malaria by using risk scores obtained from the cumulative effects of SNPs, known as weighted genetic risk scores (wGRS). RESULTS: We proposed an SNP-based feature extraction algorithm that incorporates the susceptibility information of an individual to malaria to generate the feature set. However, it can become computationally expensive for a machine learning model to learn from many SNPs. Therefore, we reduced the feature set by employing the Logistic Regression and Recursive Feature Elimination (LR-RFE) method to select SNPs that improve the efficacy of our model. Next, we calculated the wGRS of the selected feature set, which is used as the model's target variable. Moreover, to provide a comparison for the wGRS-only model, we calculated and evaluated the combination of wGRS with genotype frequency (wGRS + GF). Finally, Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost), and Ridge regression algorithms were used to establish the machine learning models for malaria risk prediction. CONCLUSIONS: Our proposed approach identified SNP rs334 as the most contributing feature, with an importance score of 6.224 compared to a baseline importance score of 1.1314. This is an important result, as prior studies have established that rs334 is a major genetic risk factor for malaria. The analysis and comparison of the three machine learning models demonstrated that LightGBM achieves the highest model performance, with a Mean Absolute Error (MAE) score of 0.0373. Furthermore, all models performed significantly better with wGRS + GF than with wGRS alone, with LightGBM again obtaining the best performance (0.0033 MAE score).
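The abstract does not spell out how the wGRS or the MAE is computed. As a minimal sketch, assuming the commonly used definition in which each SNP's risk-allele count (0, 1 or 2) is weighted by the natural logarithm of its odds ratio, the two quantities could be calculated as follows. The SNP names, odds ratios and genotype counts are hypothetical and invented for illustration; they are not taken from the study.

```python
import math


def weighted_genetic_risk_score(risk_allele_counts, odds_ratios):
    """Sum of per-SNP risk-allele counts, each weighted by ln(odds ratio).

    Assumed definition: wGRS = sum_i ln(OR_i) * count_i, with count_i in {0, 1, 2}.
    """
    if risk_allele_counts.keys() != odds_ratios.keys():
        raise ValueError("counts and odds ratios must cover the same SNPs")
    return sum(
        count * math.log(odds_ratios[snp])
        for snp, count in risk_allele_counts.items()
    )


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical odds ratios and one individual's risk-allele counts.
odds_ratios = {"snp_a": 2.0, "snp_b": 1.5}
individual = {"snp_a": 1, "snp_b": 2}

score = weighted_genetic_risk_score(individual, odds_ratios)
error = mean_absolute_error([0.1, 0.2], [0.2, 0.4])
```

In a setup like the one described, a score of this kind would serve as the regression target, and the MAE would compare predicted scores against it.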


Subjects
Malaria , Single Nucleotide Polymorphism , Algorithms , Humans , Machine Learning , Malaria/epidemiology , Malaria/genetics , Risk Factors
2.
Malar J ; 21(1): 79, 2022 Mar 09.
Article in English | MEDLINE | ID: mdl-35264165

ABSTRACT

BACKGROUND: Malaria risk analysis across multiple populations is crucial whilst compressing limitations. However, the exponential growth in diversity and accumulation of genetic variation data obtained from malaria-infected patients through Genome-Wide Association Studies opens up unprecedented opportunities to explore the significant differences between genetic markers (risk factors), particularly in the resistance or susceptibility of populations to malaria risk. Thus, this study proposes using statistical tests to analyse large-scale genetic variation data, comprising 20,854 samples from 11 populations within three continents: Africa, Oceania, and Asia. METHODS: Even though statistical tests have been used in case-control studies since the 1950s to link risk factors to a particular disease, several challenges remain, including the choice of data (ordinal vs. non-ordinal) and test (parametric vs. non-parametric). This study overcomes these challenges by adopting the Mann-Whitney U test to analyse large-scale genetic variation data, to explore the statistical significance of markers between populations, and to further identify the highly differentiated markers. RESULTS: The findings of this study revealed a significant difference in the genetic markers between populations (p < 0.01) in all the case groups and most control groups. However, for the highly differentiated genetic markers, a significant difference (p < 0.01) was present for most genetic markers, with varying p-values between the populations in the case and control groups. Moreover, several genetic markers showed very significant differences (p < 0.001) across all populations, while others showed such differences only between certain populations. Several genetic markers also showed no significant differences between populations. CONCLUSIONS: These findings further support that the genetic markers contribute differently between populations towards malaria resistance or susceptibility, thus showing differences in the likelihood of malaria infection. In addition, this study demonstrated the robustness of the Mann-Whitney U test in analysing genetic markers in large-scale genetic variation data, thereby indicating an alternative method to explore genetic markers in other complex diseases. The findings hold great promise for genetic marker analysis, and the pipeline emphasized in this study can be fully reproduced to analyse new data.
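To make the test named in the abstract concrete, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test. It assumes the normal approximation with tie correction and no continuity correction; the abstract does not say which variant or software the authors used. The two samples are hypothetical marker values for two populations, invented for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p_value), where U is the smaller of the two U statistics.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    n = n_a + n_b
    # Tie correction: each group of t tied values contributes t^3 - t.
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every pooled value is identical
        return min(u_a, u_b), 1.0
    z = (u_a - n_a * n_b / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return min(u_a, u_b), p_value


# Hypothetical marker values in two populations.
population_1 = [1, 2, 3]
population_2 = [4, 5, 6]
u_stat, p_value = mann_whitney_u(population_1, population_2)
```

For samples this small an exact test would normally be preferred; the normal approximation is better suited to large data sets such as the one described.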


Subjects
Genome-Wide Association Study , Malaria , Genetic Markers , Genetic Variation , Humans , Malaria/genetics , Nonparametric Statistics
3.
BMC Genet ; 21(1): 31, 2020 Mar 14.
Article in English | MEDLINE | ID: mdl-32171244

ABSTRACT

BACKGROUND: Publicly available genome data provide valuable information on the genetic variation patterns across different modern human populations. Neuropeptide genes are crucial to the nervous, immune, and endocrine systems and to physiological homeostasis, as they play an essential role in communicating information in neuronal functions. It remains unclear how evolutionary forces, such as natural selection and random genetic drift, have affected neuropeptide genes among human populations. To date, there are over 100 known human neuropeptides from the over 1000 predicted peptides encoded in the genome. The purpose of this study is to analyze and explore the genetic variation in continental human populations across all known neuropeptide genes by examining highly differentiated SNPs between African and non-African populations. RESULTS: We identified a total of 644,225 SNPs in 131 neuropeptide genes in 6 worldwide population groups from a public database. Of these, 5163 SNPs that had ΔDAF |(African - non-African)| ≥ 0.20 were identified and fully annotated. A total of 20 outlier SNPs, comprising 19 missense SNPs with a moderate impact and one stop-lost SNP with a high impact, were identified in 16 neuropeptide genes. Our results indicate overall strong population differentiation, with the non-African populations having a higher derived allele frequency for 15 of those 20 SNPs. Highly differentiated SNPs in four genes were particularly striking: NPPA (rs5065) with a high impact stop-lost variant; CHGB (rs6085324, rs236150, rs236152, rs742710 and rs742711) with multiple moderate impact missense variants; and IGF2 (rs10770125) and INS (rs3842753) with moderate impact missense variants that are in linkage disequilibrium. Phenotype and disease associations of these differentiated SNPs indicated their association with hypertension and diabetes and highlighted the pleiotropic effects of these neuropeptides and their role in maintaining physiological homeostasis in humans. CONCLUSIONS: We compiled a list of 131 human neuropeptide genes from multiple databases and a literature survey. We detected significant population differentiation in the derived allele frequencies of variants in several neuropeptide genes in African and non-African populations. The results highlight SNPs in these genes that may also contribute to population disparities in the prevalence of diseases such as hypertension and diabetes.
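The ΔDAF criterion quoted in the abstract, |DAF(African) - DAF(non-African)| ≥ 0.20, can be illustrated with a short sketch. The function names, SNP labels and allele frequencies below are hypothetical and invented for illustration; only the 0.20 threshold comes from the abstract.

```python
def delta_daf(daf_african, daf_non_african):
    """Absolute difference in derived allele frequency between the two groups."""
    return abs(daf_african - daf_non_african)


def highly_differentiated(snps, threshold=0.20):
    """Keep SNPs whose delta-DAF meets the threshold.

    `snps` maps a SNP label to a (daf_african, daf_non_african) pair.
    Values lying exactly on the threshold are subject to floating-point
    rounding, so real data may warrant a small tolerance.
    """
    selected = {}
    for name, (daf_african, daf_non_african) in snps.items():
        difference = delta_daf(daf_african, daf_non_african)
        if difference >= threshold:
            selected[name] = difference
    return selected


# Hypothetical derived allele frequencies (African, non-African).
example_snps = {
    "snp_x": (0.10, 0.45),
    "snp_y": (0.50, 0.55),
    "snp_z": (0.80, 0.25),
}
differentiated = highly_differentiated(example_snps)
```

Here `snp_x` and `snp_z` pass the filter while `snp_y` does not. Keeping the signed difference instead of the absolute value would additionally show which group carries the higher derived allele frequency.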


Subjects
Atrial Natriuretic Factor/genetics , Black People/genetics , Neuropeptides/genetics , Genetic Selection/genetics , Asian People/genetics , Gene Frequency , Genetic Drift , Population Genetics , Human Genome/genetics , Haplotypes/genetics , Humans , Linkage Disequilibrium/genetics , Single Nucleotide Polymorphism/genetics , White People/genetics