Autosomal deletion/insertion polymorphisms for global stratification analyses and ancestry origin inferences of different continental populations by machine learning methods.
Jin, Xiaoye; Liu, Yuluo; Zhang, Yuanyuan; Li, Yongle; Chen, Chuanliang; Wang, Hongdan.
Affiliation
  • Jin X; Department of Forensic Medicine, Guizhou Medical University, Guiyang, P. R. China.
  • Liu Y; Medical Genetics Institute of Henan Province, Henan Provincial People's Hospital, Zhengzhou University People's Hospital, Zhengzhou, P. R. China.
  • Zhang Y; National Health Commission Key Laboratory of Birth Defects Prevention, Henan Key Laboratory of Population Defects Prevention, Zhengzhou, P. R. China.
  • Li Y; Department of Forensic Science, Guangdong Police College, Guangzhou, P. R. China.
  • Chen C; School of Foreign Languages, Zhengzhou University of Industrial Technology, Zhengzhou, P. R. China.
  • Wang H; National Health Commission Key Laboratory of Birth Defects Prevention, Henan Key Laboratory of Population Defects Prevention, Zhengzhou, P. R. China.
Electrophoresis; 42(14-15): 1473-1479, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33948979
ABSTRACT
Extensive population data have been reported for the 30 deletion/insertion polymorphisms (DIPs) of the Investigator DIPplex kit in different continental populations. Here, we assessed the genetic distributions of these 30 DIPs in different continental populations to pinpoint candidate ancestry-informative DIPs. We also explored the effectiveness of machine learning methods for ancestry analysis. Pairwise informativeness (In) values of the 30 DIPs revealed that six loci displayed relatively high In values (>0.1) among different continental populations. In addition, more loci showed high population-specific divergence (PSD) values in the African population. Based on the pairwise In and PSD values of the 30 DIPs, 17 DIPs in the Investigator DIPplex kit were selected for ancestry analyses of African, European, and East Asian populations. Although the 30 DIPs provided better ancestry resolution of these continental populations according to the PCA and population genetic structure results, the 17 DIPs could also distinguish these continental populations. More importantly, these 17 DIPs possessed more balanced cumulative PSD distributions in these populations. Six machine learning methods were used to perform ancestry analyses of these continental populations based on the 17 DIPs. The results revealed that naïve Bayes showed the best performance, whereas k-nearest neighbor showed relatively low performance. In summary, these machine learning methods, especially naïve Bayes, could serve as valuable tools for ancestry analysis.
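The pairwise In screening mentioned in the abstract can be illustrated with a minimal sketch. It assumes that In refers to Rosenberg's informativeness for assignment, which the abstract does not state explicitly. The function names and the allele frequencies below are hypothetical and are not taken from the study.

```python
import math

def informativeness(freqs_by_pop):
    """Informativeness for assignment (In) at one locus.

    Assumption: In is Rosenberg's statistic,
      In = sum_j ( -p_j * ln p_j + (1/K) * sum_i p_ij * ln p_ij ),
    where p_ij is the frequency of allele j in population i and
    p_j is its unweighted mean over the K populations.
    freqs_by_pop: one allele-frequency list per population, each summing to 1.
    """
    k = len(freqs_by_pop)
    n_alleles = len(freqs_by_pop[0])
    total = 0.0
    for j in range(n_alleles):
        p_mean = sum(pop[j] for pop in freqs_by_pop) / k
        if p_mean > 0:  # treat 0 * ln 0 as 0
            total -= p_mean * math.log(p_mean)
        for pop in freqs_by_pop:
            if pop[j] > 0:
                total += pop[j] * math.log(pop[j]) / k
    return total

def pairwise_dip_in(p_ins_a, p_ins_b):
    """Pairwise In for a biallelic DIP, given the insertion-allele
    frequency in each of two populations."""
    return informativeness([[p_ins_a, 1 - p_ins_a],
                            [p_ins_b, 1 - p_ins_b]])

# Hypothetical insertion-allele frequencies, for illustration only.
value = pairwise_dip_in(0.9, 0.3)
print(f"pairwise In = {value:.3f}; exceeds 0.1 cutoff: {value > 0.1}")
```

With natural logarithms, the pairwise value ranges from 0 (identical frequencies in both populations) to ln 2 (alleles fixed for opposite states), so a cutoff such as the >0.1 reported above selects loci with a marked frequency difference between the two populations.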

Full text: 1 Database: MEDLINE Main subject: Racial Groups / Machine Learning / Genetics, Population Type of study: Prognostic_studies Limit: Humans Language: En Year of publication: 2021 Document type: Article
