ABSTRACT
The use of genetic markers, specifically Short Tandem Repeats (STRs), has been a valuable tool for identifying persons of interest. However, the ability to analyze additional markers, including Single Nucleotide Polymorphisms (SNPs) and Insertion/Deletion polymorphisms (INDELs), allows laboratories to explore other investigative leads. INDELs were chosen in this study because large panels can be differentiated by size, allowing them to be genotyped by capillary electrophoresis. Moreover, these markers do not produce stutter and are smaller in size than STRs, facilitating the recovery of genetic information from degraded samples. The INDEL Ancestry Informative Markers (AIMs) in this study were selected from the 1000 Genomes Project based on a fixation index (FST) greater than 0.50, high allele frequency divergence, and genetic distance. A total of 25 INDEL-AIMs were optimized and validated according to SWGDAM guidelines in a five-dye multiplex. To validate the panel, genotyping was performed on 155 unrelated individuals from four ancestral groups (Caucasian, African, Hispanic, and East Asian). Bayesian clustering and principal component analysis (PCA) revealed clear separation among three groups, with some overlap observed within the Hispanic group. Additionally, the PCA results were compared against a training set of 793 samples from the 1000 Genomes Project, demonstrating consistent results. Validation studies showed the assay to be reproducible, tolerant to common inhibitors, robust with challenging casework-type samples, and sensitive down to 125 pg. In conclusion, our results demonstrated the robustness and effectiveness of a 25-locus INDEL system for ancestry inference of four ancestries commonly found in the United States.
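The FST-based marker screen mentioned in this abstract can be illustrated with a minimal sketch. It assumes a biallelic marker, two equally weighted populations, and the heterozygosity-based definition FST = (HT - HS) / HT; the abstract does not say which estimator the authors used, and the function names, marker names, and allele frequencies below are hypothetical.

```python
def fst_biallelic(p1, p2):
    """FST for one biallelic marker from the frequency of one allele in two
    equally weighted populations, using FST = (HT - HS) / HT."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # marker is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def select_aims(freqs, threshold=0.50):
    """Keep markers whose FST exceeds the threshold (0.50 in the abstract)."""
    return [name for name, (p1, p2) in freqs.items()
            if fst_biallelic(p1, p2) > threshold]


# Hypothetical insertion-allele frequencies in two populations.
candidates = {"indel_A": (0.95, 0.05), "indel_B": (0.60, 0.40)}
selected = select_aims(candidates)
```

With these made-up frequencies only `indel_A` passes the 0.50 cutoff, since a marker must be close to fixed for opposite alleles in the two populations to reach such a high FST.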
Subject(s)
Capillary Electrophoresis, INDEL Mutation, Principal Component Analysis, Racial Groups, Humans, Bayes Theorem, DNA Fingerprinting/methods, Gene Frequency, Genetic Markers, Population Genetics, Genotype, Microsatellite Repeats, Racial Groups/genetics, United States
ABSTRACT
Y-InDels (insertions/deletions) are genetic markers that remain extremely understudied, and it is unknown whether this type of marker can be used for genetic ancestry inference. We have developed a Y chromosome ancestry inference system tailored for forensic applications. This panel amplifies 21 Y chromosome loci, encompassing Y-InDels and Y-SNPs (single nucleotide polymorphisms), on the capillary electrophoresis (CE) platform. The system performed well at DNA concentrations greater than 0.125 ng/µL and produced accurate results at a 1:100 mixing ratio of male and female DNA. The cumulative probability of matching (CPM) was between 0.95 and 0.97 in the experimental population. The system's efficacy in inferring ancestral origins was demonstrated through intercontinental population discrimination, revealing high discrimination power between African and East Asian populations. Population genetic analyses were conducted on Han, Qiang and Hui populations in Southwest China; the smallest FST value, 0.0002, was observed between Han Chinese in Beijing (from the 1000 Genomes Project) and Qiang Chinese from Sichuan (CQSC). Phylogenetic tree construction further revealed distinct haplotypes among populations, with ethnically unique haplotypes observed in 34.6% of Hui and 7.1% of Qiang populations. K-fold cross-validation showed the system's inference abilities at the intercontinental level. In addition, our investigations identified potential associations between the Y-InDel locus Y: 15,385,547 (GRCh37) and haplogroup R1a1a1b2a2-Z2124, as well as between locus Y: 13,990,180 (GRCh37) and haplogroup F-M89. In conclusion, we have established a Y-chromosome inference system tailored for grassroots-level application, underscoring the value of incorporating Y-InDel markers in forensic analyses.
ABSTRACT
BACKGROUND: Latin American and Hispanic women are less likely to develop breast cancer (BC) than women of European descent. Observational studies have found an inverse relationship between the individual proportion of Native American ancestry and BC risk. Here, we use ancestry-informative markers to rule out potential confounding of this relationship, estimating the confounder-free effect of Native American ancestry on BC risk. METHODS AND STUDY POPULATION: We used the informativeness for assignment measure to select robust instrumental variables for the individual proportion of Native American ancestry. We then conducted separate Mendelian randomization (MR) analyses based on 1401 Colombian women, most of them from the central Andean regions of Cundinamarca and Huila, and 1366 Mexican women from Mexico City, Monterrey and Veracruz, supplemented by sensitivity and stratified analyses. RESULTS: The proportion of Colombian Native American ancestry showed a putatively causal protective effect on BC risk (inverse variance-weighted odds ratio [OR] = 0.974 per 1% increase in ancestry proportion, 95% confidence interval [CI] 0.970-0.978, p = 3.1 × 10⁻⁴⁰). The corresponding OR for Mexican Native American ancestry was 0.988 (95% CI 0.987-0.990, p = 1.4 × 10⁻⁴⁴). Stratified analyses revealed a stronger association between Native American ancestry and familial BC (Colombian women: OR = 0.958, 95% CI 0.952-0.964; Mexican women: OR = 0.973, 95% CI 0.969-0.978), and stronger protective effects on oestrogen receptor (ER)-positive BC than on ER-negative and triple-negative BC. CONCLUSIONS: The present results point to an unconfounded protective effect of Native American ancestry on BC risk in both Colombian and Mexican women which appears to be stronger for familial and ER-positive BC. These findings provide a rationale for personalised prevention programmes that take genetic ancestry into account, as well as for future admixture mapping studies.
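Because the odds ratios in this abstract are reported per 1% increase in ancestry proportion, they look close to 1 even when the effect is sizeable. The sketch below rescales a per-unit OR to a larger increment. It assumes the log-odds are linear in ancestry proportion over that range, which the per-1% scaling suggests but the abstract does not state; the function name is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the OR for a change of one unit.

    Assumes log-odds are linear in the exposure, so log(OR) scales with units.
    """
    return math.exp(units * math.log(or_per_unit))


# Colombian estimate from the abstract: OR = 0.974 per 1% (95% CI 0.970-0.978).
or_per_10 = rescale_odds_ratio(0.974, 10)
ci_per_10 = (rescale_odds_ratio(0.970, 10), rescale_odds_ratio(0.978, 10))
```

Under that assumption, a difference of 10 percentage points in Native American ancestry corresponds to an odds ratio of roughly 0.77 for the Colombian sample.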
Subject(s)
American Indian or Alaska Native, Breast Neoplasms, Female, Humans, American Indian or Alaska Native/ethnology, American Indian or Alaska Native/genetics, American Indian or Alaska Native/statistics & numerical data, Breast, Breast Neoplasms/epidemiology, Breast Neoplasms/ethnology, Breast Neoplasms/genetics, Colombia/epidemiology, Mexico/epidemiology, Triple Negative Breast Neoplasms/epidemiology, Triple Negative Breast Neoplasms/ethnology, Triple Negative Breast Neoplasms/genetics
ABSTRACT
Biogeographical origin inference for different populations can provide valuable clues in forensic investigations by narrowing the scope of the search. However, most research focuses on forensic ancestral origin analyses of major continental populations, which may provide limited information in forensic practice. To improve the ancestral resolution of East Asian populations, we systematically selected ancestry informative single-nucleotide polymorphisms (AISNPs) for differentiating Han, Dai, Japanese, and Kinh populations. In addition, we evaluated the performance of the selected AISNPs in differentiating these populations via multiple methods. In total, 116 AISNPs were selected from genome-wide data to infer the population origins of these four populations. Results of principal component analysis and population genetic structure analysis indicated that the selected 116 AISNPs could achieve ancestral resolution for most individuals. Furthermore, the machine learning model built with the 116 AISNPs showed that most individuals from these four populations could be assigned to the correct population of origin. To sum up, the selected 116 SNPs could be used for ancestral origin prediction of Han, Dai, Japanese, and Kinh populations, which could, to some extent, provide valuable information for forensic research and genome-wide association studies in East Asian populations.
Subject(s)
East Asian People, Single Nucleotide Polymorphism, Humans, Single Nucleotide Polymorphism/genetics, Genome-Wide Association Study, Population Genetics, Racial Groups/genetics, Gene Frequency, Genotype
ABSTRACT
Eye color prediction based on an individual's genetic information is of interest in the field of forensic genetics. In recent years, researchers have studied different genes and markers associated with this externally visible characteristic and have developed methods for its prediction. The IrisPlex represents a validated tool for homogeneous populations, though its applicability in populations of mixed ancestry is limited, mainly regarding the prediction of intermediate eye colors. With the aim of validating the applicability of this system in an admixed population from Argentina (n = 302), we analyzed the six single nucleotide variants used in that multiplex for eye color and four additional SNPs, and evaluated its prediction ability. We also performed a genotype-phenotype association analysis. This system proved to be useful when dealing with the extreme ends of the eye color spectrum (blue and brown) but presented difficulties in determining the intermediate phenotypes (green), which were found in a large proportion of our population. We concluded that these genetic tools should be used with caution in admixed populations and that more studies are required in order to improve the prediction of intermediate phenotypes.
Subject(s)
DNA, Eye Color, Humans, Eye Color/genetics, Argentina, Genotype, Phenotype, Single Nucleotide Polymorphism, Nucleotides, Population Genetics
ABSTRACT
In the past two decades, Y chromosome data have been generated for human population genetic studies. These Y chromosome datasets were produced with various testing methods and markers, making them difficult to combine for a comprehensive analysis. In this study, we combine four human Y chromosomal datasets of Han, Tibetan, Hui, and Li ethnic groups. The dataset contains 27 microsatellites and 137 single nucleotide polymorphisms that these populations share. We assembled a single dataset containing 2439 individuals from 25 nationwide populations in China. A systematic analysis of genetic distance and clustering was performed. To determine the gene flow of the studied population with worldwide populations, we modeled the ancestry informative markers. The reference panel was regarded as a mixture of South Asian (SAS), East Asian (EAS), European (EUR), African (AFR), and American (AMR) populations from 1000 Genomes data of Y chromosome using nonlinear data-fitting. We then calculated the admixture proportion of these four studied populations with 26 worldwide populations. The results showed that the Han and Hui have great genetic affinity, and Hui is the most admixed ethnic group, with 61.53% EAS, 34.65% SAS, 1.91% AFR, 1.56% AMR, and 0.04% EUR ancestry component (the AMR is highly admixed and thus should be ignored). All the other three ethnic groups contained more than 97% EAS ancestry component. The Li is the least admixed population in this study. The combined dataset in this study is the largest of this kind reported to date and provides reference population data for use in future paternal genetic studies and forensic genealogical identification.
Subject(s)
East Asian People, Population Genetics, Humans, Human Y Chromosomes/genetics, Asian People/genetics, Ethnicity/genetics
ABSTRACT
In forensic genetics, the use of ancestry informative single-nucleotide polymorphisms (AISNPs) panels can narrow the direction of the investigation by estimating an individual's biogeographic ancestry. However, distinguishing subgroups within continental regions requires more specific panels. In this study, we screened 19 AISNPs from the 1000 Genomes Project (1KG) based on their FST values to distinguish target populations in East Asia and obtained genotypes through SNaPshot. The 19 AISNPs could divide the global population of the 1KG into five clusters and could further divide the East Asian population into four clusters: Japanese, Han Chinese, Dai Chinese, and Kinh in Ho Chi Minh City of Vietnam. In summary, the 19-AISNP panel may serve as a useful and cost-effective tool for forensic ancestry inference in East Asian populations at a finer scale.
Subject(s)
Population Genetics, Single Nucleotide Polymorphism, Asian People/genetics, Ethnicity/genetics, Gene Frequency, Genotype, Humans, Single Nucleotide Polymorphism/genetics
ABSTRACT
Ancestry informative markers are widely used to infer ancestral origins and to estimate the ancestral genetic components of admixed populations. Given the extensive cultural exchange and admixed genetic structure of the Kyrgyz group, it is essential to enrich the genetic data available for this group. In this study, we used a self-developed ancestry informative marker-deletion/insertion polymorphic (AIM-DIP) panel to explore the ancestral components of the Chinese Kyrgyz group and the population genetic relationships between the Kyrgyz group and reference populations. Results showed that all AIM-DIP loci conformed to Hardy-Weinberg equilibrium. There were 36 AIM-DIP loci that contributed significantly to genetic information inference. Multiple statistical analyses revealed that the Chinese Kyrgyz group had a closer genetic relationship with the Chinese Uyghur group. The ancestral components of the Kyrgyz group, mostly composed of genetic components of European and East Asian populations, were more similar to the ancestral components of the Chinese Uyghur group.
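The Hardy-Weinberg check reported in this abstract can be sketched for a single biallelic deletion/insertion locus. The example uses a chi-square goodness-of-fit statistic; the abstract does not say which test the authors applied (exact tests are also common), and the function name and genotype counts below are hypothetical.

```python
def hwe_chi_square(n_ii, n_id, n_dd):
    """Chi-square statistic for Hardy-Weinberg equilibrium at a biallelic locus.

    n_ii, n_id, n_dd: observed counts of insertion/insertion,
    insertion/deletion and deletion/deletion genotypes.
    """
    n = n_ii + n_id + n_dd
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_ii + n_id) / (2 * n)  # insertion allele frequency
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ii, n_id, n_dd)
    # Skip classes with zero expected count (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

in_equilibrium = hwe_chi_square(25, 50, 25) < CRITICAL_1DF_05
heterozygote_deficit = hwe_chi_square(40, 20, 40) < CRITICAL_1DF_05
```

The first made-up sample matches the expected 1:2:1 proportions exactly, while the second has far too few heterozygotes and would be flagged as deviating from equilibrium. When many loci are tested, a multiple-testing correction is normally applied to the significance level.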
Subject(s)
Asian People, Single Nucleotide Polymorphism, Asian People/genetics, China, Gene Frequency, Genetic Structures, Population Genetics, Humans
ABSTRACT
Reports of morphological differences between European anchovy (Engraulis cf. encrasicolus) from coastal and marine habitats have long existed in the ichthyologic literature and have given rise to a long-standing debate on their taxonomic status. More recently, molecular studies have confirmed the existence of genetic differentiation between the two anchovy ecotypes. Using ancestry-informative markers, we show that coastal anchovies throughout the Mediterranean share a common ancestry and that substantial genetic differentiation persists in different pairs of coastal/marine populations despite the presence of limited gene flow. On the basis of genetic and ecological arguments, we propose that coastal anchovies deserve a species status of their own (E. maeoticus) and argue that a unified taxonomical framework is critical for future research and management.
Subject(s)
Fishes, Seafood, Animals, Ecosystem, Fishes/genetics, Gene Flow, Genetic Drift
ABSTRACT
Precision (or personalized) medicine holds great promise in the treatment of breast cancer. The success of personalized medicine is contingent upon inclusivity and representation for minority groups in clinical trials. In this article, we focus on the roadblocks for the African American demographic, including the barriers to access and enrollment in breast oncology trials, the prevailing classification of race and ethnicity, and the need to refine monolithic categorization by employing genetic ancestry mapping tools for a more accurate determination of race or ethnicity.
Subject(s)
Breast Neoplasms, Precision Medicine, Black or African American/genetics, Breast Neoplasms/genetics, Breast Neoplasms/therapy, Clinical Trials as Topic, Female, Hispanic or Latino, Humans, Minority Groups
ABSTRACT
Over the past decade, studies of admixed populations have increasingly gained interest in both medical and population genetics. These studies have so far shed light on the patterns of genetic variation throughout modern human evolution and have improved our understanding of the demographics and adaptive processes of human populations. To date, there exist about 20 methods or tools to deconvolve local ancestry. These methods have merits and drawbacks in estimating local ancestry in multiway admixed populations. In this article, we survey existing ancestry deconvolution methods, with special emphasis on multiway admixture, and compare these methods based on simulation results reported by different studies, computational approaches used, including mathematical and statistical models, and biological challenges related to each method. This should orient users on the choice of an appropriate method or tool for given population admixture characteristics and update researchers on current advances, challenges and opportunities behind existing ancestry deconvolution methods.
Subject(s)
Molecular Evolution, Human Genome, Genetic Models, Humans
ABSTRACT
The cost of SNP genotyping to screen different breeds and to estimate the exact proportion of ancestry is quite high, and it can be reduced by deriving a small panel of ancestry informative markers (AIMs). Hence, we carried out the present study to provide insight into the ancestry level inferred from a panel of informative markers in the crossbred Vrindavani population developed at ICAR-IVRI, India. We applied discriminant analysis of principal components (DAPC) to the Vrindavani cattle dataset for the first time. To confirm our method, we also performed DAPC on two other well-known crossbred cattle, Frieswal and Beefmaster. Three panels (500, 1000 and 2000 markers) were tested for clustering of individuals. Among all the panels, we found that the panel of 1000 markers selected with the DAPC-based contribution method was of the smallest size and comparatively of the highest accuracy.
Subject(s)
Cattle/genetics, Genetic Hybridization, Pedigree, Animals, Discriminant Analysis, Genetic Markers, Genome-Wide Association Study/methods, Genome-Wide Association Study/standards, Single Nucleotide Polymorphism, Principal Component Analysis, Selective Breeding
ABSTRACT
Admixed populations arise when two or more ancestral populations interbreed. As a result of this admixture, the genome of admixed populations is defined by tracts of variable size inherited from these parental groups and has particular genetic features that provide valuable information about their demographic history. Diverse methods can be used to derive the ancestry apportionment of admixed individuals, and such inferences can be leveraged for the discovery of genetic loci associated with diseases and traits, therefore having important biomedical implications. In this review article, we summarize the most common methods of global and local genetic ancestry estimation and discuss the use of admixture mapping studies in human diseases.
Subject(s)
High-Throughput Nucleotide Sequencing/methods, Biomedical Research, Genetic Loci/genetics, Genotype, Humans
ABSTRACT
BACKGROUND: Orang-utans comprise three critically endangered species endemic to the islands of Borneo and Sumatra. Though whole-genome sequencing has recently accelerated our understanding of their evolutionary history, the costs of implementing routine genome screening and diagnostics remain prohibitive. Capitalizing on a tri-fold locus discovery approach, combining data from published whole-genome sequences, novel whole-exome sequencing, and microarray-derived genotype data, we aimed to develop a highly informative gene-focused panel of targets that can be used to address a broad range of research questions. RESULTS: We identified and present genomic co-ordinates for 175,186 SNPs and 2315 Y-chromosomal targets, plus 185 genes either known or presumed to be pathogenic in cardiovascular (N = 109) or respiratory (N = 43) diseases in humans - the primary and secondary causes of captive orang-utan mortality - or a majority of other human diseases (N = 33). As proof of concept, we designed and synthesized 'SeqCap' hybrid capture probes for these targets, demonstrating cost-effective target enrichment and reduced-representation sequencing. CONCLUSIONS: Our targets are of broad utility in studies of orang-utan ancestry, admixture and disease susceptibility and aetiology, and thus are of value in addressing questions key to the survival of these species. To facilitate comparative analyses, these targets could now be standardized for future orang-utan population genomic studies. The targets are broadly compatible with commercial target enrichment platforms and can be utilized as published here to synthesize applicable probes.
Subject(s)
Genomics, Pongo, Animals, Borneo, Disease Susceptibility, Humans, Indonesia, Pongo/genetics
ABSTRACT
Ancestry-informative markers (AIMs) can be used to infer the ancestry of an individual to minimize the inaccuracy of self-reported ethnicity in biomedical research. In this study, we describe three methods for selecting AIM SNPs for the Malay population (Malay AIM panel) using different approaches based on pairwise FST, informativeness for assignment (In), and PCA-correlated SNPs (PCAIMs). These Malay AIM panels were extracted from genotype data stored in SNP arrays hosted by the Malaysian node of the Human Variome Project (MyHVP) and the Singapore Genome Variation Project (SGVP). In particular, genotype data from a total of 165 Malay individuals were analyzed, comprising data on 117 individual genotypes from the Affymetrix SNP-6 SNP array platform and data on 48 individual genotypes from the OMNI 2.5 Illumina SNP array platform. The HapMap phase 3 database (1397 individuals from 11 populations) was used as a reference for comparison with the Malay genotype data. The accuracy of each resulting Malay AIM panel was evaluated using a machine learning "ancestry-predictive model" constructed by using WEKA, a comprehensive machine learning platform written in Java. A total of 1250 SNPs were finally selected, which successfully identified Malay individuals from other world populations with an accuracy of 90%, but the accuracy decreased to 80% using 157 SNPs according to the pairwise FST method, while a panel of 200 SNPs selected using In and PCAIMs could be used to identify Malay individuals with an accuracy of approximately 80%.
Subject(s)
Genetic Databases, Ethnicity/genetics, Population Genetics/methods, Genotype, Single Nucleotide Polymorphism, Asian People/genetics, Genetic Markers, HapMap Project, Humans, Malaysia/ethnology, Statistical Models, Native Hawaiian or Other Pacific Islander/genetics, Principal Component Analysis, Singapore
ABSTRACT
BACKGROUND: Europeans and American Indians are the major genetic ancestries of Hispanics in the U.S. These ancestral groups have markedly different incidence rates and outcomes in many types of cancers. Therefore, genetic admixture may bias genetic association studies of cancer susceptibility variants, specifically in Hispanics. For example, the incidence rate of liver cancer has been shown to differ substantially between Hispanic, Asian and non-Hispanic white populations. Currently, ancestry informative marker (AIM) panels with up to a few hundred ancestry-informative single nucleotide polymorphisms (SNPs) are widely used to infer ancestry admixture. Notably, currently available AIMs are predominantly located in intronic and intergenic regions, while the whole exome sequencing (WES) protocols commonly used in translational research and clinical practice do not cover these markers. Thus, it remains challenging to accurately determine a patient's admixture proportion without additional DNA testing. RESULTS: In this study we designed a unique AIM panel that infers 3-way genetic admixture from three distinct and selective continental populations (African (AFR), European (EUR), and East Asian (EAS)) within evolutionarily conserved exonic regions. Initially, about 1 million exonic SNPs from the three selected populations in the 1000 Genomes Project were trimmed by their linkage disequilibrium (LD) and restricted to biallelic variants; finally, we optimized an AIM panel of 250 SNP markers, the UT-AIM250 panel, using their ancestral informativeness statistics. Compared to published AIM panels, UT-AIM250 achieved better accuracy when tested with the three ancestral populations (accuracy: 0.995 ± 0.012 for AFR, 0.997 ± 0.007 for EUR, and 0.994 ± 0.012 for EAS). We further demonstrated the performance of the UT-AIM250 panel on admixed American (AMR) samples of the 1000 Genomes Project and obtained results (AFR, 0.085 ± 0.098; EUR, 0.665 ± 0.182; and EAS, 0.250 ± 0.205) similar to those of previously published AIM panels (Phillips-AIM34: AFR, 0.096 ± 0.127, EUR, 0.575 ± 0.290, and EAS, 0.330 ± 0.315; Wei-AIM278: AFR, 0.070 ± 0.096, EUR, 0.537 ± 0.267, and EAS, 0.393 ± 0.300). Subsequently, we applied the UT-AIM250 panel to a clinical dataset of 26 self-reported Hispanic patients in South Texas with hepatocellular carcinoma (HCC). We estimated the admixture proportions using WES data of adjacent non-cancer liver tissues (AFR, 0.065 ± 0.043; EUR, 0.594 ± 0.150; and EAS, 0.341 ± 0.160). Similar admixture proportions were identified from corresponding tumor tissues. In addition, we estimated admixture proportions of The Cancer Genome Atlas (TCGA) collection of hepatocellular carcinoma (TCGA-LIHC) samples (376 patients) using the UT-AIM250 panel. The panel obtained consistent admixture proportions from tumor and matched normal tissues, identified 3 possibly incorrectly reported race/ethnicity labels, and/or provided race/ethnicity determination if necessary. CONCLUSIONS: Here we demonstrated the feasibility of using evolutionarily conserved exonic regions to infer admixture proportions and provided a robust and reliable control for sample collection or patient stratification for genetic analysis. An R implementation of UT-AIM250 is available at https://github.com/chenlabgccri/UT-AIM250.
Subject(s)
Human Genome/genetics, Genome-Wide Association Study/methods, Hispanic or Latino/genetics, Hepatocellular Carcinoma/ethnology, Hepatocellular Carcinoma/genetics, Ethnicity/genetics, Exons/genetics, Gene Frequency, Genetic Testing, Population Genetics, Genotype, Humans, Liver Neoplasms/ethnology, Liver Neoplasms/genetics, Single Nucleotide Polymorphism, Software
ABSTRACT
Ancestry informative markers play an important role in medical genetics and forensic analyses. Several ancestry informative SNP panels have been developed and validated that can differentiate global populations into continental or major regional groups. These global panels have served as good first-tier genetic markers; however, their performance in discriminating populations within regions appears unsatisfactory. To boost ancestry inference for regional populations, second-tier panels with more refined discrimination power among subpopulations within each of the regions need to be developed. In East Asia, Han Chinese, Japanese, and Korean populations show highly similar externally visible characteristics and are genetically closely related. Reliable ancestry informative genetic markers appear invaluable in discriminating these populations. In the present study, we compiled a genome-wide SNP dataset comprising 317,439 clean SNPs for a total of 1101 unrelated individuals from Han Chinese (817), Koreans (184), and Japanese (100). From this starting dataset, we developed a set of four nested ancestry informative SNP panels including 36, 59, 98, and 142 SNPs, respectively. The results of cross-validation tests indicate that these panels can discriminate the Chinese Han, Japanese, and Korean populations with overall average accuracies ranging from 90% to 99%. In further performance assessments, these panels also showed high sensitivity and specificity. In combination with the first-tier global panels, these second-tier panels would contribute to medical genetics and forensic research in East Asia.
Subject(s)
Asian People/genetics, Population Genetics, Single Nucleotide Polymorphism, China, Humans, Japan, Republic of Korea
ABSTRACT
Inference of ancestry from biological evidence can provide investigative information, especially for unknown DNA donors. Although tools for predicting ancestry have been developing, ancestry research focusing on populations relevant for South Korea is not common and markers are seldom chosen specifically to differentiate Koreans from other East Asian and South East Asian populations. Here, we report ancestry informative markers (AIMs) for distinguishing six East/South East Asian regional populations: China, Japan, Indonesia, Philippines, South Korea and Thailand. Individual genotypes from these six populations were available in PanSNPdb: The HUGO Pan-Asian SNP Database. To select AIMs, we calculated four population divergence metrics for each SNP: Nei's FST, Rosenberg's Informativeness (In), the average absolute allele frequency difference between populations (δFmean) and the maximum allele frequency difference between populations (δFmax). Based on these values, we selected 100 single nucleotide polymorphisms (SNPs) for distinguishing the six populations, 13 of which exhibited large allele frequency differences between Koreans and non-Koreans. To assess the performance of the AIMs, we performed principal coordinates analysis (PCoA) on the individuals from all six populations and inferred ancestral population clusters using the STRUCTURE program. In conclusion, we found that the selected AIMs can be applied to distinguish the six East/South East Asian groups and we suggest the markers in this study will be helpful to establish ancestry panels for Korea and neighbouring populations.
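Three of the four divergence metrics named in this abstract can be sketched for a single biallelic SNP. The informativeness function follows Rosenberg's In with natural logarithms and equally weighted populations; treating δFmean as the mean over all population pairs is an assumption, since the abstract does not define it further. Function names and frequencies are hypothetical.

```python
import math


def delta_stats(freqs):
    """Mean and maximum absolute allele frequency difference over all
    population pairs. freqs holds one allele's frequency per population."""
    diffs = [abs(a - b) for i, a in enumerate(freqs) for b in freqs[i + 1:]]
    return sum(diffs) / len(diffs), max(diffs)


def informativeness(freqs):
    """Rosenberg's informativeness for assignment (In) for a biallelic SNP
    across K equally weighted populations."""
    k = len(freqs)
    total = 0.0
    # Sum over both alleles: the listed allele and its complement.
    for allele_freqs in (freqs, [1 - p for p in freqs]):
        p_mean = sum(allele_freqs) / k
        if p_mean > 0:
            total -= p_mean * math.log(p_mean)
        total += sum(p * math.log(p) for p in allele_freqs if p > 0) / k
    return total


# Hypothetical frequencies of one allele in three populations.
d_mean, d_max = delta_stats([0.1, 0.5, 0.9])
i_n = informativeness([0.1, 0.5, 0.9])
```

In is zero when all populations share the same allele frequencies and reaches ln K when each of K populations is fixed for a private allele, so for a biallelic SNP it cannot exceed ln 2. Ranking SNPs by several metrics and keeping those that score well on all of them is one plausible way to combine them; the abstract does not describe the authors' combination rule.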
Subject(s)
Asian People/genetics, Genetic Markers, Population Genetics, Single Nucleotide Polymorphism, Asia, DNA Fingerprinting, Genetic Databases, Gene Frequency, Genotype, Humans, Principal Component Analysis
ABSTRACT
Ancestry-informative markers (AIMs) are markers that give information about the ancestry of individuals. They are used in forensic genetics for predicting the geographic origin of the investigated individual in crime and identification cases. In the exploration of the genogeographic origin of an AIMs profile, the likelihoods of the AIMs profile in various populations may be calculated. However, there may not be an appropriate reference population in the database. The fact that the likelihood ratio (LR) of one population compared to that of another population is large does not imply that any of the populations is relevant. To handle this phenomenon, we derived a likelihood ratio test (LRT) that is a measure of absolute concordance between an AIMs profile and a population rather than a relative measure of the AIMs profile's likelihood in two populations. The LRT is similar to a Fisher's exact test. By aggregating over markers, the central limit theorem suggests that the resulting quantity is approximately normally distributed. If only a few markers are genotyped or if the majority of the markers are fixed in a given population, the approximation may fail. We overcome this using importance sampling and show how exponential tilting results in an efficient proposal distribution. By simulations and published AIMs profiles, we demonstrate the applicability of the derived methodology. For the genotyped AIMs, the LRT approach achieves the nominal levels of rejection when tested on data from five major continental regions.
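The relative likelihood ratio that this abstract contrasts with its absolute test can be sketched as follows. The example assumes independent biallelic markers and Hardy-Weinberg proportions within each population; it does not reproduce the authors' LRT, importance sampling, or exponential tilting. Function names, the profile, and the frequencies are hypothetical.

```python
import math


def genotype_prob(n_copies, p):
    """Probability of carrying 0, 1 or 2 copies of an allele with population
    frequency p, assuming Hardy-Weinberg proportions."""
    q = 1 - p
    return (q * q, 2 * p * q, p * p)[n_copies]


def profile_log_likelihood(profile, freqs):
    """Natural-log likelihood of an AIMs profile in one population,
    treating markers as independent."""
    total = 0.0
    for n_copies, p in zip(profile, freqs):
        prob = genotype_prob(n_copies, p)
        if prob == 0:
            return float("-inf")  # genotype impossible under these frequencies
        total += math.log(prob)
    return total


def log10_lr(profile, freqs_pop1, freqs_pop2):
    """log10 likelihood ratio of population 1 versus population 2."""
    diff = (profile_log_likelihood(profile, freqs_pop1)
            - profile_log_likelihood(profile, freqs_pop2))
    return diff / math.log(10)


# Hypothetical two-marker profile (allele copy counts) and frequencies.
lr = log10_lr([2, 0], [0.9, 0.1], [0.1, 0.9])
```

A large value of `lr` only says that the first population fits better than the second. Both likelihoods can still be small in absolute terms, which is the situation the abstract's absolute concordance test is designed to detect.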
Subject(s)
Genetic Markers, Population Genetics, Likelihood Functions, Genetic Models, Computer Simulation, Denmark, Forensic Genetics/methods, Gene Frequency, Genotype, Geography, Greenland, Humans, Polymerase Chain Reaction, White People
ABSTRACT
Conventional forensic DNA analysis involves a matching principle, which compares DNA profiles from evidential samples to those from reference samples of known origin. In casework, however, the accessibility to a reference sample is not guaranteed which limits the use of DNA as an investigative tool. This has led to the development of phenotype prediction, which uses SNP analysis to estimate the physical appearance of the sample donor. Physical traits, such as eye, hair and skin colour, have been associated with certain alleles within specific genes involved in the melanogenesis pathways. These genetic markers are also associated with ancestry and their trait prediction ability has mainly been assessed in European and North American populations. This has prompted research investigating the discriminatory power of these markers in other populations, especially those exhibiting admixture. South Africa is well known for its diversity, and the viability of these particular SNPs still needs to be assessed within this population. South African law currently restricts the use of DNA for molecular phenotyping, and there are also numerous ethical and social considerations, all of which are discussed.