ABSTRACT
Massively parallel sequencing allows integrated genotyping of different types of forensic markers, which reduces DNA consumption, simplifies experimental workflows, and provides additional sequence-based genetic information. The STRseqTyper122 kit genotypes 63 autosomal STRs, 16 X-STRs, 42 Y-STRs, and the Amelogenin locus. Amplicon sizes of 117 loci were below 300 bp. This study presents MiSeq FGx sequencing metrics for STRseqTyper122. Genotyping accuracy was examined by comparison with certified genotypes of NIST standard reference materials and with results from five capillary electrophoresis-based kits. The sensitivity of STRseqTyper122 reached 125 pg, and > 80% of the loci were correctly called with 62.5 pg and 31.25 pg of input genomic DNA. Repeatability, species specificity, and tolerance to DNA degradation and PCR inhibitors were also evaluated. STRseqTyper122 performed reliably with routine casework samples and provides a powerful tool for forensic applications.
Subjects
DNA Fingerprinting , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Humans , DNA Fingerprinting/methods , Amelogenin/genetics , Reproducibility of Results , DNA Sequence Analysis/methods , Genotype , Polymerase Chain Reaction , Species Specificity , Male , Animals , Necrotic DNA Degradation , Capillary Electrophoresis , Female
ABSTRACT
Y-chromosome single nucleotide polymorphisms (Y-SNPs), which are uniparentally inherited haploid genetic markers, can provide deep insight into the human evolutionary past, forensic pedigrees, and biogeographical ancestry. Several international cross-continental or regional Y-SNP panels, rather than whole Y-chromosome sequencing, have recently been developed to promote Y-chromosome tools in forensic practice. However, the next-generation sequencing (NGS) panels developed explicitly for Chinese populations are insufficient to represent Chinese Y-chromosome genetic diversity and complex population structure, especially for the Chinese-predominant haplogroup O. We developed and validated a 639-plex panel including 633 Y-SNPs and 6 Y-insertion/deletions, which covers 573 Y haplogroups on the Y-DNA haplogroup tree. In this panel, subgroups of haplogroup O account for 64.4% of the total inferable haplogroups. We report the sequencing metrics of 354 libraries sequenced with this panel; the average sequencing depth among 226 individuals was 3,741×. We demonstrated the high concordance, accuracy, reproducibility, and specificity of the 639-plex panel and found that 610 loci were genotyped with as little as 0.03 ng of genomic DNA in the sensitivity test. In male-female mixed DNA samples with a mix ratio of 1:500, 94.05% of the 639 loci were detectable. Nearly all of the loci were genotyped correctly when no more than 25 ng/µL tannic acid, 20 ng/µL humic acid, or 37.5 µM hematin was added to the amplification mixture. More than 80% of genotypes were obtained from degraded DNA samples with a degradation index of 11.76. Individuals from the same pedigree shared identical genotypes in 11 male pedigrees. Finally, we present the complex evolutionary history of 183 northern Han Chinese and six other Chinese populations, and found multiple founding lineages that contributed to the northern Han Chinese gene pool. The 639-plex panel proved an efficient tool for Chinese paternal studies and forensic applications.
Subjects
East Asian People , Single Nucleotide Polymorphism , Humans , Genotype , Reproducibility of Results , Population Genetics , Haplotypes , Human Y Chromosomes/genetics , DNA
ABSTRACT
Identifying the types of body fluids left at a crime scene can be essential to reconstructing the scene and inferring criminal behavior. MicroRNA (miRNA) molecules extracted from traces of body fluids are among the most promising biomarkers for this identification because of their high expression, extreme stability and tissue specificity. However, the detection of miRNA markers does not answer a yes-no question; it gives the probability of an assumption. It is therefore crucial to develop methods that combine multiple miRNAs with computational algorithms. In this study, we systematically analyzed the expression of the 10 most probable body fluid-specific miRNA markers (miR-451a, miR-205-5p, miR-203a-3p, miR-214-3p, miR-144-3p, miR-144-5p, miR-654-5p, miR-888-5p, miR-891a-5p and miR-124-3p) in 605 body fluid-related samples, including peripheral blood, menstrual blood, saliva, semen and vaginal secretion. We introduced the kernel density estimation (KDE) method and six well-established methods to classify the body fluids in order to find the optimal combination of miRNA markers and the corresponding classification method. The results show that the combination of miR-451a, miR-891a-5p, miR-144-5p and miR-203a-3p together with KDE achieves the most accurate and robust performance according to cross-validation, independent tests and random perturbation tests. This systematic analysis suggests a reference scheme for identifying body fluids in an accurate and stable manner.
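The KDE-based classification described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian product kernel, the fixed bandwidth, the equal class priors, and the two-marker training values are all assumptions made for illustration only.

```python
import math

def gaussian_kde_density(x, samples, bandwidth):
    """Estimate the density at point x from training samples of one class,
    using an isotropic Gaussian kernel (an assumed kernel choice)."""
    dims = len(x)
    norm = (bandwidth * math.sqrt(2.0 * math.pi)) ** dims
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)

def classify(x, training, bandwidth=1.0):
    """Return the class label whose estimated density at x is highest.
    Equal class priors are assumed."""
    return max(
        training,
        key=lambda label: gaussian_kde_density(x, training[label], bandwidth),
    )

# Hypothetical normalized expression values for two markers per sample.
example_training = {
    "semen": [(1.0, 9.0), (1.5, 8.5)],
    "peripheral blood": [(9.0, 1.0), (8.5, 1.2)],
}
```

A call such as `classify((8.8, 1.1), example_training)` picks the class whose training samples lie closest in expression space. A real model would need a bandwidth chosen by cross-validation and far more training samples.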
Subjects
Body Fluids , Forensic Genetics , Genetic Markers , MicroRNAs/genetics , Real-Time Polymerase Chain Reaction , Adult , Algorithms , Female , Humans , Male
ABSTRACT
OBJECTIVES: To establish a system for simultaneous detection of miR-888 and miR-891a by droplet digital PCR (ddPCR), and to evaluate its application value in semen identification. METHODS: The hydrolysis probes with different fluorescence modified reporter groups were designed to realize the detection of miR-888 and miR-891a by duplex ddPCR. A total of 75 samples of 5 body fluids (including peripheral blood, menstrual blood, semen, saliva and vaginal secretion) were detected. The difference analysis was conducted by Mann-Whitney U test. The semen differentiation ability of miR-888 and miR-891a was evaluated by ROC curve analysis and the optimal cut-off value was obtained. RESULTS: There was no significant difference between the dual-plex assay and the single assay in this system. The detection sensitivity was up to 0.1 ng total RNA, and the intra- and inter-batch coefficients of variation were less than 15%. The expression levels of miR-888 and miR-891a detected by duplex ddPCR in semen were both higher than those in other body fluids. ROC curve analysis showed that the AUC of miR-888 was 0.976, the optimal cut-off value was 2.250 copies/µL, and the discrimination accuracy was 97.33%; the AUC of miR-891a was 1.000, the optimal cut-off value was 1.100 copies/µL, and the discrimination accuracy was 100%. CONCLUSIONS: In this study, a method for detection of miR-888 and miR-891a by duplex ddPCR was successfully established. The system has good stability and repeatability and can be used for semen identification. Both miR-888 and miR-891a have high ability to identify semen, and the discrimination accuracy of miR-891a is higher.
Subjects
Body Fluids , MicroRNAs , Female , Humans , Body Fluids/chemistry , MicroRNAs/analysis , Real-Time Polymerase Chain Reaction/methods , Saliva/chemistry , Semen/chemistry , Male
ABSTRACT
Massively parallel sequencing of forensic STRs simultaneously provides length-based genotypes and core repeat sequences as well as flanking sequence variations. Here, we report primer sequences and concentrations of a next-generation sequencing (NGS)-based in-house panel covering 28 autosomal STR loci (CSF1PO, D1GATA113, D1S1627, D1S1656, D1S1677, D2S441, D2S1776, D3S3053, D5S818, D6S474, D6S1017, D6S1043, D8S1179, D9S2157, D10S1435, D11S4463, D13S317, D14S1434, D16S539, D18S51, D18S853, D20S482, D20S1082, D22S1045, FGA, TH01, TPOX, and vWA) and the sex determinant locus Amelogenin. Preliminary evaluation experiments showed that the panel yielded intralocus- and interlocus-balanced sequencing data with a sensitivity as low as 62.5 pg of input DNA. A total of 203 individuals from the Yunnan Bai population were sequenced with this panel. Comparative forensic genetic analyses showed that the sequence-based matching probability of this 29-plex panel reached 2.37 × 10⁻²⁹, which was 23 times lower than that of the length-based data. Compound stutter sequences of eight STRs were compared with parental alleles. For seven loci, repeat motif insertions or deletions occurred in the longest uninterrupted repeat sequences (LUS). However, LUS and non-LUS stutters co-existed at the D6S474 locus with different sequencing depth ratios. These results supplement current knowledge of forensic STR stutters and provide a sound basis for DNA mixture deconvolution.
Subjects
Forensic Genetics/methods , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats/genetics , DNA Sequence Analysis/methods , Asian People/genetics , China , Humans , Multiplex Polymerase Chain Reaction
ABSTRACT
Blood samples are the most common and important biological samples found at crime scenes, and distinguishing peripheral blood and menstrual blood samples is crucial for solving criminal cases. MicroRNAs (miRNAs) are important molecules with strong tissue specificity that can be used in forensic fields to identify the tissue properties of body fluid samples. In this study, the relative expression levels of four different miRNAs (miR-451, miR-205, miR-214 and miR-203) were analysed by real-time PCR, with 200 samples from 5 different body fluids, including two kinds of blood samples (peripheral blood and menstrual blood) and three kinds of non-blood samples (saliva, semen and vaginal secretion). Then, a strategy for identifying menstrual and peripheral blood based on Fisher's discriminant function and the relative expression of multiple miRNAs was established. Two sets of functions were used: Z1 and Z2 were used to distinguish blood samples from non-blood samples, and Y1 and Y2 were used to distinguish peripheral blood from menstrual blood. A 100% accuracy rate was achieved when 50 test samples were used. Ten samples were used to test the sensitivity of the method, and 10 ng or more of total RNA from peripheral blood samples and 10 pg or more of total RNA from menstrual blood samples were sufficient for this method. The results provide a scientific reference to address the difficult forensic problem of distinguishing menstrual blood from peripheral blood.
Subjects
Blood Chemical Analysis , Bodily Secretions/chemistry , Menstruation/blood , MicroRNAs/analysis , Discriminant Analysis , Female , Forensic Medicine/methods , Humans , Male , Real-Time Polymerase Chain Reaction , Sensitivity and Specificity
ABSTRACT
In human society, the face is visible and recognizable through facial shape variation, which represents a set of highly polygenic and correlated complex traits. Understanding the genetic basis of facial shape traits has important implications for population genetics, developmental biology, and forensic science. A number of single nucleotide polymorphisms (SNPs) are associated with human facial shape variation, mostly in European populations. To bridge the gap between European and Asian populations in terms of the genetic basis of facial shape variation, we examined the effect of these SNPs in a European-Asian admixed Eurasian population comprising 612 individuals. The coordinates of 17 facial landmarks were derived from high-resolution 3dMD facial images, and 136 Euclidean distances between all pairs of landmarks were quantitatively derived. DNA samples were genotyped using the Illumina Infinium Global Screening Array and imputed using the 1000 Genomes reference panel. Genetic association between 125 previously reported facial shape-associated SNPs and 136 facial shape phenotypes was tested using linear regression. Eight SNPs from different loci showed significant association with one or more facial shape traits after adjusting for multiple testing (significance threshold p < 1.28 × 10⁻³), together explaining up to 6.47% of sex-, age-, and BMI-adjusted facial phenotype variance. These included EDAR rs3827760, LYPLAL1 rs5781117, PRDM16 rs4648379, PAX3 rs7559271, DKK1 rs1194708, TNFSF12 rs80067372, CACNA2D3 rs56063440, and SUPT3H rs227833. Notably, the EDAR rs3827760 and LYPLAL1 rs5781117 SNPs displayed significant association with eight and seven facial phenotypes, respectively (2.39 × 10⁻⁵ < p < 1.28 × 10⁻³). The majority of these SNPs showed distinct allele frequencies between the European and East Asian reference panels from the 1000 Genomes Project. These results detail how the above eight genes influence facial shape variation in a Eurasian population.
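The step from 17 landmarks to 136 phenotypes follows from taking every unordered pair of landmarks, since C(17, 2) = 136. The sketch below shows that derivation on placeholder coordinates; the function name and the synthetic landmark values are illustrative assumptions, not data from the study.

```python
import math
from itertools import combinations

def pairwise_distances(landmarks):
    """Return the Euclidean distance for every unordered pair of landmarks,
    keyed by the pair of landmark indices."""
    return {
        (i, j): math.dist(landmarks[i], landmarks[j])
        for i, j in combinations(range(len(landmarks)), 2)
    }

# Placeholder 3D coordinates for 17 landmarks (not real facial data).
demo_landmarks = [(float(i), 0.0, 0.0) for i in range(17)]
demo_distances = pairwise_distances(demo_landmarks)
```

Each of the resulting distances can then serve as one quantitative phenotype in a per-SNP linear regression.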
Subjects
Face/anatomy & histology , Population Genetics , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism , Adolescent , Adult , Asian People/ethnology , China , Female , Gene Frequency , Genotype , Humans , Male , Middle Aged , Phenotype , White People/ethnology , Young Adult
ABSTRACT
Tibetans have adapted to the extreme environment of high altitude over hundreds of generations. A 5-SNP (single nucleotide polymorphism) haplotype motif (AGGAA) in a hypoxic pathway gene, EPAS1, is highly differentiated between Tibetans and lowlanders. To evaluate the potential use of the 5-SNP haplotype in ancestry inference for Tibetan or Tibetan-related populations, we analyzed this haplotype in 1053 individuals from 12 Chinese populations residing on the Tibetan Plateau, in peripheral regions of Tibet, and in plain regions. These data were integrated with genotypes from the 1000 Genomes Project populations and from populations in a previously reported paper for population structure analyses. We found that populations representing highland and lowland groups have different dominant ancestry components. The core Denisovan haplotype (AGGAA) was observed at a frequency of 72.32% on the Tibetan Plateau, at frequencies ranging from 9.48 to 21.05% in the peripheral regions, and at < 2.5% in the plains. At the individual level, 87.57% of the individuals from the Tibetan Plateau carried the archaic haplotype, while < 5% of Han Chinese individuals carried it. Our findings indicate that the 5-SNP haplotype has a distinctive distribution pattern in populations of Tibet and peripheral regions and could be integrated into AISNP (ancestry informative single nucleotide polymorphism) panels to enhance ancestry resolution.
Subjects
Asian People/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics , Haplotypes , Single Nucleotide Polymorphism , Physiological Adaptation , Altitude , China , Genotype , Humans , Tibet/ethnology
ABSTRACT
Anthropology has traditionally divided humans into three major continental groups: East Asian, European and African. A 27-plex single nucleotide polymorphism (SNP) panel for ancestry information has been established to differentiate samples from East Asian, European, African and East Asian-European admixed populations by genotyping and ancestry inference. To infer the ancestry of unknown individuals, we established an optimized analysis pipeline based on the likelihood ratio, ancestry component and individual ancestry assignment. Four samples from East Asian, European, African and East Asian-European admixed populations were tested using the optimized analysis pipeline. Cross-validation within the basic reference database and validation with 1,010 test samples were both used to evaluate the inference process. The results showed that the accuracy of the method was higher than 99% in East Asian, European, African and admixed populations. The inference method can characterize the ancestry of DNA donors and has important practical value in human molecular and forensic genetics.
Subjects
Population Genetics , Single Nucleotide Polymorphism , Forensic Genetics , Humans
ABSTRACT
The identification of biological stains and their tissue source is an important part of forensic research. Current methods suffer from several limitations, including poor sensitivity and specificity, difficulty with trace samples, and sample destruction. In this study, we profiled the proteomes of menstrual blood, peripheral blood, saliva, semen, and vaginal fluid using mass spectrometry. Tissue-enhanced and tissue-specific proteins of each group are proposed as potential biomarkers. These candidate proteins were further annotated and screened against the Human Protein Atlas database. Our data not only validate the protein biomarkers reported in previous studies but also identify novel candidate biomarkers for human body fluids. These candidates lay the foundation for the development of rapid and specific forensic examination methods.
Subjects
Body Fluids , Proteomics , Female , Humans , Body Fluids/chemistry , Saliva/chemistry , Biomarkers/analysis , Mass Spectrometry , Proteome/analysis , Proteome/metabolism , Semen/chemistry , Forensic Genetics
ABSTRACT
MicroRNA (miRNA)-based methods for body fluid identification are promising tools in the practice of forensic science. The selection of appropriate endogenous reference genes as normalizers for the relative quantification of miRNA expression levels using quantitative reverse transcription-polymerase chain reaction (RTqPCR) is essential to avoid errors and improve the comparability of miRNA expression level data among different body fluids. In this study, small RNAs were isolated from individual donations of five forensically relevant body fluids (peripheral blood, menstrual blood, saliva, semen and vaginal secretions). Thirty-seven samples were subjected to high-throughput miRNA sequencing. By combining our results with those obtained through a literature investigation, 28 candidate RNAs were identified. Following RTqPCR validation, the candidate RNAs were preliminarily evaluated in 15 samples to exclude miRNAs with low expression and high variation. Then, the expression levels of 10 relatively stable candidate reference RNAs in 100 samples were determined and further analysed using four commonly employed programs (geNorm, NormFinder, BestKeeper and ΔCq). According to the comprehensive stability rankings of the four algorithms, miR-320a-3p was validated as the most stable endogenous reference gene among the five forensically relevant body fluids, followed by miR-484, SNORD43, miR-320c and RNU6b. Moreover, the combined application of miR-320a-3p with RNU6b could increase the normalization effect. In addition, a total of 56 mock samples placed outdoors and indoors for different times were prepared to further evaluate the stability of candidate reference RNAs, and miR-320a-3p remained the preferred reference gene. Furthermore, the relative expression levels of publicly accepted body fluid-specific miRNAs were determined in 30 samples to verify the practicality and effectiveness of the reference genes. 
Our results revealed a set of alternative reference genes and could promote the development and application of miRNA-based body fluid identification by determining optional reference genes for strict normalization.
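The role of a reference gene in relative quantification can be illustrated with the standard ΔCq calculation. This is a minimal sketch assuming 100% amplification efficiency (a doubling per cycle) and assuming that multiple reference genes are combined by averaging their Cq values; the function name and the example Cq values are hypothetical and are not taken from the study.

```python
from statistics import mean

def relative_expression(cq_target, cq_references):
    """Return target expression relative to the reference gene(s) as 2^(-dCq),
    where dCq = Cq(target) - mean Cq(references).
    Assumes perfect doubling per PCR cycle."""
    if not cq_references:
        raise ValueError("at least one reference Cq value is required")
    delta_cq = cq_target - mean(cq_references)
    return 2.0 ** (-delta_cq)

# Hypothetical Cq values: a target miRNA normalized to one reference,
# then to the average of two references.
single_ref = relative_expression(25.0, [27.0])
dual_ref = relative_expression(26.0, [24.0, 28.0])
```

Averaging Cq values across two references corresponds to taking the geometric mean of their quantities. This is one common way to combine normalizers such as miR-320a-3p and RNU6b.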
Subjects
Body Fluids , MicroRNAs , Female , Humans , MicroRNAs/metabolism , Body Fluids/chemistry , Saliva/chemistry , Semen/chemistry , Forensic Medicine , Real-Time Polymerase Chain Reaction , Gene Expression Profiling
ABSTRACT
More accurate identification of the types of body fluids left at a crime scene is indispensable for improving the judicial chain of evidence. MicroRNAs (miRNAs) have become recognized as ideal molecular markers for the identification of body fluids in forensic science due to their short length, stability and high tissue specificity. In this study, small RNA sequencing was performed on 20 samples of five types of body fluids (peripheral blood, menstrual blood, saliva, semen, and vaginal secretions) with the BGISEQ-500 sequencing platform, and the specific miRNA markers of saliva and vaginal secretions were screened by bioinformatics methods, including differential expression analysis and significant enrichment analysis. Through RT-qPCR validation of 169 samples, we confirmed that miR-223-3p can be used as a saliva-specific marker. In addition, we considered miR-223-3p in combination with four other miRNA molecules (miR-451a, miR-891a-5p, miR-144-5p, miR-203a-3p) that had been previously screened and verified in our laboratory, and seven body fluid prediction models based on machine learning algorithms were constructed and verified. The results showed that a kernel density estimation (KDE) model based on the five miRNA markers for body fluid identification could achieve 100% accuracy in the samples tested in the present study.
Subjects
Body Fluids , MicroRNAs , Female , Humans , Saliva , Forensic Genetics/methods , MicroRNAs/analysis , Body Fluids/chemistry , Biomarkers/metabolism
ABSTRACT
Distinguishing menstrual blood from peripheral blood is vital for forensic casework, as it can provide strong evidence about the nature of some criminal cases. However, to date no single blood-specific gene, including the most variable microRNAs (miRNAs), works well for identifying the blood source. In this study, we developed a new strategy for identifying human blood samples using the copy number ratio of miR-451a to miR-21-5p, based on 133 samples, including 56 menstrual blood and 47 peripheral blood samples, as well as 30 non-blood samples of saliva (10), semen (10) and vaginal secretion (10). The cut-off values and efficacy of the identification strategy were determined through receiver operating characteristic (ROC) analysis. Our results showed that when the miR-451a/miR-21-5p ratio is below 0.929, the sample should be non-blood. When the ratio is above 0.929 and below 10.201, the sample should be menstrual blood, and when the ratio is above 10.201, the sample should be peripheral blood. External validation using 86 samples (62 menstrual blood and 24 peripheral blood samples) fully supported this strategy, with 100% sensitivity and 100% specificity. We confirmed that the accuracy of this result was not affected by various potential confounding factors of the samples or by different experimental platforms. We showed that 0.2 ng of total RNA from menstrual blood and peripheral blood was sufficient for qPCR quantification. In conclusion, our results provide an accurate reference for distinguishing menstrual blood from peripheral blood in forensic authentication.
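The two-threshold decision rule above can be written as a short function. The cut-off values 0.929 and 10.201 come from the abstract; the function name and the handling of a ratio exactly equal to a cut-off (assigned to the higher category here) are assumptions, since the abstract does not specify that case.

```python
NON_BLOOD_CUTOFF = 0.929    # below this ratio: non-blood (from the abstract)
PERIPHERAL_CUTOFF = 10.201  # above this ratio: peripheral blood (from the abstract)

def classify_by_ratio(mir451a_copies, mir21_5p_copies):
    """Classify a sample from the miR-451a / miR-21-5p copy number ratio.
    Boundary handling at exactly a cut-off is an assumption of this sketch."""
    if mir21_5p_copies <= 0:
        raise ValueError("miR-21-5p copy number must be positive")
    ratio = mir451a_copies / mir21_5p_copies
    if ratio < NON_BLOOD_CUTOFF:
        return "non-blood"
    if ratio < PERIPHERAL_CUTOFF:
        return "menstrual blood"
    return "peripheral blood"
```

For example, copy numbers of 500 and 100 give a ratio of 5.0, which falls between the two cut-offs and is therefore reported as menstrual blood.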
Subjects
Body Fluids , MicroRNAs , Body Fluids/chemistry , Female , Humans , MicroRNAs/analysis , MicroRNAs/genetics , Real-Time Polymerase Chain Reaction , Saliva/chemistry , Semen/chemistry
ABSTRACT
Microhaplotypes are highly regarded for forensic DNA mixture deconvolution because they are not subject to stutter interference in the way short tandem repeat markers are, and they tend to be more polymorphic than single nucleotide polymorphism markers. However, forensic microhaplotype kits have not been reported. The MHSeqTyper47 kit genotypes 47 microhaplotype loci. This study presents MiSeq FGx sequencing metrics for MHSeqTyper47 and examines the genotyping accuracy of the kit. The sensitivity of MHSeqTyper47 reached 62.5 pg, and full genotyping results were obtained from degraded DNA samples with degradation indexes ≤ 3.00. Full genotypes were obtained in the presence of 100 ng/µL tannin, 50 µM heme, 25 ng/µL humic acid, and 1.25 µg/µL indigo dye. In DNA mixture studies, a minimum of 31 loci of the minor contributor were correctly genotyped at 1:99 or 99:1 mixing ratios, with the cumulative random matching probability of these loci reaching 4.54 × 10⁻²⁵. Mixing ratios could be reliably predicted from two-donor DNA mixtures based on the loci with four called alleles. Taken together, these data show that the MHSeqTyper47 kit is effective for forensically challenging DNA analysis.
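One plausible way to predict a mixing ratio from loci with four called alleles is sketched below. It is an illustrative assumption, not the kit's documented method: at a four-allele locus in a two-donor mixture, each donor is taken to be heterozygous with no shared alleles, the two lowest-depth alleles are attributed to the minor contributor, and the per-locus estimates are averaged. All names and read depths are hypothetical.

```python
from statistics import mean

def minor_fraction_at_locus(allele_depths):
    """Estimate the minor contributor's fraction at one four-allele locus
    as (sum of the two lowest read depths) / (total read depth)."""
    if len(allele_depths) != 4:
        raise ValueError("exactly four allele depths are required")
    ordered = sorted(allele_depths)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total read depth must be positive")
    return (ordered[0] + ordered[1]) / total

def estimate_minor_fraction(loci_depths):
    """Average the per-locus estimates across all four-allele loci."""
    return mean(minor_fraction_at_locus(depths) for depths in loci_depths)

# Hypothetical read depths at two four-allele loci.
demo_loci = [[900, 100, 100, 900], [800, 200, 200, 800]]
demo_fraction = estimate_minor_fraction(demo_loci)
```

With the placeholder depths above, the per-locus estimates are 0.10 and 0.20, so the averaged estimate is about 0.15, which corresponds to a mixing ratio of roughly 15:85.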
Subjects
DNA Fingerprinting , High-Throughput Nucleotide Sequencing , Humans , DNA Fingerprinting/methods , High-Throughput Nucleotide Sequencing/methods , Humic Substances/analysis , Indigo Carmine , Microsatellite Repeats , DNA/genetics , DNA/analysis , DNA Sequence Analysis , Single Nucleotide Polymorphism , Heme , Tannins
ABSTRACT
Hair shafts are among the most common types of biological evidence found at crime scenes. However, because nuclear DNA in the hair shaft undergoes biogenic degradation, individual identification through routine DNA analysis is difficult. In contrast, the proteins in the hair shaft are stable and contain genetic polymorphisms in the form of single amino acid polymorphisms (SAPs), translated from non-synonymous single nucleotide polymorphisms (nsSNPs) in the genome. However, the number of SAPs detected still cannot meet the requirements of practical applications. This paper developed a deep-coverage proteome analysis method that combines a three-step sequential ionic liquid-based protein extraction with high- and low-pH 2D-RPLC-MS/MS to identify both variant and reference SAPs from 2-cm-long hair shafts. We identified 632 ± 243 protein groups from 10 individuals, with the average number of SAPs reaching 167 ± 21 per person. These were further used to calculate random match probabilities (RMPs), a widely accepted forensic statistic for human identification. The RMPs ranged from 6.53 × 10⁻⁴ to 3.10 × 10⁻¹⁴ (median = 2.62 × 10⁻⁸) when calculated with the frequencies of matching nsSNP genotype data from exomes, and from 2.62 × 10⁻³ to 2.07 × 10⁻¹⁰ (median = 4.88 × 10⁻⁶) when calculated with SAP genotype frequencies. These results indicate that the deep-coverage proteomics method improves SAP-based forensic individual identification from hair shafts and has great potential in crime investigation.
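A random match probability of this kind is commonly computed as the product of the population frequencies of the observed genotypes. The sketch below assumes the loci are statistically independent, which is a simplification the abstract does not confirm, and it uses hypothetical frequencies rather than data from the study.

```python
import math

def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies to obtain an RMP.
    Independence between loci is assumed in this sketch."""
    for freq in genotype_frequencies:
        if not 0.0 < freq <= 1.0:
            raise ValueError("each genotype frequency must be in (0, 1]")
    return math.prod(genotype_frequencies)

# Hypothetical genotype frequencies at three independent loci.
demo_rmp = random_match_probability([0.5, 0.1, 0.2])
```

Because every additional informative locus multiplies the RMP by a value below 1, detecting more SAPs per person directly lowers the RMP, which is why deeper proteome coverage matters here.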
Subjects
Proteome , Tandem Mass Spectrometry , Genotype , Hair/chemistry , Humans , Proteome/genetics , Proteomics
ABSTRACT
Microhaplotypes are forensic genetic markers that combine single nucleotide polymorphisms in close proximity to one another. Highly discriminative microhaplotype markers could be superior to short tandem repeats (STRs) in DNA mixture deconvolution because they are not affected by stutter. In this study, the effective number of alleles (Ae) and discrimination power values of microhaplotypes and STRs were compared. Current microhaplotypes were found to be less discriminative than commonly used forensic STRs. Highly discriminative microhaplotype markers were therefore screened for East Asian populations. To satisfy different forensic application needs, four sets of microhaplotypes with Ae values ≥ 4 were screened under different conditions, including marker length and physical distance between markers. The four sets contained 703, 301, 337, and 190 microhaplotypes, with average Ae values of 5.38, 6.30, 7.39, and 5.61, respectively. The microhaplotype group containing 301 markers (maximum length of 200 bp and separated by ≥ 5 million bases) was investigated further. None of the 301 loci were exactly the same as those previously reported, while seven loci partially overlapped with known markers. The Ae values of 45 loci were ≥ 8, and the Ae value of the mh17WL-008 locus reached a maximum of 93.57. Further analysis showed that the newly identified microhaplotype markers were also highly polymorphic in African, American, European, and South Asian populations.
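The effective number of alleles used as the screening criterion is conventionally defined as Ae = 1 / Σ pᵢ², where pᵢ is the population frequency of the i-th allele (haplotype). The following sketch computes it for a single locus; the function name and example frequencies are illustrative assumptions.

```python
def effective_number_of_alleles(allele_frequencies):
    """Return Ae = 1 / sum(p_i^2) for one locus.
    Frequencies are expected to sum to 1 (within a small tolerance)."""
    total = sum(allele_frequencies)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 / sum(p * p for p in allele_frequencies)

# Four equally frequent haplotypes give Ae = 4; unequal frequencies give less.
demo_equal = effective_number_of_alleles([0.25, 0.25, 0.25, 0.25])
demo_skewed = effective_number_of_alleles([0.7, 0.1, 0.1, 0.1])
```

Ae equals the observed allele count only when all alleles are equally frequent, so a threshold such as Ae ≥ 4 selects loci with at least four alleles whose frequencies are reasonably balanced.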
Subjects
Forensic Genetics , High-Throughput Nucleotide Sequencing , DNA Fingerprinting/methods , Forensic Genetics/methods , Gene Frequency , Population Genetics , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Humans , Microsatellite Repeats , Single Nucleotide Polymorphism
ABSTRACT
Sequence polymorphisms of Y-chromosome short tandem repeat (Y-STR) markers can be revealed using next-generation sequencing (NGS). Compared with capillary electrophoresis, NGS has the advantage of distinguishing between some alleles of the same length. Here, a 68-plex in-house panel covering 67 Y-STR loci and the sex determinant Amelogenin locus was developed. Genotypes obtained with this panel were 100% concordant with those of three standard reference samples. The sensitivity was as low as 250 pg. A total of 466 length-based alleles, 806 sequence-based alleles, and 149 haplotypes were observed across 149 Han Chinese individuals. The total haplotype diversity and discrimination capacity were both 1.0000 in the detected samples. The DYS710 locus showed the highest sequence-based diversity among these Y-STRs, with 109 sequence-based alleles observed. Micro-variant alleles of the same length were observed at 39 Y-STR loci, with their sequence variations mainly attributable to repeat pattern variations. While the number of sequence-based alleles identified for DYS447, DYS449, DYS710, DYS720 and DYF387S1a/b was approximately three times that of their length-based alleles, flanking sequence variations were observed in 18 alleles. In addition, 201 sequence-based alleles at 42 loci were newly discovered, which significantly expands knowledge of human Y-STR sequence polymorphisms. Collectively, the 68-plex panel provided reliable Y-STR results as well as higher resolution for paternal lineage analysis.