1.
J Hum Genet ; 67(8): 487-493, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35347230

ABSTRACT

Massively parallel sequencing (MPS) of whole genomes allows far more Y-SNP loci to be genotyped simultaneously than was previously possible. Although this greatly increases the resolution with which Y-SNP haplogroups can be linked to common ancestors, it remains a challenge to produce a phylogenetic tree that clearly displays the relationships among haplogroups. Y-SNP Haplogroup Hierarchy Finder is a web tool that generates hierarchical haplogroups from Y-SNP data, placing the derived allele at the terminal position of a haplogroup tree. The input can include whole-genome sequencing data. Confidence in assignment was demonstrated using Y-SNP genotypes of 1233 samples from the 1000 Genomes Project phase 3, which were used to generate the expected haplogroups. The output consists of 2 reports: a 'Haplogroup Report', which lists the mutation types of the submitted Y-SNPs and their corresponding haplogroups, and a 'Haplogroup Hierarchy Report', which lists all possible hierarchical haplogroups and ranks the three most supported. Both reports show each layer of descent from one haplogroup to the next, together with the number of supporting Y-SNPs. All haplogroups that show a clear relationship from ancestral through to derived SNPs can be clustered into a hierarchy of haplogroups. The 1233 assigned haplogroups were compared with the output of 2 other software programs designed to assemble haplogroups: one differed in many assignments and the other showed only minor differences. The advantage of this web-based tool is that it provides an easy way to assign Y-SNP haplogroups based on a visualized hierarchical pattern.
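
As an illustration only, the idea of building a hierarchy from derived Y-SNP calls and ranking the most supported haplogroups can be sketched as follows. The toy tree, the marker names, and the scoring rule (number of derived SNPs along the path from the root) are assumptions made for this sketch; they are not taken from Y-SNP Haplogroup Hierarchy Finder.

```python
# Hypothetical toy tree (child -> parent); not a real Y-chromosome phylogeny.
PARENT = {"A": None, "B": "A", "C": "B", "D": "B", "E": "C"}

# Hypothetical defining SNPs for each haplogroup.
DEFINING_SNPS = {
    "A": {"snp1", "snp2"},
    "B": {"snp3"},
    "C": {"snp4", "snp5"},
    "D": {"snp6"},
    "E": {"snp7"},
}


def path_to_root(haplogroup):
    """Return the haplogroups from the root down to `haplogroup`."""
    path = []
    while haplogroup is not None:
        path.append(haplogroup)
        haplogroup = PARENT[haplogroup]
    return path[::-1]


def rank_haplogroups(derived_snps, top=3):
    """Rank candidate haplogroups by derived-SNP support along their path."""
    ranked = []
    for hg in PARENT:
        # Skip haplogroups with no derived SNP of their own.
        if not DEFINING_SNPS[hg] & derived_snps:
            continue
        path = path_to_root(hg)
        # Per-layer support: how many derived SNPs back each step of the path.
        support = [(h, len(DEFINING_SNPS[h] & derived_snps)) for h in path]
        total = sum(n for _, n in support)
        ranked.append((total, len(path), hg, support))
    # More supporting SNPs first; deeper paths break ties, then name order.
    ranked.sort(key=lambda r: (-r[0], -r[1], r[2]))
    return [(hg, total, support) for total, _, hg, support in ranked[:top]]


sample = {"snp1", "snp3", "snp4", "snp6"}
for hg, total, support in rank_haplogroups(sample):
    print(hg, total, support)
```

In this toy sample, derived calls on two sibling branches (C and D) receive equal support, which is the kind of conflict a per-layer report makes visible.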


Subject(s)
Chromosomes, Human, Y , Polymorphism, Single Nucleotide , Haplotypes/genetics , High-Throughput Nucleotide Sequencing , Humans , Phylogeny
2.
Int J Legal Med ; 135(4): 1191-1199, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33586030

ABSTRACT

Population and geographic assignment are frequently undertaken using DNA sequences from the mitochondrial genome. Assignment to broad continental populations is common, although finer resolution to subpopulations can be less accurate because of shared genetic ancestry at a local level and because members of different ancestral subpopulations cohabit the same geographic area. This study reports on the accuracy of population and subpopulation assignment using sequence data from 3070 mitochondrial genomes and the K-nearest neighbors (KNN) algorithm. These data served as training samples for continental and population assignment and comprised 1105 Europeans (including Austria, France, Germany, Spain, and England and Caucasian countries), 374 Africans (including North and East Africa and a non-specific area (Pan-Africa)), and 1591 Asians (including Japan, Philippines, and Taiwan). The subpopulation data were 1153 mitochondrial DNA (mtDNA) control region sequences from 12 subpopulations in Taiwan (Han, Hakka, Ami, Atayal, Bunun, Paiwan, Puyuma, Rukai, Saisiyat, Tsou, Tao, and Pingpu). Additionally, control region sequence data from a further 50 samples, obtained from the Sigma Company, were included after amplification and sequencing. These 50 samples acted as the "testing samples" to verify the accuracy of population assignment. Using genetic distance as the metric, we applied the KNN algorithm and the K-weighted-nearest neighbors (KWNN) algorithm, weighted by genetic distance, to classify individuals into continental populations and into subpopulations within the same continent. The accuracy of ethnic inference at the level of continental populations and of subpopulations was obtained for both the KNN and KWNN algorithms. The training sample set achieved an overall accuracy of 99 to 82% for assignment to continental populations with K values from 1 to 101. Assignment to subpopulations with K values from 1 to 5 reached an accuracy of 77 to 54%. Four of the 12 Taiwanese populations returned an assignment accuracy of over 60%: Ami (66%), Atayal (67%), Saisiyat (66%), and Tao (80%). For the testing sample set, prediction of continental population with the recommended K values of 5, 10, and 35, chosen from the results of the training sample set, achieved an overall accuracy of 100 to 94%. This study provides an accurate method of population assignment not only for continental populations but also for subpopulations, which can be useful in forensic and anthropological studies.
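
The difference between plain KNN and distance-weighted KNN can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hamming distance on aligned sequences, the inverse-distance weight, the toy sequences, and the labels `pop1`/`pop2` are all hypothetical choices made here.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest(query, training, k):
    """Return the k training items closest to `query` as (distance, label)."""
    scored = sorted((hamming(query, seq), label) for seq, label in training)
    return scored[:k]


def knn(query, training, k):
    """Plain KNN: majority vote among the k nearest neighbours."""
    votes = Counter(label for _, label in nearest(query, training, k))
    return votes.most_common(1)[0][0]


def kwnn(query, training, k, eps=1e-9):
    """Weighted KNN: each neighbour votes with weight 1 / (distance + eps).

    The inverse-distance weight is an assumption made for this sketch.
    """
    weights = defaultdict(float)
    for dist, label in nearest(query, training, k):
        weights[label] += 1.0 / (dist + eps)
    return max(weights, key=weights.get)


# Hypothetical aligned toy sequences with population labels.
training = [
    ("ACGTACGT", "pop1"),
    ("ACGTACGA", "pop1"),
    ("TCGAACCT", "pop2"),
    ("TCGAACCA", "pop2"),
    ("TCGTACCA", "pop2"),
]
query = "ACGTACGT"
print("KNN :", knn(query, training, k=5))
print("KWNN:", kwnn(query, training, k=5))
```

With k = 5 the plain vote follows the three more distant `pop2` neighbours, while the weighted vote follows the two much closer `pop1` neighbours, which shows why the choice of K and of weighting affects accuracy.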


Subject(s)
Algorithms , DNA, Mitochondrial/genetics , Genetics, Population/methods , Locus Control Region , Phylogeny , Racial Groups/genetics , Humans , Indigenous Peoples/genetics , Taiwan/ethnology
3.
Forensic Sci Int Genet ; 30: 127-133, 2017 09.
Article in English | MEDLINE | ID: mdl-28728055

ABSTRACT

Accurate sequencing of the control region of the mitochondrial genome is notoriously difficult because of the presence of polycytosine stretches, termed C-tracts. The precise number of bases that constitute a C-tract, and the bases beyond it, may not be accurately defined when analyzing Sanger sequencing data separated by capillary electrophoresis. Massively parallel sequencing has the potential to resolve such poor definition and provides the opportunity to discover variants due to length heteroplasmy. In this study, the control region of the mitochondrial genome from 20 samples was sequenced using both standard Sanger methods with separation by capillary electrophoresis and massively parallel DNA sequencing technology. When the two sets of sequences were compared, all were concordant except in the C-tracts, where length heteroplasmy was observed. The sequences of three segments containing C-tracts, at positions 16184-16193, 303-315 and 568-573 in HVI, HVII and HVIII, can be clearly defined from the massively parallel sequencing data using the program SEQ Mapper. Multiple sequence variants were observed in C-tracts longer than 7 bases. Our report illustrates that all the length variants leading to heteroplasmy in the control region of the mitochondrial genome can be accurately designated by SEQ Mapper from data generated by massively parallel DNA sequencing.
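
Length heteroplasmy of this kind shows up as a mixture of C-tract lengths across the reads of one sample. The sketch below tallies those lengths between two flanking sequences; it is a hypothetical illustration and does not reproduce SEQ Mapper. The flank sequences and reads are invented, and the flanks are assumed not to begin or end with C at the tract boundary.

```python
import re
from collections import Counter


def c_tract_lengths(reads, left_flank, right_flank, min_len=1):
    """Tally the length of the uninterrupted C run between two flanks."""
    pattern = re.compile(
        re.escape(left_flank) + r"(C{%d,})" % min_len + re.escape(right_flank)
    )
    counts = Counter()
    for read in reads:
        match = pattern.search(read)
        if match:  # reads that do not span the whole tract are ignored
            counts[len(match.group(1))] += 1
    return counts


# Hypothetical reads built from invented flanks and C runs of known length.
LEFT, RIGHT = "GAT", "TGA"
reads = [
    "G" + LEFT + "C" * 7 + RIGHT,
    "G" + LEFT + "C" * 8 + RIGHT,
    "G" + LEFT + "C" * 7 + RIGHT,
    "TTTT",  # unrelated read, no match
]

lengths = c_tract_lengths(reads, LEFT, RIGHT)
for length, n in sorted(lengths.items()):
    print(f"C{length}: {n} reads")
```

Requiring both flanks in the same read is a deliberate choice here: a read that ends inside the tract cannot report its full length.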


Subject(s)
DNA, Mitochondrial/genetics , High-Throughput Nucleotide Sequencing , Locus Control Region , Sequence Analysis, DNA/methods , Humans
4.
Forensic Sci Int Genet ; 26: 66-69, 2017 01.
Article in English | MEDLINE | ID: mdl-27792894

ABSTRACT

The development of massively parallel sequencing (MPS) has greatly increased the scale of DNA sequencing. Analyzing the massive data files from a single MPS run can be a major challenge when examining the data for potential polymorphic loci. To aid the analysis of both short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs), we have designed a new program called SEQ Mapper to search for genetic polymorphisms within a large number of reads generated by MPS. The program performs sequence mapping between reference data and generated reads. As a proof of concept, sequences derived from the allelic ladders of five STR loci and data from the amelogenin locus were used as reference data sets. The polymorphic nature of each STR locus was detected and recorded using four levels of search criteria: the entire STR locus spanning the two primers; the STR region plus the two primer sequences; the STR region only; and the two primers only. All the genotypes of the 5 STR loci and the amelogenin gene were identified correctly by SEQ Mapper when compared with results obtained from capillary electrophoresis for the 10 test samples in this study. SEQ Mapper is a useful tool for detecting STR or SNP alleles generated by MPS in both clinical medicine and forensic genetics.
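
One way to picture a search that anchors on the two primers and then inspects the STR region is sketched below. It is a hypothetical illustration, not SEQ Mapper's algorithm: the primer sequences, the repeat motif, and the rule of taking the longest uninterrupted run of the motif are assumptions, and the reverse primer is given as it would appear on the read strand.

```python
import re
from collections import Counter


def call_str_alleles(reads, fwd_primer, rev_primer, motif):
    """Count repeat units of `motif` between two primer sequences per read."""
    locus = re.compile(re.escape(fwd_primer) + r"(.*?)" + re.escape(rev_primer))
    repeat = re.compile(r"(?:%s)+" % re.escape(motif))
    alleles = Counter()
    for read in reads:
        hit = locus.search(read)
        if not hit:
            continue  # read does not span both primers
        runs = [m.group(0) for m in repeat.finditer(hit.group(1))]
        if runs:
            # Allele = number of motif copies in the longest uninterrupted run.
            alleles[len(max(runs, key=len)) // len(motif)] += 1
    return alleles


# Hypothetical primers and motif; reads are built with known repeat counts.
FWD, REV, MOTIF = "ACGTAC", "GGTTCA", "GATA"
reads = (
    [FWD + MOTIF * 9 + REV] * 3
    + [FWD + MOTIF * 11 + REV] * 2
    + [FWD + MOTIF * 9]  # truncated read without the reverse primer
)

alleles = call_str_alleles(reads, FWD, REV, MOTIF)
for allele, n in sorted(alleles.items()):
    print(f"allele {allele}: {n} reads")
```

Read counts per allele, rather than a single call, are returned so that a heterozygous genotype appears as two well-supported alleles.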


Subject(s)
Alleles , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Polymorphism, Single Nucleotide , Software , Amelogenin/genetics , Genotype , Humans , Multiplex Polymerase Chain Reaction
5.
Investig Genet ; 6: 10, 2015.
Article in English | MEDLINE | ID: mdl-26246889

ABSTRACT

BACKGROUND: Whole-genome sequencing is performed routinely as a means of identifying polymorphic genetic loci such as short tandem repeat (STR) loci. We have developed a simple, freely available tool called pSTR Finder for identifying putative polymorphic STR loci from genome-wide sequence data. The program cross-compares the STR sequences that Tandem Repeats Finder generates from multiple genome samples in FASTA format. These comparisons produce reports listing the identical, polymorphic, and different STR loci between two samples. METHODS: The web site http://forensic.mc.ntu.edu.tw:9000/PSTRWeb/Default was developed as a means of identifying polymorphic STR loci within complex mass genome sequences. The program was designed to generate a series of user-friendly reports. RESULTS: As proof of concept, four FASTA genome sequences of human chromosome X (AC_000155.1, CM000685.1, NC_018934.2, and CM000274.1) were obtained from GenBank and analyzed for the presence of putative STR regions. With AC_000155.1 as the initial reference sequence, 5443 identical and 4305 polymorphic STR loci were identified using a repeat unit of 1-6 bp and 10 bp of flanking sequence on either side of each putative STR locus. A reliability test compared five FASTA samples from which sections of DNA sequence had been removed to mimic partial or fragmented DNA, in order to determine whether pSTR Finder can efficiently and consistently find identical, polymorphic, and different STR loci. CONCLUSIONS: From the mass of DNA sequence data, the program was found to reproducibly identify polymorphic STR loci and to generate user-friendly reports detailing the number and location of these potential polymorphic loci. This freely available program was found to be a useful tool for finding polymorphic STRs within whole-genome sequence data in forensic genetic studies.
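
The three report categories can be pictured with a small comparison of two samples. The sketch below is an assumption-laden illustration rather than pSTR Finder's logic: it assumes each locus is keyed by its 10 bp left flank, repeat motif, and 10 bp right flank, that "identical" means the same repeat count, "polymorphic" a different count at the same locus, and "different" a locus found in only one sample. All sequences and counts are invented.

```python
def compare_str_loci(reference, sample):
    """Classify loci as identical, polymorphic, or different between samples.

    Each input maps (left_flank, motif, right_flank) -> repeat count.
    """
    report = {"identical": [], "polymorphic": [], "different": []}
    for locus in sorted(set(reference) | set(sample)):
        if locus in reference and locus in sample:
            same = reference[locus] == sample[locus]
            report["identical" if same else "polymorphic"].append(locus)
        else:
            # Locus (flanks + motif) found in only one of the two samples.
            report["different"].append(locus)
    return report


# Hypothetical loci with 10 bp flanks on either side of the repeat.
reference = {
    ("AAGGCCTTAA", "CA", "TTGGAACCTT"): 12,
    ("CCTTGGAACC", "GATA", "GGAATTCCGG"): 9,
    ("TTAACCGGTT", "AAT", "CCGGTTAACC"): 7,
}
sample = {
    ("AAGGCCTTAA", "CA", "TTGGAACCTT"): 12,   # same count
    ("CCTTGGAACC", "GATA", "GGAATTCCGG"): 11,  # different count
    ("GGCCAATTGG", "TTC", "AATTGGCCAA"): 5,    # only in this sample
}

report = compare_str_loci(reference, sample)
for category, loci in report.items():
    print(category, len(loci))
```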
