ABSTRACT
Y-chromosomal haplogroups assigned from male-specific Y-chromosomal single nucleotide polymorphisms (Y-SNPs) allow paternal lineage identification and paternal bio-geographic ancestry inference, both of which are relevant in forensic genetics. However, most previously developed forensic Y-SNP tools did not provide Y haplogroup resolution at the high level needed in forensic applications, because the limited multiplex capacity of the DNA technologies used only allowed the inclusion of a relatively small number of Y-SNPs. In a proof-of-principle study, we recently demonstrated that high-resolution Y haplogrouping is feasible via two AmpliSeq PCR analyses and simultaneous massively parallel sequencing (MPS) of 530 Y-SNPs, allowing the inference of 432 Y haplogroups. In the current study, we present a substantially improved Y-SNP MPS lab tool that we specifically designed for the analysis of the low-quality and low-quantity DNA often encountered in forensic DNA analysis. Improvements include i) Y-SNP marker selection based on the "minimal reference phylogeny for the human Y chromosome" (PhyloTree Y), ii) a strong increase in the number of targeted Y-SNPs, allowing many more Y haplogroups to be inferred, iii) a focus on short amplicon length, enabling successful analysis of degraded DNA, and iv) combination of all amplicons in a single AmpliSeq PCR with simultaneous sequencing, allowing use of a single DNA aliquot. This new MPS tool simultaneously analyses 859 Y-SNPs and allows 640 Y haplogroups to be inferred. Preliminary forensic developmental validation testing revealed that this tool is highly accurate, sensitive and robust. We also provide a revised software tool for analysing the sequencing data produced by the new MPS lab tool, including final Y haplogroup assignment.
We envision that the tools introduced here for high-resolution Y-chromosomal haplogrouping, which determine a man's paternal lineage and/or paternal bio-geographic ancestry, will become widely used in forensic Y-chromosome DNA analysis and in other applications where Y haplogroup information from low-quality or low-quantity DNA samples is required.
Subject(s)
Chromosomes, Human, Y; Haplotypes; High-Throughput Nucleotide Sequencing; Polymorphism, Single Nucleotide; Sequence Analysis, DNA; DNA/analysis; DNA Degradation, Necrotic; Forensic Genetics/methods; Humans; Male; Polymerase Chain Reaction; Reproducibility of Results
ABSTRACT
Pakistan harbors 16 major ethnic groups, including Punjabis (56% of the total population) and Kashmiris (6% of the total population). Here, we report data for 17 Y-chromosomal short tandem repeats (Y-STRs) genotyped with the AmpFlSTR Y-filer™ PCR Amplification kit in 94 Punjabis and 101 Kashmiris. The estimated haplotype diversity was higher in Punjabis (0.996) than in Kashmiris (0.983). Furthermore, we performed population genetic analyses that included data from six other Pakistani groups. The presented haplotype data were recently included in the Y-Chromosome Haplotype Reference Database (YHRD) for future forensic and other use.
Subject(s)
Chromosomes, Human, Y; Ethnicity/genetics; Genetics, Population; Microsatellite Repeats; DNA Fingerprinting; Haplotypes; Humans; Male; Pakistan; Polymerase Chain Reaction
ABSTRACT
Human mitochondrial DNA haplogroup U is among the initial maternal founders in Southwest Asia and Europe and one that best indicates matrilineal genetic continuity between late Pleistocene hunter-gatherer groups and present-day populations of Europe. While most haplogroup U subclades are older than 30 thousand years, the comparatively recent coalescence time of the extant variation of haplogroup U7 (~16-19 thousand years ago) suggests that its current distribution is the consequence of more recent dispersal events, despite its wide geographical range across Europe, the Near East and South Asia. Here we report 267 new U7 mitogenomes that - analysed alongside 100 published ones - enable us to discern at least two distinct temporal phases of dispersal, both of which most likely emanated from the Near East. The earlier one began prior to the Holocene (~11.5 thousand years ago) towards South Asia, while the later dispersal took place more recently towards Mediterranean Europe during the Neolithic (~8 thousand years ago). These findings imply that the carriers of haplogroup U7 spread to South Asia and Europe before the suggested Bronze Age expansion of Indo-European languages from the Pontic-Caspian Steppe region.
Subject(s)
DNA, Mitochondrial/genetics; Evolution, Molecular; Haplotypes/genetics; Bayes Theorem; Geography; Humans; Mutation/genetics; Phylogeny
ABSTRACT
Aboriginal Australians represent one of the oldest continuous cultures outside Africa, with evidence indicating that their ancestors arrived in the ancient landmass of Sahul (present-day New Guinea and Australia) ~55 thousand years ago. Genetic studies, though limited, have demonstrated both the uniqueness and antiquity of Aboriginal Australian genomes. We have further resolved known Aboriginal Australian mitochondrial haplogroups and discovered novel indigenous lineages by sequencing the mitogenomes of 127 contemporary Aboriginal Australians. In particular, the more common haplogroups observed in our dataset included M42a, M42c, S, P5 and P12, followed by rarer haplogroups M15, M16, N13, O, P3, P6 and P8. We propose some major phylogenetic rearrangements, such as in haplogroup P where we delinked P4a and P4b and redefined them as P4 (New Guinean) and P11 (Australian), respectively. Haplogroup P2b was identified as a novel clade potentially restricted to Torres Strait Islanders. Nearly all Aboriginal Australian mitochondrial haplogroups detected appear to be ancient, with no evidence of later introgression during the Holocene. Our findings greatly increase knowledge about the geographic distribution and phylogenetic structure of mitochondrial lineages that have survived in contemporary descendants of Australia's first settlers.
Subject(s)
Genetic Variation; Genome, Mitochondrial; Native Hawaiian or Other Pacific Islander/genetics; Phylogeny; Australia; Humans; Sequence Analysis, DNA
ABSTRACT
Aboriginal Australians are one of the more poorly studied populations from the standpoint of human evolution and genetic diversity. Thus, to investigate their genetic diversity, the possible date of their ancestors' arrival and their relationships with neighboring populations, we analyzed mitochondrial DNA (mtDNA) diversity in a large sample of Aboriginal Australians. Selected mtDNA single-nucleotide polymorphisms and the hypervariable segment haplotypes were analyzed in 594 Aboriginal Australians drawn from locations across the continent, chiefly from regions not previously sampled. Most (~78%) samples could be assigned to mtDNA haplogroups indigenous to Australia. The indigenous haplogroups were all ancient (with estimated ages >40 000 years) and geographically widespread across the continent. The most common haplogroup was P (44%) followed by S (23%) and M42a (9%). There was some geographic structure at the haplotype level. The estimated ages of the indigenous haplogroups range from 39 000 to 55 000 years, dates that fit well with the estimated date of colonization of Australia based on archeological evidence (~47 000 years ago). The distribution of mtDNA haplogroups in Australia and New Guinea supports the hypothesis that the ancestors of Aboriginal Australians entered Sahul through at least two entry points. The mtDNA data give no support to the hypothesis of secondary gene flow into Australia during the Holocene, but instead suggest long-term isolation of the continent.
Subject(s)
DNA, Mitochondrial/genetics; Genetic Variation; Native Hawaiian or Other Pacific Islander/genetics; Phylogeny; Biological Evolution; DNA, Mitochondrial/history; Female; Gene Flow; Haplotypes; History, 21st Century; History, Ancient; Humans; Male; Native Hawaiian or Other Pacific Islander/history; Oceania; Paleontology; Phylogeography; Polymorphism, Single Nucleotide; Reproductive Isolation
ABSTRACT
Nusa Tenggara, including East Timor, is located at the crossroads between Island Southeast Asia, Near Oceania, and Australia, and is characterized by a complex cultural structure harbouring speakers of two major linguistic groups with different geographic origins (Austronesian (AN) and non-Austronesian (NAN)). This makes the region well suited to studying gene-language relationships; however, previous studies from other parts of Nusa Tenggara reported conflicting evidence about gene-language correlation in this region. Aiming to investigate gene-language relationships, including sex-mediated aspects, in East Timor, we analysed the paternally inherited non-recombining part of the Y chromosome (NRY) and the maternally inherited mitochondrial (mt) DNA in a representative collection of AN- and NAN-speaking groups. Y-SNP (single-nucleotide polymorphism) data were newly generated for 273 samples and combined with previously established Y-STR (short tandem repeat) data for the same samples, and with previously established mtDNA data for 290 different samples that nevertheless provide very similar geographic and linguistic coverage of the country. We found NRY and mtDNA haplogroups of previously described putative East/Southeast Asian (E/SEA) and Near Oceanian (NO) origins in both AN and NAN speakers of East Timor, albeit in different proportions, suggesting reciprocal genetic admixture between both linguistic groups for females, but directional admixture for males. Our data underline the dual genetic origin of East Timorese in E/SEA and NO, and highlight that substantial genetic admixture between the two major linguistic groups has occurred, more so via women than men. Our study therefore provides another example where languages and genes do not conform because of sex-biased genetic admixture across major linguistic groups.
Subject(s)
Genotype; Language; Population/genetics; Chromosomes, Human, Y/genetics; DNA, Mitochondrial/genetics; Female; Humans; Indonesia; Male; Microsatellite Repeats; Polymorphism, Single Nucleotide
ABSTRACT
MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources.
Subject(s)
Computational Biology/methods; Databases, Genetic; Mitochondrial Diseases/genetics; Genetic Variation; Genome, Mitochondrial; Genomics; Humans; Information Dissemination; User-Computer Interface; Web Browser
ABSTRACT
Although previous studies have documented a bottleneck in the transmission of mtDNA genomes from mothers to offspring, several aspects remain unclear, including the size and nature of the bottleneck. Here, we analyze the dynamics of mtDNA heteroplasmy transmission in the Genomes of the Netherlands (GoNL) data, which consists of complete mtDNA genome sequences from 228 trios, eight dizygotic (DZ) twin quartets, and 10 monozygotic (MZ) twin quartets. Using a minor allele frequency (MAF) threshold of 2%, we identified 189 heteroplasmies in the trio mothers, of which 59% were transmitted to offspring, and 159 heteroplasmies in the trio offspring, of which 70% were inherited from the mothers. MZ twin pairs exhibited greater similarity in MAF at heteroplasmic sites than DZ twin pairs, suggesting that the heteroplasmy MAF in the oocyte is the major determinant of the heteroplasmy MAF in the offspring. We used a likelihood method to estimate the effective number of mtDNA genomes transmitted to offspring under different bottleneck models; a variable bottleneck size model provided the best fit to the data, with an estimated mean of nine individual mtDNA genomes transmitted. We also found evidence for negative selection during transmission against novel heteroplasmies (in which the minor allele has never been observed in polymorphism data). These novel heteroplasmies are enhanced for tRNA and rRNA genes, and mutations associated with mtDNA diseases frequently occur in these genes. Our results thus suggest that the female germ line is able to recognize and select against deleterious heteroplasmies.
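To make the bottleneck idea concrete, the following is a minimal sketch of the simplest possible model: a fixed number of mtDNA genomes is drawn at random (binomial sampling) from a mother with a given minor allele frequency. This is an illustration only. The abstract reports that a variable-size bottleneck fits best, and this sketch ignores selection and post-bottleneck drift; the function names are hypothetical.

```python
from math import comb

def offspring_distribution(p, n_genomes):
    """Binomial probabilities that k of n_genomes transmitted genomes
    carry the minor allele, for k = 0..n_genomes (fixed-size bottleneck)."""
    return [
        comb(n_genomes, k) * p**k * (1 - p) ** (n_genomes - k)
        for k in range(n_genomes + 1)
    ]

def prob_transmitted_above(p, n_genomes, threshold):
    """Probability that the sampled minor allele frequency k/n_genomes
    is at or above a detection threshold (e.g. 0.02 for a 2% MAF cutoff)."""
    dist = offspring_distribution(p, n_genomes)
    return sum(pr for k, pr in enumerate(dist) if k / n_genomes >= threshold)

# Example call: maternal MAF of 10%, nine genomes transmitted, 2% threshold.
chance = prob_transmitted_above(0.10, 9, 0.02)
```

With a small number of transmitted genomes, a low-frequency heteroplasmy is often lost entirely, which is one reason a narrow bottleneck is compatible with only a fraction of maternal heteroplasmies being observed in offspring.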
Subject(s)
DNA, Mitochondrial; Family; Genetic Heterogeneity; Inheritance Patterns; White People/genetics; Alleles; Female; Gene Frequency; Humans; Male; Models, Genetic; Models, Statistical; Mutation; Netherlands; Polymorphism, Genetic; Selection, Genetic; Twins
ABSTRACT
The use of mitochondrial DNA (mtDNA) for maternal lineage identification often marks the last resort when investigating forensic and missing-person cases involving highly degraded biological materials. As with all comparative DNA testing, a match between evidence and reference sample requires a statistical interpretation, for which high-quality mtDNA population frequency data are crucial. Here, we determined, under high quality standards, the complete mtDNA control-region sequences of 680 individuals from across the Netherlands sampled at 54 sites, covering the entire country with 10 geographic sub-regions. The complete mtDNA control region (nucleotide positions 16,024-16,569 and 1-576) was amplified with two PCR primers and sequenced with ten different sequencing primers using the EMPOP protocol. Haplotype diversity of the entire sample set was very high at 99.63% and, accordingly, the random-match probability was 0.37%. No population substructure within the Netherlands was detected with our dataset. Phylogenetic analyses were performed to determine mtDNA haplogroups. Inclusion of these high-quality data in the EMPOP database (accession number: EMP00666) will improve its overall data content and geographic coverage in the interest of all EMPOP users worldwide. Moreover, this dataset will serve as (the start of) a national reference database for mtDNA applications in forensic and missing person casework in the Netherlands.
Subject(s)
DNA, Mitochondrial/genetics; Databases, Nucleic Acid; Forensic Genetics/methods; Mitochondria/genetics; DNA Primers; DNA, Mitochondrial/blood; Databases, Genetic; Genetics, Population/methods; Haplotypes; Humans; Male; Netherlands; Polymerase Chain Reaction/methods; Reference Standards; Sequence Analysis, DNA/standards
ABSTRACT
OBJECTIVE: Understanding the origins of Aboriginal Australians is crucial in reconstructing the evolution and spread of Homo sapiens as evidence suggests they represent the descendants of the earliest group to leave Africa. This study analyzed a large sample of Y-chromosomes to answer questions relating to the migration routes of their ancestors, the age of Y-haplogroups, date of colonization, as well as the extent of male-specific variation. METHODS: Knowledge of Y-chromosome variation among Aboriginal Australians is extremely limited. This study examined Y-SNP and Y-STR variation among 657 self-declared Aboriginal males from locations across the continent. 17 Y-STR loci and 47 Y-SNPs spanning the Y-chromosome phylogeny were typed in total. RESULTS: The proportion of non-indigenous Y-chromosomes of assumed Eurasian origin was high, at 56%. Y lineages of indigenous Sahul origin belonged to haplogroups C-M130*(xM8,M38,M217,M347) (1%), C-M347 (19%), K-M526*(xM147,P308,P79,P261,P256,M231,M175,M45,P202) (12%), S-P308 (12%), and M-M186 (0.9%). Haplogroups C-M347, K-M526*, and S-P308 are Aboriginal Australian-specific. Dating of C-M347, K-M526*, and S-P308 indicates that all are at least 40,000 years old, confirming their long-term presence in Australia. Haplogroup C-M347 comprised at least three sub-haplogroups: C-DYS390.1del, C-M210, and the unresolved paragroup C-M347*(xDYS390.1del,M210). CONCLUSIONS: There was some geographic structure to the Y-haplogroup variation, but most haplogroups were present throughout Australia. The age of the Australian-specific Y-haplogroups suggests New Guineans and Aboriginal Australians have been isolated for over 30,000 years, supporting findings based on mitochondrial DNA data. Our data support the hypothesis of more than one route (via New Guinea) for males entering Sahul some 50,000 years ago and give no support for colonization events during the Holocene, from either India or elsewhere.
Subject(s)
Chromosomes, Human, Y/genetics; Native Hawaiian or Other Pacific Islander/genetics; Anthropology, Physical; Australia; Genetic Variation; Haplotypes; Humans; Male; Polymorphism, Single Nucleotide/genetics
ABSTRACT
Genetic signatures from the Paleolithic inhabitants of Eurasia can be traced from the early divergent mitochondrial DNA lineages still present in contemporary human populations. Previous studies already suggested a pre-Neolithic diffusion of mitochondrial haplogroup HV*(xH,V) lineages, a relatively rare class of mtDNA types that includes parallel branches mainly distributed across Europe and West Asia with a certain degree of structure. Until now, variation within haplogroup HV has been addressed mainly by analyzing sequence data from the mtDNA control region, except for specific sub-branches, such as HV4 or the widely distributed haplogroups H and V. In this study, we present a revised HV topology based on full mtDNA genome data, and we include a comprehensive dataset consisting of 316 complete mtDNA sequences, including 60 new samples from the Italian peninsula, a previously underrepresented geographic area. We highlight points of instability in the particular topology of this haplogroup, reconstructed with BEAST-generated trees and networks. We also confirm a major lineage expansion that probably followed the Late Glacial Maximum and preceded Neolithic population movements. Finally, we observe that Italy harbors a reservoir of mtDNA diversity, with deep-rooting HV lineages often related to sequences present in the Caucasus and the Middle East. The resulting hypothesis of a glacial refugium in Southern Italy has implications for the understanding of late Paleolithic population movements and is discussed in the context of the archaeological cultural shifts that occurred across the entire continent.
Subject(s)
DNA, Mitochondrial/genetics; Ethnicity/genetics; Genetics, Population; Mitochondria/genetics; White People/genetics; Cell Lineage/genetics; Europe; Genetic Variation/genetics; Geography; Haplotypes; Humans; Molecular Sequence Data; Phylogeography
ABSTRACT
Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail because of mtDNA fragmentation in the available degraded materials. We introduce an MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg of genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis of degraded and nondegraded materials, suitable for resolving and inferring maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications.
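The arithmetic behind a tiling design can be sketched as follows. This assumes evenly spaced amplicons of identical length on a circular 16,569 bp genome, which is a simplification: the abstract gives only the amplicon count (161) and average length (200 bp), and a real primer design would not be uniform. Function names are hypothetical.

```python
def tile_starts(genome_len, n_amplicons):
    """0-based start positions of evenly spaced amplicons on a circular genome."""
    return [(i * genome_len) // n_amplicons for i in range(n_amplicons)]

def tile_circular(genome_len, n_amplicons, amplicon_len):
    """1-based inclusive (start, end) pairs; an end smaller than its start
    means the amplicon wraps around the origin of the circular genome."""
    tiles = []
    for start in tile_starts(genome_len, n_amplicons):
        end = (start + amplicon_len - 1) % genome_len
        tiles.append((start + 1, end + 1))
    return tiles

def min_overlap(genome_len, n_amplicons, amplicon_len):
    """Smallest overlap (in bp) between neighbouring amplicons."""
    starts = tile_starts(genome_len, n_amplicons)
    steps = [
        (starts[(i + 1) % n_amplicons] - starts[i]) % genome_len
        for i in range(n_amplicons)
    ]
    return amplicon_len - max(steps)

tiles = tile_circular(16569, 161, 200)
overlap = min_overlap(16569, 161, 200)
```

Under these idealized assumptions neighbouring amplicons overlap by roughly half their length, so most positions are covered by two amplicons, which leaves some redundancy if a single amplicon fails on fragmented template.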
Subject(s)
DNA, Mitochondrial; Genome, Mitochondrial; High-Throughput Nucleotide Sequencing/instrumentation; High-Throughput Nucleotide Sequencing/methods; DNA Barcoding, Taxonomic/instrumentation; DNA Barcoding, Taxonomic/methods; DNA Barcoding, Taxonomic/standards; Genomics/instrumentation; Genomics/methods; Genomics/standards; Haplotypes; High-Throughput Nucleotide Sequencing/standards; Humans; Reproducibility of Results; Sensitivity and Specificity
ABSTRACT
A considerable body of evidence supports the role of mitochondrial dysfunction in psychiatric disorders, and mitochondrial DNA (mtDNA) mutations are known to alter brain energy metabolism and neurotransmission and to cause neurodegenerative disorders. Genetic studies focusing on common nuclear genome variants associated with these disorders have produced genome-wide significant results, but those studies have not directly examined mtDNA variants. The purpose of this study is to investigate, using next generation sequencing, the involvement of mtDNA variation in bipolar disorder, schizophrenia, major depressive disorder, and methamphetamine use. MtDNA extracted from multiple brain regions and blood was sequenced (121 mtDNA samples with an average of 8,800x coverage) and compared to an electronic database containing 26,850 mtDNA genomes. We confirmed novel and rare variants, and confirmed next generation sequencing error hotspots, by traditional sequencing and genotyping methods. We observed a significant increase in non-synonymous mutations in individuals with schizophrenia. Novel and rare non-synonymous mutations were found in psychiatric cases in the mtDNA genes ND6, ATP6, CYTB, and ND2. We also observed mtDNA heteroplasmy in brain at a locus previously associated with schizophrenia (T16519C). Large differences in heteroplasmy levels across brain regions within subjects suggest that somatic mutations accumulate differentially in brain regions. Finally, multiplasmy, a heteroplasmic measure of repeat length, was observed in brain from selected cases at a higher frequency than in controls. These results offer support for the increased rates of mtDNA substitutions in schizophrenia shown in our prior results. The variable levels of heteroplasmic/multiplasmic somatic mutations that occur in brain may be indicators of genetic instability in mtDNA.
Subject(s)
DNA, Mitochondrial/genetics; Mental Disorders/genetics; Mutation/genetics; Adult; Case-Control Studies; DNA Mutational Analysis; Electrophoresis, Agar Gel; Female; Genetic Loci; Humans; Male; Mental Disorders/blood; Middle Aged; Molecular Sequence Data; Prefrontal Cortex/pathology
ABSTRACT
SNPs from the non-recombining part of the human Y chromosome (Y-SNPs) are informative to classify paternal lineages in forensic, genealogical, anthropological, and evolutionary studies. Although thousands of Y-SNPs were identified thus far, previous Y-SNP multiplex tools target only dozens of markers simultaneously, thereby restricting the provided Y-haplogroup resolution and limiting their applications. Here, we overcome this shortcoming by introducing a high-resolution multiplex tool for parallel genotyping-by-sequencing of 530 Y-SNPs using the Ion Torrent PGM platform, which allows classification of 432 worldwide Y haplogroups. Contrary to previous Y-SNP multiplex tools, our approach covers branches of the entire Y tree, thereby maximizing the paternal lineage classification obtainable. We used a default DNA input amount of 10 ng per reaction but preliminary sensitivity testing revealed positive results from as little as 100 pg input DNA. Furthermore, we demonstrate that sample pooling using barcodes is feasible, allowing increased throughput for lower per-sample costs. In addition to the wetlab protocol, we provide a software tool for automated data quality control and haplogroup classification. The unique combination of ultra-high marker density and high sensitivity achievable from low amounts of potentially degraded DNA makes this new multiplex tool suitable for a wide range of Y-chromosome applications.
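The general logic of classifying a sample into a haplogroup from Y-SNP genotypes can be sketched as a walk down a phylogenetic tree, descending only while a defining marker is in the derived state. The tree, haplogroup labels, and SNP names below are invented for illustration, and this is not a description of the software tool mentioned in the abstract.

```python
# Hypothetical toy tree: haplogroup -> (parent haplogroup, defining SNP).
# All labels and SNP names are invented placeholders.
TREE = {
    "A": (None, "snp1"),
    "B": ("A", "snp2"),
    "C": ("A", "snp3"),
    "C1": ("C", "snp4"),
}

def assign_haplogroup(calls, tree=TREE):
    """Return the deepest haplogroup supported by derived calls.

    calls maps SNP name -> "derived" or "ancestral"; SNPs absent from
    calls are treated as no-calls. The walk stops when no child, or more
    than one child, of the current node is derived (an ambiguous result).
    Returns None if not even a root-level marker is derived.
    """
    children = {}
    for hg, (parent, _snp) in tree.items():
        children.setdefault(parent, []).append(hg)

    current = None
    while True:
        derived = [
            hg for hg in children.get(current, [])
            if calls.get(tree[hg][1]) == "derived"
        ]
        if len(derived) != 1:
            return current
        current = derived[0]

sample = {"snp1": "derived", "snp2": "ancestral", "snp3": "derived", "snp4": "ancestral"}
result = assign_haplogroup(sample)
```

The more markers that sit on terminal branches, the deeper this walk can go, which is why the number of targeted Y-SNPs bounds the haplogroup resolution a multiplex tool can provide.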
Subject(s)
Chromosomes, Human, Y/genetics; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Haplotypes; Humans; Male; Sensitivity and Specificity; Sequence Analysis, DNA/economics; Sequence Analysis, DNA/instrumentation; Software
ABSTRACT
Currently, several different Y-chromosomal phylogenies and haplogroup nomenclatures are presented in scientific literature and at conferences demonstrating the present diversity in Y-chromosomal phylogenetic trees and Y-SNP sets used within forensic and anthropological research. This situation can be ascribed to the exponential growth of the number of Y-SNPs discovered due to mostly next-generation sequencing (NGS) studies. As Y-SNPs and their respective phylogenetic positions are important in forensics, such as for male lineage characterization and paternal bio-geographic ancestry inference, there is a need for forensic geneticists to know how to deal with these newly identified Y-SNPs and phylogenies, especially since these phylogenies are often created with other aims than to carry out forensic genetic research. Therefore, we give here an overview of four categories of currently used Y-chromosomal phylogenies and the associated Y-SNP sets in scientific research in the current NGS era. We compare these categories based on the construction method, their advantages and disadvantages, the disciplines wherein the phylogenetic tree can be used, and their specific relevance for forensic geneticists. Based on this overview, it is clear that an up-to-date reduced tree with a consensus Y-SNP set and a stable nomenclature will be the most appropriate reference resource for forensic research. Initiatives to reach such an international consensus are therefore highly recommended.
Subject(s)
Chromosomes, Human, Y; Forensic Genetics; Phylogeny; Polymorphism, Single Nucleotide; Sequence Analysis/methods; Humans
ABSTRACT
Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The "Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium" is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through the use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. 
MSeqDR is integrated with a diverse array of mtDNA data analysis tools that are both freestanding and incorporated into an online exome-level dataset curation and analysis resource (GEM.app) that is being optimized to support the needs of the MSeqDR community. In addition, MSeqDR supports mitochondrial disease phenotyping and ontology tools, and provides variant pathogenicity assessment features that enable community review, feedback, and integration with the public ClinVar variant annotation resource. A centralized Web-based informed consent process is being developed, with implementation of a Global Unique Identifier (GUID) system to integrate data deposited on a given individual from different sources. Community-based data deposition into MSeqDR has already begun. Future efforts will enhance capabilities to incorporate phenotypic data that enhance genomic data analyses. MSeqDR will fill the existing void in bioinformatics tools and centralized knowledge that are necessary to enable efficient nuclear and mtDNA genomic data interpretation by a range of stakeholders across both clinical diagnostic and research settings. Ultimately, MSeqDR is focused on empowering the global mitochondrial disease community to better define and explore mitochondrial diseases.
Subject(s)
Databases, Genetic; Genome, Mitochondrial; User-Computer Interface; Computational Biology; Exome; Female; Genomics; Humans; Information Dissemination; Internet; Male; Mitochondrial Diseases/genetics; Phenotype; Software
ABSTRACT
To estimate genetic and forensic parameters, the entire mitochondrial DNA control region of 100 unrelated Makrani individuals (males, n=96; females, n=4) living in Pakistan (Turbat, Panjgur, Awaran, Kharan, Nasirabad, Gwadar, Buleda, Karachi and Burewala) was sequenced. We observed a total of 70 different haplotypes of which 54 were unique and 16 were shared by more than one individual. The Makrani population showed a high genetic diversity (0.9688) and, consequently, a high power of discrimination (0.9592). Our results revealed a strongly admixed mtDNA pool composed of African haplogroups (28%), West Eurasian haplogroups (26%), South Asian haplogroups (24%), and East Asian haplogroups (2%), while the origin of the remaining individuals (20%) could not be confidently assigned. The results of this study are a valuable contribution to build a database of mtDNA variation in Pakistan.
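The two summary statistics quoted above can be computed from a list of observed haplotypes as sketched below. The sketch assumes the usual definitions, which the abstract does not spell out: power of discrimination as one minus the sum of squared haplotype frequencies, and genetic diversity as Nei's estimator, which multiplies that quantity by n/(n-1). The reported values are consistent with this assumption up to rounding, since 0.9592 × 100/99 ≈ 0.969. The toy haplotype labels are invented.

```python
from collections import Counter

def diversity_stats(haplotypes):
    """Return (power_of_discrimination, haplotype_diversity).

    power_of_discrimination = 1 - sum(p_i^2), where sum(p_i^2) is the
    random match probability; haplotype_diversity applies the n/(n-1)
    small-sample correction (Nei's estimator).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    match_prob = sum((c / n) ** 2 for c in counts.values())
    power = 1.0 - match_prob
    diversity = power * n / (n - 1)
    return power, diversity

# Invented toy data: six samples carrying four distinct haplotypes.
toy = ["H1", "H1", "H2", "H3", "H3", "H4"]
power, diversity = diversity_stats(toy)
```

The correction factor matters most for small samples; for large datasets the two statistics converge.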
Subject(s)
Asian People/genetics; DNA, Mitochondrial/genetics; Female; Genetic Variation; Haplotypes; Humans; Male; Pakistan
ABSTRACT
MOTIVATION: All current mitochondrial haplogroup classification tools require variants to be detected from an alignment with the reference sequence and to be properly named according to the canonical nomenclature standards for describing mitochondrial variants before they can be compared with the haplogroup-determining polymorphisms. With the emergence of high-throughput sequencing technologies and hence greater availability of mitochondrial genome sequences, there is a strong need for an automated haplogroup classification tool that is alignment-free and agnostic to the reference sequence. RESULTS: We have developed a novel mitochondrial genome haplogroup-defining algorithm, Phy-Mer, that uses a k-mer approach. Phy-Mer performs as well as the leading haplogroup classifier, HaploGrep, while avoiding the errors that may occur when converting variants to the required formats and notations. We have further expanded Phy-Mer functionality such that next-generation sequencing data can be used directly as input. AVAILABILITY AND IMPLEMENTATION: Phy-Mer is publicly available under the GNU Affero General Public License v3.0 on GitHub (https://github.com/danielnavarrogomez/phy-mer). CONTACT: Xiaowu_Gai@meei.harvard.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
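The general idea of alignment-free, k-mer based classification can be sketched as follows. This is a toy illustration only and not Phy-Mer's implementation: the abstract does not describe its scoring scheme or k-mer library, so the Jaccard similarity used here, the function names `kmers` and `classify`, the choice of k, and the short sequences are all assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(query, references, k=4):
    """Pick the reference whose k-mer set is most similar to the query.

    Similarity is the Jaccard index of the two k-mer sets (an assumption
    for this sketch). No alignment to a reference sequence is needed.
    """
    q = kmers(query, k)
    best_label, best_score = None, -1.0
    for label, ref in references.items():
        r = kmers(ref, k)
        union = q | r
        score = len(q & r) / len(union) if union else 0.0
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical toy "haplogroup" sequences; real inputs would be full mtDNA genomes.
references = {
    "hapA": "ACGTACGTTAGC",
    "hapB": "ACGTACCTTAGC",
}
label, score = classify("ACGTACGTTAGC", references, k=4)
print(label, score)
```

Because the comparison works on substrings rather than named variants, no variant calling or nomenclature conversion step is involved, which is the property the abstract emphasizes.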
Subject(s)
Algorithms; DNA, Mitochondrial/genetics; Genetic Variation/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans; Software
ABSTRACT
Mitochondrial DNA (mtDNA) can be used for matrilineal biogeographic ancestry prediction and can thus provide investigative leads towards identifying unknown suspects when conventional autosomal short tandem repeat (STR) profiling fails to provide a match. Recently, five multiplex genotyping assays targeting 62 ancestry-informative mitochondrial single nucleotide polymorphisms (mt-SNPs) were developed. This hierarchical system of assays allows detection of the major haplogroups present in Africa, America, Western Eurasia, Eastern Eurasia, Australia and Oceania, thus revealing the broad geographic region of matrilineal origin of a DNA donor. Here, we provide a forensic developmental validation study of these five multiplex assays, which together target all 62 ancestry-informative mt-SNPs, following the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. We demonstrate that the assays are highly sensitive, producing full profiles from as little as 1 pg of input DNA. The assays were shown to be highly robust and efficient in providing information from degraded samples and from simulated casework samples of different substrates such as blood, semen, hair, saliva and trace DNA samples. Reproducible results were achieved in concordance testing across three independent laboratories, demonstrating the ease of use and reliability of these assays. Overall, our results demonstrate the suitability of these five mt-SNP assays for application to forensic casework and other purposes aiming to establish an individual's matrilineal genetic ancestry. With this validated tool, it is now possible to determine the matrilineal biogeographic origin of unknown individuals at continental resolution from forensic DNA samples, providing investigative leads in criminal and missing person cases where autosomal STR profiling is uninformative.
Subject(s)
DNA, Mitochondrial/genetics; Genealogy and Heraldry; Genotype; Geography; Base Sequence; DNA Primers; Humans; Phylogeny; Reproducibility of Results
ABSTRACT
The island region at the southeastern-most tip of New Guinea and its inhabitants known as Massim are well known for a unique traditional inter-island trading system, called Kula or Kula Ring. To characterize the Massim genetically, and to evaluate the influence of the Kula Ring on patterns of human genetic variation, we analyzed paternally inherited Y-chromosome (NRY) and maternally inherited mitochondrial (mt) DNA polymorphisms in >400 individuals from this region. We found that the nearly exclusively Austronesian-speaking Massim people harbor genetic ancestry components of both Asian (AS) and Near Oceanian (NO) origin, with a proportionally larger NO NRY component versus a larger AS mtDNA component. This is similar to previous observations in other Austronesian-speaking populations from Near and Remote Oceania and suggests sex-biased genetic admixture between Asians and Near Oceanians before the occupation of Remote Oceania, in line with the Slow Boat from Asia hypothesis on the expansion of Austronesians into the Pacific. Contrary to linguistic expectations, Rossel Islanders, the only Papuan speakers of the Massim, showed a lower amount of NO genetic ancestry than their Austronesian-speaking Massim neighbors. For the islands traditionally involved in the Kula Ring, a significant correlation between inter-island travelling distances and genetic distances was observed for mtDNA, but not for NRY, suggesting more male- than female-mediated gene flow. As traditionally only males take part in the Kula voyages, this finding may indicate a genetic signature of the Kula Ring, serving as another example of how cultural tradition has shaped human genetic diversity.
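One common way to test for a correlation between two distance matrices, such as inter-island travelling distances and genetic distances, is a Mantel permutation test. The abstract does not state which test the authors used, so the sketch below is an assumption about the general approach, not a reproduction of their analysis. The function names and the four-island toy matrices are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper(matrix, order):
    """Upper-triangle entries of matrix after relabelling rows/columns by order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=0):
    """Return (r_observed, one-sided p-value) for symmetric distance matrices a, b."""
    identity = list(range(len(a)))
    xa = upper(a, identity)
    r_obs = pearson(xa, upper(b, identity))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute the labels of b, keeping a fixed
        if pearson(xa, upper(b, perm)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical toy data for four islands (NOT the Massim data).
geo = [[0, 1, 2, 4], [1, 0, 3, 5], [2, 3, 0, 6], [4, 5, 6, 0]]
gen = [[0.01 * d for d in row] for row in geo]  # genetic distance proportional to travel distance
r, p = mantel(geo, gen)
print(f"r = {r:.3f}, p = {p:.3f}")
```

In an analysis like the one described, the test would be run separately for the mtDNA and NRY genetic distance matrices against the same travelling-distance matrix.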