Results 1 - 20 of 565
2.
Nucleic Acids Res ; 52(5): 2212-2230, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38364871

ABSTRACT

Nonreference sequences (NRSs) are DNA sequences present in global populations but absent in the current human reference genome. However, the extent and functional significance of NRSs in human genomes and populations remain unclear. Here, we de novo assembled 539 genomes from five genetically divergent human populations using long-read sequencing technology, resulting in the identification of 5.1 million NRSs. These were merged into 45284 unique NRSs, with 29.7% being novel discoveries. Among these NRSs, 38.7% were common across the five populations, and 35.6% were population specific. The use of a graph-based pangenome approach allowed for the detection of 565 transcript expression quantitative trait loci on NRSs, with 426 of these being novel findings. Moreover, 26 NRS candidates displayed evidence of adaptive selection within human populations. Genes situated in close proximity to or intersecting with these candidates may be associated with metabolism and type 2 diabetes. Genome-wide association studies revealed 14 NRSs to be significantly associated with eight phenotypes. Additionally, 154 NRSs were found to be in strong linkage disequilibrium with 258 phenotype-associated SNPs in the GWAS catalogue. Our work expands the understanding of human NRSs and provides novel insights into their functions, facilitating evolutionary and biomedical research.


Subject(s)
Human Genome , Genome-Wide Association Study , Population Groups , Humans , Type 2 Diabetes Mellitus/genetics , Linkage Disequilibrium , Phenotype , Single Nucleotide Polymorphism , Population Genetics , Population Groups/genetics
3.
Commun Biol ; 6(1): 964, 2023 09 22.
Article in English | MEDLINE | ID: mdl-37736834

ABSTRACT

Risk prediction models using genetic data have seen increasing traction in genomics. However, most of the polygenic risk models were developed using data from participants with similar (mostly European) ancestry. This can lead to biases in the risk predictors, resulting in poor generalization when applied to minority populations and admixed individuals such as African Americans. To address this issue, which arises largely because the prediction models are biased by the underlying population structure, we propose a deep-learning framework that leverages data from diverse populations and disentangles ancestry from the phenotype-relevant information in its representation. The ancestry-disentangled representation can be used to build risk predictors that perform better across minority populations. We applied the proposed method to the analysis of Alzheimer's disease genetics. Compared with standard linear and nonlinear risk prediction methods, the proposed method substantially improves risk prediction in minority populations, including admixed individuals, without needing self-reported ancestry information.


Subject(s)
Alzheimer Disease , Genetic Predisposition to Disease , Risk Assessment , Humans , Alzheimer Disease/genetics , Black or African American/genetics , Genomics , Multifactorial Inheritance , Phenotype , Genetic Predisposition to Disease/ethnology , Genetic Predisposition to Disease/genetics , Risk Assessment/ethnology , Deep Learning , Risk , European People/genetics , Minority Groups , Population Groups/ethnology , Population Groups/genetics , Statistical Models
4.
Genome Med ; 15(1): 52, 2023 07 17.
Article in English | MEDLINE | ID: mdl-37461045

ABSTRACT

BACKGROUND: Metabolic pathways are related to physiological functions and disease states and are influenced by genetic variation and environmental factors. Hispanic/Latino individuals have ancestry-derived genomic regions (local ancestry) from their recent admixture that have been less characterized for associations with metabolite abundance and disease risk. METHODS: We performed admixture mapping of 640 circulating metabolites in 3887 Hispanic/Latino individuals from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Metabolites were quantified in fasting serum through non-targeted mass spectrometry (MS) analysis using ultra-performance liquid chromatography-MS/MS. Replication was performed in 1856 nonoverlapping HCHS/SOL participants with metabolomic data. RESULTS: By leveraging local ancestry, this study identified significant ancestry-enriched associations for 78 circulating metabolites at 484 independent regions, including 116 novel metabolite-genomic region associations that replicated in an independent sample. Among the main findings, we identified Native American-enriched genomic regions at chromosomes 11 and 15, mapping to FADS1/FADS2 and LIPC, respectively, associated with reduced long-chain polyunsaturated fatty acid metabolites implicated in metabolic and inflammatory pathways. An African-derived genomic region at chromosome 2 was associated with N-acetylated amino acid metabolites. This region, mapped to ALMS1, is associated with chronic kidney disease, a disease that disproportionately burdens individuals of African descent. CONCLUSIONS: Our findings provide important insights into differences in metabolite quantities related to ancestry in admixed populations, including metabolites related to regulation of lipid polyunsaturated fatty acids and N-acetylated amino acids, which may have implications for common diseases in populations.


Subject(s)
Genome-Wide Association Study , Hispanic or Latino , Tandem Mass Spectrometry , Humans , Black People/genetics , Human Genome , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Single Nucleotide Polymorphism , American Indian or Alaska Native/genetics , Metabolism/genetics , Population Groups/ethnology , Population Groups/genetics
5.
Genetics ; 224(1)2023 05 04.
Article in English | MEDLINE | ID: mdl-36843304

ABSTRACT

Common genetic association models for structured populations, including principal component analysis (PCA) and linear mixed-effects models (LMMs), model the correlation structure between individuals using population kinship matrices, also known as genetic relatedness matrices. However, the most common kinship estimators can have severe biases that were only recently determined. Here we characterize the effect of these kinship biases on genetic association. We employ a large simulated admixed family and genotypes from the 1000 Genomes Project, both with simulated traits, to evaluate key kinship estimators. Remarkably, we find practically invariant association statistics for kinship matrices of different bias types (matching all other features). We then prove using statistical theory and linear algebra that LMM association tests are invariant to these kinship biases, and PCA approximately so. Our proof shows that the intercept and relatedness effect coefficients compensate for the kinship bias, an argument that extends to generalized linear models. As a corollary, association testing is also invariant to changing the reference ancestral population of the kinship matrix. Lastly, we observed that all kinship estimators, except for popkin ratio-of-means, can give improper non-positive semidefinite matrices, which can be problematic although some LMMs handle them surprisingly well, and condition numbers can be used to choose kinship estimators. Overall, we find that existing association studies are robust to kinship estimation bias, and our calculations may help improve association methods by taking advantage of this unexpected robustness, as well as help determine the effects of kinship bias in related problems.


Subject(s)
Genetic Models , Population Groups , Humans , Population Groups/genetics , Genotype , Linear Models , Phenotype , Bias
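
The invariance claim in the abstract above can be checked numerically in a simplified setting. The sketch below is an illustration only, not the authors' proof or code: it assumes the kinship bias amounts to adding a constant to every entry of the covariance matrix, holds the variance components fixed, and uses a toy random kinship matrix. Under those assumptions, a generalized least squares (GLS) fit that includes an intercept gives the same slope estimate and slope variance with and without the shift. All names (`gls_slope`, `V_shift`) are hypothetical.

```python
import numpy as np

def gls_slope(y, x, V):
    """GLS fit of y ~ intercept + x under covariance V.

    Returns the slope estimate and its variance.
    """
    n = len(y)
    X = np.column_stack([np.ones(n), x])      # design matrix with intercept
    Vinv_X = np.linalg.solve(V, X)            # V^{-1} X
    Vinv_y = np.linalg.solve(V, y)            # V^{-1} y
    cov_beta = np.linalg.inv(X.T @ Vinv_X)    # (X' V^{-1} X)^{-1}
    beta = cov_beta @ (X.T @ Vinv_y)
    return beta[1], cov_beta[1, 1]

rng = np.random.default_rng(0)
n = 30
A = rng.normal(size=(n, 50))
K = A @ A.T / 50                              # toy positive semidefinite "kinship"
x = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2
y = rng.normal(size=n)                        # simulated trait

V = 0.5 * K + 0.5 * np.eye(n)                 # fixed variance components (assumption)
V_shift = V + 0.2 * np.ones((n, n))           # uniform additive shift (assumption)

slope, slope_var = gls_slope(y, x, V)
slope_shift, slope_var_shift = gls_slope(y, x, V_shift)
print(np.isclose(slope, slope_shift), np.isclose(slope_var, slope_var_shift))
```

The shift lies in the column space of the intercept, which is why it does not affect the slope; only the variance of the intercept changes.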
6.
Forensic Sci Int Genet ; 62: 102806, 2023 01.
Article in English | MEDLINE | ID: mdl-36399972

ABSTRACT

As evidenced by the large number of articles recently published in the literature, forensic scientists are making great efforts to infer externally visible features and biogeographical ancestry (BGA) from DNA analysis. Like phenotypic information, ancestry information obtained from DNA can provide investigative leads to identify victims (missing/unidentified persons, crime/armed conflict/mass disaster victims) or trace perpetrators when no matches are found with the reference profile or in the database. Recently, the advent of Massively Parallel Sequencing technologies, together with the possibility of harnessing high-throughput genetic data, has allowed us to investigate the associations between phenotypic and genomic variations in worldwide human populations and to develop new BGA forensic tools capable of simultaneously analyzing up to millions of markers if, for example, the ancient DNA approach of hybridization capture is adopted to target SNPs of interest. In the present study, more than 3000 SNPs were selected to create a new BGA panel, and the accuracy of the new panel in inferring ancestry from unknown samples was evaluated by the PLS-DA method. Subsequently, the panel was assessed using three variable selection techniques (Backward variable elimination, Genetic Algorithm and Regularized elimination procedure), and the best SNPs for inferring biogeographical ancestry at the inter- and intra-continental level were selected to obtain panels that predict BGA with a reduced number of markers, for use in routine forensic cases where PCR amplification is the best choice to target SNPs.


Subject(s)
Forensic Genetics , High-Throughput Nucleotide Sequencing , Population Groups , Humans , DNA/genetics , Forensic Genetics/methods , High-Throughput Nucleotide Sequencing/methods , Least-Squares Analysis , Phylogeography , Polymerase Chain Reaction , Single Nucleotide Polymorphism , Population Groups/genetics
7.
Methods Mol Biol ; 2547: 595-609, 2022.
Article in English | MEDLINE | ID: mdl-36068478

ABSTRACT

Genetic ancestry inference can be used to stratify patient cohorts and to model pharmacogenomic variation within and between populations. We provide a detailed guide to genetic ancestry inference using genome-wide genetic variant datasets, with an emphasis on two widely used techniques: principal components analysis (PCA) and ADMIXTURE analysis. PCA can be used for patient stratification and categorical ancestry inference, whereas ADMIXTURE is used to characterize genetic ancestry as a continuous variable. Visualization methods are critical for the interpretation of genetic ancestry inference methods, and we provide instructions for how the results of PCA and ADMIXTURE can be effectively visualized.


Subject(s)
Genetic Techniques , Pharmacogenetics , Population Genetics , Humans , Single Nucleotide Polymorphism , Population Groups/genetics , Principal Component Analysis
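
As a rough illustration of the PCA step mentioned in the abstract above, the sketch below runs PCA on a simulated genotype matrix. It is not the chapter's protocol: the data are simulated, the two populations are given deliberately extreme allele frequencies (0.2 and 0.8), and the function name `genotype_pca` and the standardization by sqrt(2p(1-p)) are assumptions chosen for the example. ADMIXTURE analysis is not shown.

```python
import numpy as np

def genotype_pca(G, n_components=2):
    """PCA scores for an individuals x SNPs matrix of 0/1/2 allele counts."""
    p = G.mean(axis=0) / 2.0                  # per-SNP allele frequency
    keep = (p > 0) & (p < 1)                  # drop monomorphic SNPs
    G, p = G[:, keep], p[keep]
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))  # center and scale each SNP
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

rng = np.random.default_rng(1)
pop_a = rng.binomial(2, 0.2, size=(20, 300))  # simulated population A
pop_b = rng.binomial(2, 0.8, size=(20, 300))  # simulated population B
scores = genotype_pca(np.vstack([pop_a, pop_b]).astype(float))

# With frequencies this different, PC1 separates the two groups.
pc1_a, pc1_b = scores[:20, 0], scores[20:, 0]
print(scores.shape, pc1_a.mean(), pc1_b.mean())
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by population label, gives the kind of visualization the abstract refers to.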
8.
PLoS Genet ; 18(7): e1010281, 2022 07.
Article in English | MEDLINE | ID: mdl-35839249

ABSTRACT

Estimating admixture histories is crucial for understanding the genetic diversity we see in present-day populations. Allele frequency or phylogeny-based methods are excellent for inferring the existence of admixture or its proportions. However, to estimate admixture times, spatial information from admixed chromosomes of local ancestry or the decay of admixture linkage disequilibrium (ALD) is used. One popular method, implemented in the programs ALDER and ROLLOFF, uses two-locus ALD to infer the time of a single admixture event, but is only able to estimate the time of the most recent admixture event based on this summary statistic. To address this limitation, we derive analytical expressions for the expected ALD in a three-locus system and provide a new statistical method based on these results that is able to resolve more complicated admixture histories. Using simulations, we evaluate the performance of this method on a range of different admixture histories. As an example, we apply the method to the Colombian and Mexican samples from the 1000 Genomes project. The implementation of our method is available at https://github.com/Genomics-HSE/LaNeta.


Subject(s)
Population Genetics , Population Groups , Colombia , Gene Frequency/genetics , Humans , Linkage Disequilibrium , Genetic Models , Population Groups/genetics
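
The two-locus idea described in the abstract above can be sketched as follows. Under a single-pulse admixture model, expected admixture LD between two loci decays roughly as A·exp(-t·d), where d is the genetic distance in Morgans and t is the number of generations since admixture. The code is a minimal sketch under that assumption, using noise-free simulated values; it is not the ALDER, ROLLOFF, or LaNeta implementation, it does not cover the three-locus method, and the function name `fit_admixture_time` is hypothetical.

```python
import math

def fit_admixture_time(distances_morgans, ald_values):
    """Fit log(ALD) = log(A) - t * d by ordinary least squares.

    Returns (t, A). All ALD values must be positive.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ald_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_x)

# Simulated decay curve: 12 generations since admixture, amplitude 0.1.
true_t = 12.0
distances = [0.005 * k for k in range(1, 41)]   # 0.5 cM to 20 cM, in Morgans
ald = [0.1 * math.exp(-true_t * d) for d in distances]

t_hat, amplitude_hat = fit_admixture_time(distances, ald)
print(t_hat, amplitude_hat)
```

Real data are noisy and can contain negative values, so a log-linear fit like this one would usually be replaced by a nonlinear least-squares fit on the untransformed curve.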
9.
Sci Rep ; 12(1): 655, 2022 01 13.
Article in English | MEDLINE | ID: mdl-35027632

ABSTRACT

Southern Thailand is home to various populations; the Moklen, Moken and Urak Lawoi' sea nomads and Maniq negrito are the minority, while the southern Thai groups (Buddhist and Muslim) are the majority. Although previous studies have generated forensic STR datasets for major groups, such data for the southern Thai minorities have not been included; here we generated a regional forensic database of southern Thailand. We newly genotyped 15 common autosomal STRs in 184 unrelated southern Thais, including all minorities and majorities. When combined with previously published data on major southern Thai groups, this provides a total of 334 southern Thai samples. The forensic parameter results show appropriate values for personal identification and paternity testing; the probability of excluding paternity is 0.99999622, and the combined discrimination power is 0.999999999999999. Probably driven by genetic drift and/or isolation with small census size, we found genetic distinction of the Maniq and sea nomads from the major groups, which were closer to the Malay and central Thais than the other Thai groups. The allelic frequency results can strengthen the regional forensic database in southern Thailand and also provide useful information from an anthropological perspective.


Subject(s)
Forensic Genetics , Population Genetics , Microsatellite Repeats/genetics , Population Groups/genetics , Alleles , Genetic Databases , Datasets as Topic , Female , Gene Frequency , Genetic Drift , Humans , Male , Thailand
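
The combined forensic parameters quoted in the abstract above are conventionally built from per-locus values. The sketch below shows the usual textbook formulas, assuming independent loci: per-locus power of discrimination is one minus the sum of squared genotype frequencies, and the combined value is one minus the product of the per-locus complements. The numbers are hypothetical and are not taken from the study; the function names are illustrative.

```python
from math import prod

def power_of_discrimination(genotype_freqs):
    """Per-locus PD: probability that two random individuals differ in genotype."""
    return 1.0 - sum(f * f for f in genotype_freqs)

def combined_power(per_locus_values):
    """Combine per-locus PD (or exclusion) values, assuming independent loci."""
    return 1.0 - prod(1.0 - v for v in per_locus_values)

# Hypothetical genotype frequencies at one locus.
pd_one_locus = power_of_discrimination([0.5, 0.3, 0.2])

# Hypothetical per-locus PD values for three loci.
cpd = combined_power([0.90, 0.95, 0.85])
print(pd_one_locus, cpd)
```

With 15 highly polymorphic STRs the product of complements becomes extremely small, which is why combined values such as 0.999999999999999 appear in reports of this kind.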
10.
Proc Natl Acad Sci U S A ; 119(4)2022 01 25.
Article in English | MEDLINE | ID: mdl-35042810

ABSTRACT

The field of genomics has benefited greatly from its "openness" approach to data sharing. However, with the increasing volume of sequence information being created and stored and the growing number of international genomics efforts, the equity of openness is under question. The United Nations Convention of Biodiversity aims to develop and adopt a standard policy on access and benefit-sharing for sequence information across signatory parties. This standardization will have profound implications on genomics research, requiring a new definition of open data sharing. The redefinition of openness is not unwarranted, as its limitations have unintentionally introduced barriers of engagement to some, including Indigenous Peoples. This commentary provides an insight into the key challenges of openness faced by the researchers who aspire to protect and conserve global biodiversity, including Indigenous flora and fauna, and presents immediate, practical solutions that, if implemented, will equip the genomics community with both the diversity and inclusivity required to respectfully protect global biodiversity.


Subject(s)
Indigenous Peoples/genetics , Information Dissemination/ethics , Biodiversity , Genomics/methods , Humans , Indigenous Peoples/psychology , Indigenous Peoples/statistics & numerical data , Information Dissemination/methods , Population Groups/genetics
11.
EBioMedicine ; 74: 103695, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34775353

ABSTRACT

BACKGROUND: The heterogeneity in symptomatology and phenotypic profile attributable to COVID-19 is largely unknown. The objective of this manuscript is to conduct a trans-ancestry genome-wide association study (GWAS) meta-analysis of COVID-19 severity to improve the understanding of potentially causal targets for SARS-CoV-2. METHODS: This cross-sectional study recruited 646 participants in the UAE, who were divided into two phenotypic groups based on the severity of COVID-19 phenotypes: hospitalized (n=482) and non-hospitalized (n=164) participants. Hospitalized participants were COVID-19 patients who developed acute respiratory distress syndrome (ARDS), pneumonia or progression to respiratory failure that required supplemental oxygen therapy or mechanical ventilation support, or who had severe complications such as septic shock or multi-organ failure. We conducted a trans-ancestry meta-analysis GWAS of European (n=302), American (n=102), South Asian (n=99), and East Asian (n=107) ancestry populations. We also carried out comprehensive post-GWAS analysis, including enrichment of SNP associations in tissues and cell types, expression quantitative trait loci and differential expression analysis. FINDINGS: Eight genes demonstrated a strong association signal: VWA8 gene in locus 13p14·11 (SNP rs10507497; p=9·54 × 10⁻⁷), PDE8B gene in locus 5q13·3 (SNP rs7715119; p=2·19 × 10⁻⁶), CTSC gene in locus 11q14·2 (rs72953026; p=2·38 × 10⁻⁶), THSD7B gene in locus 2q22·1 (rs7605851; p=3·07 × 10⁻⁶), STK39 gene in locus 2q24·3 (rs7595310; p=4·55 × 10⁻⁶), FBXO34 gene in locus 14q22·3 (rs10140801; p=8·26 × 10⁻⁶), RPL6P27 gene in locus 18p11·31 (rs11659676; p=8·88 × 10⁻⁶), and METTL21C gene in locus 13q33·1 (rs599976; p=8·95 × 10⁻⁶). The genes are expressed in the lung and are associated with tumour progression, emphysema, airway obstruction, and surface tension within the lung, as well as with T-cell-mediated inflammation and the production of inflammatory cytokines. INTERPRETATION: We have discovered eight highly plausible genetic associations with hospitalized cases in COVID-19. Further studies must be conducted on worldwide population genetics to facilitate the development of population-specific therapeutics to mitigate this worldwide challenge. FUNDING: This review was commissioned as part of a project to study the host cell receptors of coronaviruses funded by Khalifa University's CPRA grant (Reference number 2020-004).


Subject(s)
Genetic Predisposition to Disease/genetics , Quantitative Trait Loci/genetics , Heritable Quantitative Trait , Respiratory Distress Syndrome/genetics , Severity of Illness Index , Adolescent , Adult , Aged , COVID-19/mortality , COVID-19/pathology , Cross-Sectional Studies , Female , Genome-Wide Association Study , Hospitalization/statistics & numerical data , Humans , Inflammation/genetics , Lung/pathology , Male , Middle Aged , Single Nucleotide Polymorphism/genetics , Population Groups/genetics , Respiratory Distress Syndrome/pathology , SARS-CoV-2 , T-Lymphocytes/immunology , Treatment Outcome , United Arab Emirates , Young Adult
12.
Science ; 373(6562): 1442-1443, 2021 Sep 24.
Article in English | MEDLINE | ID: mdl-34554771
14.
Proc Natl Acad Sci U S A ; 118(13)2021 03 30.
Article in English | MEDLINE | ID: mdl-33753512

ABSTRACT

Island Southeast Asia has recently produced several surprises regarding human history, but the region's complex demography remains poorly understood. Here, we report ∼2.3 million genotypes from 1,028 individuals representing 115 indigenous Philippine populations and genome-sequence data from two ∼8,000-y-old individuals from Liangdao in the Taiwan Strait. We show that the Philippine islands were populated by at least five waves of human migration: initially by Northern and Southern Negritos (distantly related to Australian and Papuan groups), followed by Manobo, Sama, Papuan, and Cordilleran-related populations. The ancestors of Cordillerans diverged from indigenous peoples of Taiwan at least ∼8,000 y ago, prior to the arrival of paddy field rice agriculture in the Philippines ∼2,500 y ago, where some of their descendants remain the least admixed East Asian groups carrying an ancestry shared by all Austronesian-speaking populations. These observations contradict an exclusive "out-of-Taiwan" model of farming-language-people dispersal within the last four millennia for the Philippines and Island Southeast Asia. Sama-related ethnic groups of southwestern Philippines additionally experienced some minimal South Asian gene flow starting ∼1,000 y ago. Lastly, only a few lowlanders, accounting for <1% of all individuals, presented a low level of West Eurasian admixture, indicating a limited genetic legacy of Spanish colonization in the Philippines. Altogether, our findings reveal a multilayered history of the Philippines, which served as a crucial gateway for the movement of people that ultimately changed the genetic landscape of the Asia-Pacific region.


Subject(s)
Human Migration/history , Population Groups/history , Agriculture , Southeast Asia/ethnology , Australia/ethnology , Female , Genetic Drift , Genomics , Ancient History , Humans , Male , Oryza , Philippines , Population Groups/genetics , Taiwan/ethnology
15.
Sci Rep ; 11(1): 5249, 2021 03 04.
Article in English | MEDLINE | ID: mdl-33664303

ABSTRACT

Determining the number of contributors (NOC) accurately in a forensic DNA mixture profile can be challenging. To address this issue, various studies have examined the uncertainty in estimating the NOC in a DNA mixture profile. However, these studies focus primarily on dominant populations residing within Europe and North America, so Asian populations have limited representation. Further, the effects of allele dropout on NOC estimation have not been explored. As such, this study assesses the uncertainty of NOC in simulated DNA mixture profiles of Chinese, Malay, and Indian populations, which are the predominant ethnic populations in Asia. The Caucasian ethnic population was also included to provide a basis of comparison with other similar studies. Our results showed that, without considering allele dropout, the NOC from DNA mixture profiles derived from up to four contributors of the same ethnic population could be estimated with confidence in the Chinese, Malay, Indian and Caucasian populations. The same results can be observed for DNA mixture profiles originating from a combination of differing ethnic populations. The inclusion of an overall 30% allele dropout rate increased the probability (risk) of underestimating the NOC in a DNA mixture profile; even a 3-person DNA mixture profile has a > 99% risk of underestimating the NOC as two or fewer contributors. However, such risks could be mitigated when the highly polymorphic SE33 locus was included in the dataset. Lastly, there was a negligible level of risk in misinterpreting the NOC in a mixture profile as deriving from a single source profile. In summary, our studies showcased novel results representative of the Chinese, Malay, and Indian ethnic populations when examining the uncertainty in NOC estimation in a DNA mixture profile. Our results would be useful in the estimation of NOC in a DNA mixture profile in the Asian context.


Subject(s)
DNA/genetics , Ethnicity/genetics , Population Genetics/statistics & numerical data , Asia/epidemiology , China/epidemiology , DNA Fingerprinting/statistics & numerical data , Europe/epidemiology , Humans , India/epidemiology , Malawi/epidemiology , Microsatellite Repeats/genetics , Theoretical Models , North America/epidemiology , Population Groups/genetics
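
To make the underestimation risk in the abstract above concrete, the sketch below uses the maximum allele count heuristic, a common lower bound on the number of contributors: each person carries at most two alleles per locus, so a locus showing k distinct alleles implies at least ceil(k/2) contributors. The abstract does not say which estimator the study used, so this is an assumed stand-in. The locus names are real STR markers, but the allele calls are hypothetical.

```python
from math import ceil

def min_contributors(profile):
    """Lower bound on contributors from the locus with the most distinct alleles.

    profile maps locus name -> iterable of observed allele labels.
    """
    max_alleles = max(len(set(alleles)) for alleles in profile.values())
    return ceil(max_alleles / 2)

# Hypothetical mixture profile (allele labels invented for illustration).
mixture = {
    "D3S1358": ["15", "16", "17"],
    "SE33": ["14", "19", "22.2", "28.2", "30.2"],
}

with_se33 = min_contributors(mixture)
without_se33 = min_contributors({k: v for k, v in mixture.items() if k != "SE33"})
print(with_se33, without_se33)
```

In this toy profile the bound rises from two to three once SE33 is included, which mirrors the abstract's point that a highly polymorphic locus reduces the chance of underestimating the NOC. Allele dropout removes observed alleles and can only lower the bound.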
16.
PLoS Genet ; 17(3): e1009392, 2021 03.
Article in English | MEDLINE | ID: mdl-33661925

ABSTRACT

The natural history of tuberculosis (TB) is characterized by a large inter-individual outcome variability after exposure to Mycobacterium tuberculosis. Specifically, some highly exposed individuals remain resistant to M. tuberculosis infection, as inferred by tuberculin skin test (TST) or interferon-gamma release assays (IGRAs). We performed a genome-wide association study of resistance to M. tuberculosis infection in an endemic region of Southern Vietnam. We enrolled household contacts (HHC) of pulmonary TB cases and compared subjects who were negative for both TST and IGRA (n = 185) with infected individuals (n = 353) who were either positive for both TST and IGRA or had a diagnosis of TB. We found a genome-wide significant locus on chromosome 10q26.2 with a cluster of variants associated with strong protection against M. tuberculosis infection (OR = 0.42, 95%CI 0.35-0.49, P = 3.71 × 10⁻⁸, for the genotyped variant rs17155120). The locus was replicated in a French multi-ethnic HHC cohort and a familial admixed cohort from a hyper-endemic area of South Africa, with an overall OR for rs17155120 estimated at 0.50 (95%CI 0.45-0.55, P = 1.26 × 10⁻⁹). The variants are located in intronic regions and upstream of C10orf90, a tumor suppressor gene which encodes a ubiquitin ligase activating the transcription factor p53. In silico analysis showed that the protective alleles were associated with a decreased expression in monocytes of the nearby gene ADAM12, which could lead to an enhanced response of Th17 lymphocytes. Our results reveal a novel locus controlling resistance to M. tuberculosis infection across different populations.


Subject(s)
Human Chromosome Pair 10 , Disease Resistance/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Mycobacterium tuberculosis , Quantitative Trait Loci , Tuberculosis/genetics , Tuberculosis/microbiology , Alleles , Computational Biology/methods , France , Genotype , Humans , Meta-Analysis as Topic , Population Groups/genetics , South Africa , Vietnam
17.
Science ; 372(6537)2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.


Subject(s)
Genetic Variation , Human Genome , Haplotypes , Female , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Interspersed Repetitive Sequences , Male , Population Groups/genetics , Quantitative Trait Loci , Retroelements , DNA Sequence Analysis , Sequence Inversion , Whole Genome Sequencing
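
The contiguity metric in the abstract above ("minimum contig length needed to cover 50% of the genome") is commonly called N50. A minimal sketch of how it is computed from a list of contig lengths follows. The function name and the optional `genome_size` parameter are assumptions for illustration: with the default, the 50% threshold is taken relative to the total assembly length, and passing an estimated genome size gives the variant often called NG50. The example lengths are made up.

```python
def n50(contig_lengths, genome_size=None):
    """Length of the contig at which the running total, taken from the
    longest contig downward, first reaches half of the target size.

    The target is the total assembly length unless genome_size is given.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("contig_lengths must not be empty")
    target = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= target / 2:
            return length
    raise ValueError("contigs cover less than half of genome_size")

toy_contigs = [10, 8, 6, 4, 2]            # made-up lengths, total 30
print(n50(toy_contigs))                   # threshold 15 is reached at the contig of length 8
print(n50(toy_contigs, genome_size=40))   # threshold 20 is reached at the contig of length 6
```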
18.
Sci Rep ; 11(1): 4701, 2021 02 25.
Article in English | MEDLINE | ID: mdl-33633141

ABSTRACT

The introduction of massively parallel sequencing (MPS) in forensic investigation enables sequence-based large-scale multiplexing beyond size-based analysis using capillary electrophoresis (CE). For the practical application of MPS to forensic casework, many population studies have provided sequence data for autosomal short tandem repeats (STRs). However, SE33, a highly polymorphic STR marker, has little sequence-based data because of difficulties in analysis. In this study, 25 autosomal STRs, including SE33, were analyzed using an in-house MPS panel for 350 samples from four populations (African-American, Caucasian, Hispanic, and Korean). The barcoded MPS library was generated using a two-step PCR method and sequenced using a MiSeq System. As a result, 99.88% genotype concordance was obtained between length- and sequence-based analyses. In SE33, the most discordances (eight samples, 0.08%) were observed because of the 4 bp deletion between the CE and MPS primer binding sites. Compared with the length-based CE method, the number of alleles increased from 332 to 725 (2.18-fold) for 25 autosomal STRs in the sequence-based MPS method. Notably, an additional 129 unique alleles, a 4.15-fold increase, were detected in SE33 by identifying sequence variations. This population data set provides sequence variations and sequence-based allele frequencies for 25 autosomal STRs.


Subject(s)
Forensic Genetics , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats/genetics , Population Groups/genetics , Capillary Electrophoresis , Gene Frequency , Humans , Polymerase Chain Reaction , Single Nucleotide Polymorphism
19.
Mitochondrion ; 58: 111-122, 2021 05.
Article in English | MEDLINE | ID: mdl-33618020

ABSTRACT

Investigation of human mitochondrial (mt) genome variation has been shown to provide insights into human history and natural selection. By analyzing 24,167 human mt-genome samples collected from five continents, we have developed a co-mutation network model to investigate characteristic human evolutionary patterns. The analysis highlighted richer co-mutating regions of the mt-genome, suggesting the presence of epistasis. Specifically, a large portion of COX genes was found to co-mutate in Asian and American populations, whereas, in African, European, and Oceanic populations, there was greater co-mutation bias in hypervariable regions. Interestingly, this study demonstrated hierarchical modularity as a crucial agent for these co-mutation networks. More profoundly, our ancestry-based co-mutation module analyses showed that mutations cluster preferentially in known mitochondrial haplogroups. Contemporary human mt-genome nucleotides most closely resembled the ancestral state, and very few of them were found to be ancestral-variants. Overall, these results demonstrated that subpopulation-based biases may favor mitochondrial gene specific epistasis.


Subject(s)
Genetic Epistasis , Molecular Evolution , Mitochondrial Genes , Population Groups/genetics , Humans , Mutation
20.
PLoS Genet ; 17(1): e1009210, 2021 01.
Article in English | MEDLINE | ID: mdl-33428619

ABSTRACT

Modern day Saudi Arabia occupies the majority of historical Arabia, which may have contributed to ancient waves of migration out of Africa. This ancient history has left a lasting imprint in the genetics of the region, including the diverse set of tribes that call Saudi Arabia their home. How these tribes relate to each other and to the world's major populations remains an unanswered question. In an attempt to improve our understanding of the population structure of Saudi Arabia, we conducted genomic profiling of 957 unrelated individuals who self-identify with 28 large tribes in Saudi Arabia. Consistent with the tradition of intra-tribal unions, the subjects showed strong clustering along tribal lines with the distance between clusters correlating with their geographical proximities in Arabia. However, these individuals form a unique cluster when compared to the world's major populations. The ancient origin of these tribal affiliations is supported by analyses that revealed little evidence of ancestral origin from within the 28 tribes. Our results disclose a granular map of population structure and have important implications for future genetic studies into Mendelian and common diseases in the region.


Subject(s)
Arabs/genetics , Human Genome/genetics , Population Groups/genetics , Africa/epidemiology , Arabia/epidemiology , Arabs/history , Asia/epidemiology , Europe/epidemiology , Female , HapMap Project , Haplotypes/genetics , Ancient History , Humans , Inbreeding , Male , Population Groups/history , Principal Component Analysis , Saudi Arabia/epidemiology