Results 1 - 20 of 411
1.
J Genet ; 103, 2024.
Article in English | MEDLINE | ID: mdl-38258299

ABSTRACT

Fixation index (Fst) statistics provide critical insights into evolutionary processes affecting the structure of genetic variation within and among populations. Fst statistics have been widely applied in population and evolutionary genetics to identify genomic regions targeted by selection pressures. The FSTest 1.3 software was developed to estimate four Fst statistics (those of Hudson, Weir and Cockerham, Nei, and Wright) using high-throughput genotyping or sequencing data. Here, we introduce FSTest 1.3 and compare its performance with two widely used software packages, VCFtools 0.1.16 and PLINK 2.0. Chromosome 1 of 1000 Genomes Phase III variant data belonging to South Asian (n = 211) and African (n = 274) populations were included as an example case in this study. Different Fst estimates were calculated for each single-nucleotide polymorphism (SNP) in a pairwise comparison of South Asian against African populations, and the results of FSTest 1.3 were confirmed by VCFtools 0.1.16 and PLINK 2.0. Two different sliding window approaches, one based on a fixed number of SNPs and another based on a fixed number of base pairs (bp), were conducted using FSTest 1.3 and VCFtools 0.1.16. Our results showed that regions with low coverage genotypic data could lead to an overestimation of Fst in sliding window analysis using a fixed number of bp. FSTest 1.3 could mitigate this challenge by estimating the average of consecutive SNPs along the chromosome. FSTest 1.3 allows direct analysis of VCF files with a small amount of code and can calculate Fst estimates on a desktop computer for more than a million SNPs in a few minutes. FSTest 1.3 is freely available at https://github.com/similab/FSTest.
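
The per-SNP estimate and the fixed-number-of-SNPs window described in this abstract can be illustrated with a minimal Python sketch. It uses Hudson's estimator in its commonly used sample-size-corrected form and combines SNPs within a window as a ratio of sums; both are assumptions made for illustration, and this is not FSTest's code. The function names, the input layout (allele frequency plus number of sampled chromosomes per population) and the toy frequencies are hypothetical.

```python
import math


def hudson_components(p1, n1, p2, n2):
    """Numerator and denominator of Hudson's Fst for one biallelic SNP.

    p1, p2: sample allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes (2 x diploid individuals).
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def snp_fst(p1, n1, p2, n2):
    """Per-SNP Fst; NaN when both populations are fixed for the same allele."""
    num, den = hudson_components(p1, n1, p2, n2)
    return num / den if den > 0 else math.nan


def window_fst(snps, snps_per_window):
    """Fst for consecutive windows holding a fixed number of SNPs.

    Each window is summarised as sum(numerators) / sum(denominators).
    The last window may hold fewer SNPs than snps_per_window.
    """
    estimates = []
    for start in range(0, len(snps), snps_per_window):
        parts = [hudson_components(*snp) for snp in snps[start:start + snps_per_window]]
        num = sum(part[0] for part in parts)
        den = sum(part[1] for part in parts)
        estimates.append(num / den if den > 0 else math.nan)
    return estimates


# Toy input with made-up frequencies. 422 and 548 chromosomes correspond to
# the 211 and 274 diploid individuals mentioned in the abstract.
toy_snps = [(0.10, 422, 0.60, 548), (0.50, 422, 0.50, 548), (0.95, 422, 0.20, 548)]
print([round(snp_fst(*snp), 3) for snp in toy_snps])
print(window_fst(toy_snps, 2))
```

Because every window here contains the same number of SNPs (apart from the last), sparsely genotyped regions do not produce windows supported by only one or two markers, which is the situation the abstract associates with overestimation in fixed-bp windows. A plain mean of per-SNP values is an alternative way to summarise a window.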


Subject(s)
African People , Chromosomes, Human, Pair 1 , Genetic Variation , Genetics, Population , South Asian People , Humans , Asian People/genetics , Biological Evolution , Chromosomes, Human, Pair 1/genetics , Genomics , Genotype , Genetics, Population/methods , Genetics, Population/statistics & numerical data , South Asian People/genetics , African People/genetics , Genetic Variation/genetics
2.
Nat Commun ; 12(1): 4921, 2021 08 13.
Article in English | MEDLINE | ID: mdl-34389724

ABSTRACT

Age-related clonal hematopoiesis (ARCH) is characterized by age-associated accumulation of somatic mutations in hematopoietic stem cells (HSCs) or their pluripotent descendants. HSCs harboring driver mutations will be positively selected and cells carrying these mutations will rise in frequency. While ARCH is a known risk factor for blood malignancies, such as Acute Myeloid Leukemia (AML), why some people who harbor ARCH driver mutations do not progress to AML remains unclear. Here, we model the interaction of positive and negative selection in deeply sequenced blood samples from individuals who subsequently progressed to AML, compared to healthy controls, using deep learning and population genetics. Our modeling allows us to discriminate amongst evolutionary classes with high accuracy and captures signatures of purifying selection in most individuals. Purifying selection, acting on benign or mildly damaging passenger mutations, appears to play a critical role in preventing disease-predisposing clones from rising to dominance and is associated with longer disease-free survival. Through exploring a range of evolutionary models, we show how different classes of selection shape clonal dynamics and health outcomes thus enabling us to better identify individuals at a high risk of malignancy.


Subject(s)
Clonal Evolution , Clonal Hematopoiesis/genetics , Hematopoietic Stem Cells/metabolism , Leukemia, Myeloid/genetics , Mutation , Acute Disease , Adult , Aged , Deep Learning , Genetics, Population/methods , Genetics, Population/statistics & numerical data , Hematopoietic Stem Cells/cytology , Humans , Kaplan-Meier Estimate , Leukemia, Myeloid/pathology , Middle Aged , Models, Genetic , Outcome Assessment, Health Care/methods , Outcome Assessment, Health Care/statistics & numerical data
3.
PLoS Comput Biol ; 17(8): e1008904, 2021 08.
Article in English | MEDLINE | ID: mdl-34339413

ABSTRACT

The killer-cell immunoglobulin-like receptor (KIR) complex on chromosome 19 encodes receptors that modulate the activity of natural killer cells, and variation in these genes has been linked to infectious and autoimmune disease, as well as having bearing on pregnancy and transplant outcomes. The medical relevance and high variability of KIR genes make short-read sequencing an attractive technology for interrogating the region, providing a high-throughput, high-fidelity sequencing method that is cost-effective. However, because this gene complex is characterized by extensive nucleotide polymorphism, structural variation including gene fusions and deletions, and a high level of homology between genes, its interrogation at high resolution has been thwarted by bioinformatic challenges, with most studies limited to examining presence or absence of specific genes. Here, we present the PING (Pushing Immunogenetics to the Next Generation) pipeline, which incorporates empirical data, novel alignment strategies and a custom alignment processing workflow to enable high-throughput KIR sequence analysis from short-read data. PING provides KIR gene copy number classification functionality for all KIR genes through use of a comprehensive alignment reference. The gene copy number determined per individual enables an innovative genotype determination workflow using genotype-matched references. Together, these methods address the challenges imposed by the structural complexity and overall homology of the KIR complex. To determine copy number and genotype determination accuracy, we applied PING to European and African validation cohorts and a synthetic dataset. PING demonstrated exceptional copy number determination performance across all datasets and robust genotype determination performance. Finally, an investigation into discordant genotypes for the synthetic dataset provides insight into misaligned reads, advancing our understanding of the interpretation of short-read sequencing data in complex genomic regions. PING promises to support a new era of studies of KIR polymorphism, delivering high-resolution KIR genotypes that are highly accurate, enabling high-quality, high-throughput KIR genotyping for disease and population studies.


Subject(s)
Immunogenetics/statistics & numerical data , Receptors, KIR/genetics , Africa, Southern , Alleles , Computational Biology , Computer Simulation , Databases, Nucleic Acid/statistics & numerical data , Europe , Gene Dosage , Genetics, Population/statistics & numerical data , Genotype , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Polymorphism, Genetic , Receptors, KIR/classification , Sequence Alignment/statistics & numerical data , Software Design
4.
Hum Genet ; 140(10): 1487-1498, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34424406

ABSTRACT

The migration and admixture history of populations has always been an intriguing theme. The West Coast of India harbours rich diversity, comprising various ethno-linguistic groups, many of which have well-documented histories of migration. The Roman Catholics are one such distinct group, whose origin has been much debated. Some historians and anthropologists relate them to the ancient group of Gaud Saraswat Brahmins, while others regard them as members of the Lost Tribes of the Jews who migrated to India in the first century. Historical records suggest that this community was later forcibly converted to Christianity by the Portuguese in Goa during the sixteenth century. To date, no genetic study has been done on this group to infer their origin and genetic affinity. Hence, we analysed 110 Roman Catholics from three different locations on the West Coast of India (Goa, Kumta and Mangalore) using both uniparental and autosomal markers to understand their genetic history. We found that the Roman Catholics have close affinity with the Indo-European linguistic groups, particularly Brahmins. Additionally, we detected a genetic signal of Jews in the linkage disequilibrium-based admixture analysis, which was absent in other Indo-European populations that inhabit the same geographical regions. Haplotype-based analysis suggests that the Roman Catholics consist of South Asian-specific ancestry and showed high drift. Ancestry-specific historical population size estimation points to a possible bottleneck around the time of the Goan inquisition (fifteenth century). Analysis of the Roman Catholics data along with ancient DNA data of the Neolithic and Bronze Age revealed that the Roman Catholics fit well in a basic model of ancient ancestral composition, typical of most of the Indo-European caste groups of India. Mitochondrial DNA (mtDNA) analysis suggests that most of the Roman Catholics have aboriginal Indian maternal genetic ancestry, while the Y chromosomal DNA analysis indicates a high frequency of the R1a lineage, which is predominant in groups with a higher ancestral North Indian (ANI) component. Therefore, we conclude that the Roman Catholics of the Goa, Kumta and Mangalore regions are the remnants of very early lineages of the Brahmin community of India, having Indo-European genetic affinity along with cryptic Jewish admixture, which needs to be explored further.


Subject(s)
Catholicism , Ethnicity/genetics , Evolution, Molecular , Genetic Variation , Genetics, Population/statistics & numerical data , Geography , Population Dynamics , Ethnicity/statistics & numerical data , Europe , Humans , India , Jews/genetics , Phylogeny
5.
Nat Commun ; 12(1): 4506, 2021 07 23.
Article in English | MEDLINE | ID: mdl-34301930

ABSTRACT

Polygenic Risk Scores (PRS) for Alzheimer's disease (AD) offer unique possibilities for reliable identification of individuals at high and low risk of AD. However, there is little agreement in the field as to what approach should be used for genetic risk score calculations, how to model the effect of APOE, what the optimal p-value threshold (pT) for SNP selection is and how to compare scores between studies and methods. We show that the best prediction accuracy is achieved with a model with two predictors (APOE and PRS excluding APOE region) with pT<0.1 for SNP selection. Prediction accuracy in a sample across different PRS approaches is similar, but individuals' scores and their associated ranking differ. We show that standardising PRS against the population mean, as opposed to the sample mean, makes the individuals' scores comparable between studies. Our work highlights the best strategies for polygenic profiling when assessing individuals for AD risk.
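
The point about standardising against a population rather than a sample can be shown with a short Python sketch. It is a simplified illustration and not the authors' pipeline: the raw scores, the two studies and the population reference values are all made up, and a plain z-score is assumed as the standardisation.

```python
from statistics import mean, pstdev


def standardise(scores, ref_mean, ref_sd):
    """Z-score each raw PRS against a given reference mean and SD."""
    return [(s - ref_mean) / ref_sd for s in scores]


def standardise_within_sample(scores):
    """Z-score against the sample's own mean and SD (study-specific scale)."""
    return standardise(scores, mean(scores), pstdev(scores))


# Made-up raw scores for two hypothetical studies; the raw score 2.0
# appears in both (index 1 in study_a, index 0 in study_b).
study_a = [1.0, 2.0, 3.0]
study_b = [2.0, 3.0, 4.0]

# Hypothetical population reference values (assumed, not from the paper).
POP_MEAN, POP_SD = 2.5, 1.0

# Sample-based standardisation: the same raw score gets different z-scores.
print(standardise_within_sample(study_a)[1], standardise_within_sample(study_b)[0])

# Population-based standardisation: the same raw score gets the same z-score.
print(standardise(study_a, POP_MEAN, POP_SD)[1], standardise(study_b, POP_MEAN, POP_SD)[0])
```

With a shared reference, an individual's standardised score no longer depends on who else happened to be in the study.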


Subject(s)
Alzheimer Disease/genetics , Apolipoproteins E/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Alleles , Alzheimer Disease/diagnosis , Case-Control Studies , Gene Frequency , Genetics, Population/methods , Genetics, Population/statistics & numerical data , Genome-Wide Association Study/statistics & numerical data , Genotype , Humans , Reproducibility of Results , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Risk Factors , Sensitivity and Specificity
6.
Proc Natl Acad Sci U S A ; 118(21), 2021 05 25.
Article in English | MEDLINE | ID: mdl-34016747

ABSTRACT

As populations boom and bust, the accumulation of genetic diversity is modulated, encoding histories of living populations in present-day variation. Many methods exist to decode these histories, and all must make strong model assumptions. It is typical to assume that mutations accumulate uniformly across the genome at a constant rate that does not vary between closely related populations. However, recent work shows that mutational processes in human and great ape populations vary across genomic regions and evolve over time. This perturbs the mutation spectrum (relative mutation rates in different local nucleotide contexts). Here, we develop theoretical tools in the framework of Kingman's coalescent to accommodate mutation spectrum dynamics. We present mutation spectrum history inference (mushi), a method to perform nonparametric inference of demographic and mutation spectrum histories from allele frequency data. We use mushi to reconstruct trajectories of effective population size and mutation spectrum divergence between human populations, identify mutation signatures and their dynamics in different human populations, and calibrate the timing of a previously reported mutational pulse in the ancestors of Europeans. We show that mutation spectrum histories can be placed in a well-studied theoretical setting and rigorously inferred from genomic variation data, like other features of evolutionary history.


Subject(s)
Gene Frequency/genetics , Genetics, Population/statistics & numerical data , Models, Genetic , Mutation/genetics , Animals , Genetic Variation/genetics , Genomics , Hominidae/genetics , Humans , Mutation Rate , Population Density
7.
Sci Rep ; 11(1): 5249, 2021 03 04.
Article in English | MEDLINE | ID: mdl-33664303

ABSTRACT

Determining the number of contributors (NOC) accurately in a forensic DNA mixture profile can be challenging. To address this issue, there have been various studies that examined the uncertainty in estimating the NOC in a DNA mixture profile. However, the focus of these studies lies primarily on dominant populations residing within Europe and North America. Thus, there is limited representation of Asian populations in these studies. Further, the effects of allele dropout on the NOC estimation have not been explored. As such, this study assesses the uncertainty of NOC in simulated DNA mixture profiles of Chinese, Malay, and Indian populations, which are the predominant ethnic populations in Asia. The Caucasian ethnic population was also included to provide a basis of comparison with other similar studies. Our results showed that without considering allele dropout, the NOC from DNA mixture profiles derived from up to four contributors of the same ethnic population could be estimated with confidence in the Chinese, Malay, Indian and Caucasian populations. The same results can be observed on DNA mixture profiles originating from a combination of differing ethnic populations. The inclusion of an overall 30% allele dropout rate increased the probability (risk) of underestimating the NOC in a DNA mixture profile; even a 3-person DNA mixture profile has a > 99% risk of underestimating the NOC as two or fewer contributors. However, such risks could be mitigated when the highly polymorphic SE33 locus was included in the dataset. Lastly, there was a negligible level of risk in misinterpreting the NOC in a mixture profile as deriving from a single source profile. In summary, our studies showcased novel results representative of the Chinese, Malay, and Indian ethnic populations when examining the uncertainty in NOC estimation in a DNA mixture profile. Our results would be useful in the estimation of NOC in a DNA mixture profile in the Asian context.
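
The abstract does not say which NOC estimator was used. As a hedged illustration of why dropout leads to underestimation and why a highly polymorphic locus helps, the Python sketch below uses the simple maximum allele count rule, which is an assumption here: each diploid contributor carries at most two alleles per locus, so the minimum number of contributors is the largest per-locus allele count divided by two, rounded up. The mixture and its allele labels are made up.

```python
import math


def min_contributors(profile):
    """Lower bound on contributors: ceil(max distinct alleles at any locus / 2)."""
    max_alleles = max(len(set(alleles)) for alleles in profile.values())
    return math.ceil(max_alleles / 2)


# Hypothetical 3-person mixture after allele dropout (allele labels are made up).
mixture = {
    "D3S1358": ["15", "16", "17"],
    "vWA": ["16", "17", "18", "19"],
    "SE33": ["14", "17", "19", "22.2", "28.2"],
}
without_se33 = {locus: alleles for locus, alleles in mixture.items() if locus != "SE33"}

print(min_contributors(without_se33))  # at most 4 alleles per locus -> 2
print(min_contributors(mixture))       # 5 alleles at SE33 -> 3
```

In this toy profile the loci other than SE33 show at most four alleles, so the rule reports two contributors; only the five alleles observed at SE33 reveal the third.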


Subject(s)
DNA/genetics , Ethnicity/genetics , Genetics, Population/statistics & numerical data , Asia/epidemiology , China/epidemiology , DNA Fingerprinting/statistics & numerical data , Europe/epidemiology , Humans , India/epidemiology , Malawi/epidemiology , Microsatellite Repeats/genetics , Models, Theoretical , North America/epidemiology , Population Groups/genetics
8.
PLoS Genet ; 17(1): e1009241, 2021 01.
Article in English | MEDLINE | ID: mdl-33465078

ABSTRACT

FST and kinship are key parameters often estimated in modern population genetics studies in order to quantitatively characterize structure and relatedness. Kinship matrices have also become a fundamental quantity used in genome-wide association studies and heritability estimation. The most frequently-used estimators of FST and kinship are method-of-moments estimators whose accuracies depend strongly on the existence of simple underlying forms of structure, such as the independent subpopulations model of non-overlapping, independently evolving subpopulations. However, modern data sets have revealed that these simple models of structure likely do not hold in many populations, including humans. In this work, we analyze the behavior of these estimators in the presence of arbitrarily-complex population structures, which results in an improved estimation framework specifically designed for arbitrary population structures. After generalizing the definition of FST to arbitrary population structures and establishing a framework for assessing bias and consistency of genome-wide estimators, we calculate the accuracy of existing FST and kinship estimators under arbitrary population structures, characterizing biases and estimation challenges unobserved under their originally-assumed models of structure. We then present our new approach, which consistently estimates kinship and FST when the minimum kinship value in the dataset is estimated consistently. We illustrate our results using simulated genotypes from an admixture model, constructing a one-dimensional geographic scenario that departs nontrivially from the independent subpopulations model. Our simulations reveal the potential for severe biases in estimates of existing approaches that are overcome by our new framework. This work may significantly improve future analyses that rely on accurate kinship and FST estimates.


Subject(s)
Genetics, Population/statistics & numerical data , Genome-Wide Association Study/statistics & numerical data , Inbreeding , Models, Genetic , Genotype , Humans , Pedigree , Polymorphism, Single Nucleotide/genetics
9.
Hum Immunol ; 82(2): 97-102, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33388178

ABSTRACT

We estimated HLA allele and haplotype frequencies of the Saudi Arabian population from a sample of 45,457 registered stem cell donors. The most frequent HLA alleles were A*02:01g (18.5%), C*06:02g (16.1%), B*51:01g (14.1%), DRB1*07:01g (16.2%), DQB1*02:01g (30.5%), and DPB1*04:01g (33.6%). The most frequent 5-locus haplotypes were A*02:05g~C*06:02g~B*50:01g~DRB1*07:01g~DQB1*02:01g (1.73%), A*02:01g~C*06:02g~B*50:01g~DRB1*07:01g~DQB1*02:01g (1.66%), and A*26:01g~C*07:02g~B*08:01g~DRB1*03:01g~DQB1*02:01g (1.38%). Furthermore, we used the calculated haplotype frequencies to estimate stem cell donor matching probabilities for Saudi Arabian donor and patient populations under various matching requirements. These results are relevant for strategic donor registry planning in the Kingdom of Saudi Arabia.
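
Single-locus allele frequencies such as those quoted above can be obtained by direct allele counting, as in the minimal Python sketch below. The genotypes are made up and the function name is hypothetical; the abstract does not describe the estimation procedure. Multi-locus haplotype frequencies generally require statistical phasing, which is not shown here.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies by direct counting over diploid genotypes at one locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # 2 x number of individuals
    return {allele: n / total for allele, n in counts.items()}


# Made-up HLA-A genotypes for four hypothetical donors.
donors = [
    ("A*02:01g", "A*02:01g"),
    ("A*02:01g", "A*26:01g"),
    ("A*02:05g", "A*26:01g"),
    ("A*02:01g", "A*02:05g"),
]
for allele, freq in sorted(allele_frequencies(donors).items()):
    print(f"{allele}\t{freq:.3f}")
```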


Subject(s)
Donor Selection/methods , HLA-D Antigens/genetics , Hematopoietic Stem Cell Transplantation/methods , Histocompatibility Antigens Class I/genetics , Alleles , Arabs/genetics , Datasets as Topic , Gene Frequency , Genetics, Population/statistics & numerical data , HLA-D Antigens/immunology , Haplotypes , Histocompatibility Antigens Class I/immunology , Histocompatibility Testing , Humans , Registries/statistics & numerical data , Saudi Arabia , Tissue Donors
10.
Hum Immunol ; 82(1): 1-2, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33257011

ABSTRACT

We investigated HLA class I (HLA-A, -B, and -C) and class II (HLA-DRB1, -DQB1, -DPA1, and -DPB1) alleles by NGS-based typing among 478 Brazilian individuals from two populations in the city of Barra Mansa based on their self-declared skin color (Caucasian, N = 405, AFND-ID: 3729; Black, N = 73, AFND-ID: 3731) to calculate allelic and haplotypic frequencies, plus linkage disequilibrium. No locus deviated from Hardy-Weinberg equilibrium. Both populations shared the most frequent allele on HLA-A, -C, -DPA1, and -DPB1. Genotype and frequency data are available in the Allele Frequencies Net Database.
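
A Hardy-Weinberg check compares observed genotype counts with those expected from the allele frequencies. The abstract does not state which test was used; the Python sketch below computes a chi-square goodness-of-fit statistic as one simple possibility, with made-up genotypes and hypothetical allele labels. For highly polymorphic loci such as HLA, many expected counts are small, so exact tests are often preferred in practice.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Chi-square statistic of observed vs. Hardy-Weinberg expected genotype counts."""
    n = len(genotypes)
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = {allele: count / (2 * n) for allele, count in allele_counts.items()}
    observed = Counter(tuple(sorted(genotype)) for genotype in genotypes)
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Expected count: n * p^2 for homozygotes, n * 2pq for heterozygotes.
        expected = n * (freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b])
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


# Made-up biallelic example that matches Hardy-Weinberg proportions exactly.
balanced = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
print(hwe_chi_square(balanced))

# Made-up example with an excess of heterozygotes.
all_het = [("A", "B")] * 10
print(hwe_chi_square(all_het))
```

The statistic would still need to be compared with a chi-square distribution with the appropriate degrees of freedom to obtain a p-value.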


Subject(s)
Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class I/genetics , Alleles , Brazil , Gene Frequency , Genetics, Population/statistics & numerical data , Haplotypes , High-Throughput Nucleotide Sequencing , Humans , Linkage Disequilibrium , Population Groups/genetics
11.
Hum Immunol ; 82(1): 5-7, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33303214

ABSTRACT

In this study, we report for the first time HLA allele and haplotype frequencies in the modern Panamanian population at a two-field (four digits) resolution level. Reported frequencies were calculated from genotype data for the HLA-A, -B, -C, -DPB1, -DQB1 and -DRB1 loci of 462 healthy unrelated Panamanian adults of Hispanic ethnicity. In addition to providing new insights on the allelic structure of the Panamanian population and its origin, these data are critical for better planning of healthcare strategies in the country and for future research exploring the association with certain chronic and infectious diseases.


Subject(s)
Hispanic or Latino/genetics , Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class I/genetics , Adolescent , Adult , Aged , Alleles , Female , Gene Frequency , Genetics, Population/statistics & numerical data , Haplotypes , Healthy Volunteers , Humans , Linkage Disequilibrium , Male , Middle Aged , Panama , Young Adult
12.
J Biosoc Sci ; 53(2): 183-198, 2021 03.
Article in English | MEDLINE | ID: mdl-32172699

ABSTRACT

Several studies have shown that the Brazilian Northeast is a region with high rates of inbreeding as well as a high incidence of autosomal recessive diseases. The elaboration of public health policies focused on the epidemiological surveillance of congenital anomalies and rare genetic diseases in this region is urgently needed. However, the vast territory, socio-demographic heterogeneity, economic difficulties and low number of professionals with expertise in medical genetics make strategic planning a challenging task. Surnames can be compared to a genetic system with multiple neutral alleles and allow some approximation of population structure. Here, surname analysis of more than 37 million people was combined with health and socio-demographic indicators covering all 1794 municipalities of the nine states of the region. The data distribution showed a heterogeneous spatial pattern (Global Moran Index, GMI = 0.58; p < 0.001), with higher isonymy rates in the east of the region and the highest rates in the Quilombo dos Palmares region - the largest conglomerate of escaped slaves in Latin America. A positive correlation was found between the isonymy index and the frequency of live births with congenital anomalies (r = 0.268; p < 0.001), and the two indicators were spatially correlated (GMI = 0.50; p < 0.001). With this approach, quantitative information on the genetic structure of the Brazilian Northeast population was obtained, which may represent an economical and useful tool for decision-making in the medical field.
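
The idea of treating surnames as neutral alleles can be made concrete with a small Python sketch. The abstract does not define its isonymy index; the version below assumes one common definition, random isonymy within a population, which is the probability that two people drawn at random share a surname (the sum of squared surname frequencies). The surname list is made up.

```python
from collections import Counter


def random_isonymy(surnames):
    """Probability that two draws (with replacement) share a surname: sum of p_k^2."""
    counts = Counter(surnames)
    total = len(surnames)
    return sum((n / total) ** 2 for n in counts.values())


# Made-up surname list for one hypothetical municipality.
municipality = ["Silva", "Silva", "Santos", "Souza"]
print(random_isonymy(municipality))  # 0.5**2 + 0.25**2 + 0.25**2 = 0.375
```

Higher values indicate that fewer surnames account for more of the population, which is the pattern the abstract relates to inbreeding.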


Subject(s)
Genetics, Medical/statistics & numerical data , Genetics, Population/statistics & numerical data , Names , Adolescent , Adult , Aged , Brazil , Female , Humans , Male , Middle Aged , Population Dynamics , Young Adult
13.
Hum Immunol ; 82(1): 3-4, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33267971

ABSTRACT

We investigated HLA class I (HLA-A, -B, and -C) and class II (HLA-DRB1, -DQB1, -DPA1, and -DPB1) alleles by NGS-based typing among 759 Brazilian individuals from three populations in the city of Rio de Janeiro based on their self-declared skin color (Caucasian, N = 521, AFND-ID: 3730; Parda, N = 170, AFND-ID: 3728; Black, N = 68, AFND-ID: 3727) to calculate allelic and haplotypic frequencies, plus linkage disequilibrium. Only the HLA-DRB1 locus deviated from Hardy-Weinberg equilibrium (in Caucasian and Black populations). The three populations shared the most frequent allele on HLA-A, -C, -DRB1, -DPA1, and -DPB1. Genotype and frequency data are available in the Allele Frequencies Net Database.


Subject(s)
Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class I/genetics , Alleles , Brazil , Gene Frequency , Genetics, Population/statistics & numerical data , Haplotypes , High-Throughput Nucleotide Sequencing , Humans , Linkage Disequilibrium , Population Groups/genetics
14.
Nucleic Acids Res ; 49(D1): D1225-D1232, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33095885

ABSTRACT

With the advent of next-generation sequencing, large-scale initiatives for mining whole genomes and exomes have been employed to better understand global or population-level genetic architecture. India encompasses more than 17% of the world population with extensive genetic diversity, but is under-represented in the global sequencing datasets. This gave us the impetus to perform and analyze the whole genome sequencing of 1029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. We generated a compendium of 55,898,122 single allelic genetic variants from geographically distinct Indian genomes and calculated the allele frequency, allele count, allele number, along with the number of heterozygous or homozygous individuals. In the present study, these variants were systematically annotated using publicly available population databases and can be accessed through a browsable online database named 'IndiGenomes' (http://clingen.igib.res.in/indigen/). The IndiGenomes database will help clinicians and researchers in exploring the genetic component underlying medical conditions. To date, this is the most comprehensive genetic variant resource for the Indian population and is made freely available for academic utility. The resource has also been accessed extensively by the worldwide community since its launch.


Subject(s)
Databases, Genetic , Genetic Variation , Genome, Human , Human Genome Project , Software , Adult , Exome , Female , Genetics, Population/statistics & numerical data , Humans , India , Internet , Male , Molecular Sequence Annotation , Whole Genome Sequencing
15.
PLoS Comput Biol ; 16(11): e1008402, 2020 11.
Article in English | MEDLINE | ID: mdl-33151935

ABSTRACT

Resources are rarely distributed uniformly within a population. Heterogeneity in the concentration of a drug, the quality of breeding sites, or wealth can all affect evolutionary dynamics. In this study, we represent a collection of properties affecting the fitness at a given location using a color. A green node is rich in resources while a red node is poorer. More colors can represent a broader spectrum of resource qualities. For a population evolving according to the birth-death Moran model, the first question we address is which structures, identified by graph connectivity and graph coloring, are evolutionarily equivalent. We prove that all properly two-colored, undirected, regular graphs are evolutionarily equivalent (where "properly colored" means that no two neighbors have the same color). We then compare the effects of background heterogeneity on properly two-colored graphs to those with alternative schemes in which the colors are permuted. Finally, we discuss dynamic coloring as a model for spatiotemporal resource fluctuations, and we illustrate that random dynamic colorings often diminish the effects of background heterogeneity relative to a proper two-coloring.
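
The setting in this abstract, a birth-death Moran process on a colored graph, can be sketched in Python as follows. This is an illustrative simplification and not the authors' model specification: it assumes that fitness is looked up from the pair (type, node color), that the parent is chosen with probability proportional to fitness, and that the offspring replaces a uniformly chosen neighbor. The four-node cycle, the fitness values and all names are hypothetical.

```python
import random


def moran_bd_step(types, colors, neighbors, fitness, rng):
    """One birth-death update: fitness-proportional birth, uniform neighbor replacement."""
    nodes = range(len(types))
    weights = [fitness[(types[i], colors[i])] for i in nodes]
    parent = rng.choices(nodes, weights=weights, k=1)[0]
    replaced = rng.choice(neighbors[parent])
    updated = list(types)
    updated[replaced] = types[parent]
    return updated


# Undirected, regular graph: a cycle of four nodes, properly two-colored
# (no two neighbors share a color).
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = ["green", "red", "green", "red"]

# Assumed fitness table: mutants benefit on resource-rich (green) nodes only.
fitness = {
    ("mutant", "green"): 1.5,
    ("mutant", "red"): 1.0,
    ("resident", "green"): 1.0,
    ("resident", "red"): 1.0,
}

rng = random.Random(0)
state = ["mutant", "resident", "resident", "resident"]
while len(set(state)) > 1:  # run until fixation or extinction of the mutant
    state = moran_bd_step(state, colors, neighbors, fitness, rng)
print(state[0])
```

Repeating the loop over many seeds and starting positions would give an empirical fixation probability, which is the kind of quantity that can be compared across colorings.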


Subject(s)
Biological Evolution , Models, Biological , Animals , Color , Computational Biology , Computer Graphics , Computer Simulation , Genetic Fitness , Genetics, Population/statistics & numerical data , Humans , Mathematical Concepts , Mutation , Population Dynamics/statistics & numerical data , Probability , Spatio-Temporal Analysis
16.
BMC Plant Biol ; 20(1): 510, 2020 Nov 09.
Article in English | MEDLINE | ID: mdl-33167894

ABSTRACT

BACKGROUND: Paeonia decomposita, endemic to China, has important ornamental, medicinal, and economic value and is regarded as an endangered plant. The genetic diversity and population structure have seldom been described. A conservation management plan is not currently available. RESULTS: In the present study, 16 pairs of simple sequence repeat (SSR) primers were used to evaluate the genetic diversity and population structure. A total of 122 alleles were obtained with a mean of 7.625 alleles per locus. The expected heterozygosity (He) varied from 0.043 to 0.901 (mean 0.492) in 16 primers. Moderate genetic diversity (He = 0.405) among populations was revealed, with Danba identified as the center of genetic diversity. Mantel tests revealed a positive correlation between geographic and genetic distance among populations (r = 0.592, P = 0.0001), demonstrating consistency with the isolation by distance model. Analysis of molecular variance (AMOVA) indicated that the principal molecular variance existed within populations (73.48%) rather than among populations (26.52%). Bayesian structure analysis and principal coordinate analysis (PCoA) supported the classification of the populations into three clusters. CONCLUSIONS: This is the first study of the genetic diversity and population structure of P. decomposita using SSR. Three management units were proposed as conservation measures. The results will be beneficial for the conservation and exploitation of the species, providing a theoretical basis for further research of its evolution and phylogeography.
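
Two of the summary figures in this abstract follow simple formulas, sketched below in Python. The mean number of alleles per locus is the total allele count divided by the number of loci, and expected heterozygosity at a locus is taken here as one minus the sum of squared allele frequencies, which is the standard uncorrected definition; whether the authors applied a sample-size correction is not stated. The allele frequencies in the second example are made up.

```python
def expected_heterozygosity(allele_freqs):
    """He = 1 - sum(p_i^2) for one locus (no sample-size correction)."""
    return 1.0 - sum(p * p for p in allele_freqs)


# Mean number of alleles per locus, from the totals reported in the abstract.
print(122 / 16)  # 7.625

# Made-up allele frequencies at one hypothetical SSR locus.
print(expected_heterozygosity([0.5, 0.5]))  # 0.5
```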


Subject(s)
Conservation of Natural Resources , DNA, Plant/genetics , Endangered Species/statistics & numerical data , Genetic Variation , Genetics, Population/statistics & numerical data , Paeonia/genetics , Alleles , China , Loss of Heterozygosity , Microsatellite Repeats , Phylogeny , Phylogeography
17.
PLoS Genet ; 16(10): e1009037, 2020 10.
Article in English | MEDLINE | ID: mdl-33035220

ABSTRACT

Genetic surveillance of malaria parasites supports malaria control programmes, treatment guidelines and elimination strategies. Surveillance studies often pose questions about malaria parasite ancestry (e.g. how antimalarial resistance has spread) and employ statistical methods that characterise parasite population structure. Many of the methods used to characterise structure are unsupervised machine learning algorithms which depend on a genetic distance matrix, notably principal coordinates analysis (PCoA) and hierarchical agglomerative clustering (HAC). PCoA and HAC are sensitive to both the definition of genetic distance and algorithmic specification. Importantly, neither algorithm infers malaria parasite ancestry. As such, PCoA and HAC can inform (e.g. via exploratory data visualisation and hypothesis generation), but not answer comprehensively, key questions about malaria parasite ancestry. We illustrate the sensitivity of PCoA and HAC using 393 Plasmodium falciparum whole genome sequences collected from Cambodia and neighbouring regions (where antimalarial resistance has emerged and spread recently) and we provide tentative guidance for the use and interpretation of PCoA and HAC in malaria parasite genetic epidemiology. This guidance includes a call for fully transparent and reproducible analysis pipelines that feature (i) a clearly outlined scientific question; (ii) a clear justification of analytical methods used to answer the scientific question along with discussion of any inferential limitations; (iii) publicly available genetic distance matrices when downstream analyses depend on them; and (iv) sensitivity analyses. To bridge the inferential disconnect between the output of non-inferential unsupervised learning algorithms and the scientific questions of interest, tailor-made statistical models are needed to infer malaria parasite ancestry. In the absence of such models speculative reasoning should feature only as discussion but not as results.


Subject(s)
Population Genetics/statistics & numerical data , Falciparum Malaria/epidemiology , Molecular Epidemiology , Plasmodium falciparum/genetics , Algorithms , Antimalarials/therapeutic use , Cambodia/epidemiology , Cluster Analysis , Drug Resistance/genetics , Genotype , Humans , Falciparum Malaria/drug therapy , Falciparum Malaria/genetics , Falciparum Malaria/parasitology , Plasmodium falciparum/pathogenicity , Unsupervised Machine Learning
18.
Nat Commun ; 11(1): 4661, 2020 09 16.
Article in English | MEDLINE | ID: mdl-32938925

ABSTRACT

Recent years have seen a growing number of studies investigating evolutionary questions using ancient DNA. To address these questions, one of the most frequently used methods is principal component analysis (PCA). When PCA is applied to temporal samples, however, the sample dates are ignored during analysis, leading to imperfect representations of samples in PC plots. Here, we present a factor analysis (FA) method in which individual scores are corrected for the effect of allele frequency drift over time. We obtained exact solutions for the estimates of corrected factors, and we provided a fast algorithm for their computation. Using computer simulations and ancient European samples, we compared geometric representations obtained from FA with those obtained from PCA and from ancestry estimation programs. In admixture analyses, FA estimates agreed with tree-based statistics, and they were more accurate than those obtained from PCA projections and from ancestry estimation programs. A major advantage of FA over existing approaches is that it improves descriptive analyses of ancient DNA samples without requiring the inclusion of outgroup or present-day samples.


Subject(s)
Ancient DNA/analysis , Factor Analysis , Human Genome , Metagenomics/statistics & numerical data , Algorithms , England , Europe , Gene Frequency , Genetic Drift , Population Genetics/statistics & numerical data , Humans , Genetic Models , Principal Component Analysis
19.
BMC Genet ; 21(1): 40, 2020 04 07.
Article in English | MEDLINE | ID: mdl-32264823

ABSTRACT

BACKGROUND: Global and local ancestry inference in admixed human populations can be performed using computational tools implementing distinct algorithms. These tools have been developed, and their accuracy tested, largely on populations with relatively straightforward admixture histories, but little is known about how well they perform in more complex admixture scenarios. RESULTS: Using simulations, we show that RFMix outperforms ADMIXTURE in determining global ancestry proportions even in a complex 5-way admixed population, in addition to assigning local ancestry with an accuracy of 89%. The ability of RFMix to determine global and local ancestry to a high degree of accuracy, particularly in admixed populations, provides the opportunity for more accurate association analyses. CONCLUSION: This study highlights the utility of extending computational tools to make them more compatible with genetically structured populations, as well as the need to expand the sampling of diverse world-wide populations. This is particularly noteworthy as modern-day societies are becoming increasingly genetically complex and some genetic tools and commonly used ancestral populations are less appropriate. Based on these caveats and the results presented here, we suggest that RFMix be used for both global and local ancestry estimation in world-wide complex admixture scenarios, particularly when including these estimates in association studies.


Subject(s)
Genetic Association Studies/statistics & numerical data , Population Genetics/statistics & numerical data , Single Nucleotide Polymorphism/genetics , Algorithms , Humans , Genetic Models
20.
Sci Rep ; 10(1): 6781, 2020 04 22.
Article in English | MEDLINE | ID: mdl-32321949

ABSTRACT

Breeding strategies based on molecular markers have been adopted by ex-situ conservation programs to assess alternative parameters for genetic diversity estimates. In this work we evaluated molecular and studbook data for captive populations of black-lion-tamarin (BLT), an endangered primate endemic to Brazil's Atlantic Forest. Pedigree analyses were performed using BLT studbook information collected from 1973 to 2018. We analyzed, separately, the whole captive population since its foundation; the current captive population (CCP); and all extant BLTs in the Brazilian captive population (BCP). Microsatellite analyses were performed only on the BCP individuals from the eighth generation (BCP-F8), to avoid generation overlap. The expected heterozygosity for BCP-F8, using molecular data, was 0.45, and the initial expected heterozygosity was 0.69. Kinship parameters showed high genetic relationships in both pedigree and molecular analyses. The genealogy-based endogamy evidenced a high inbreeding coefficient, while the molecular analyses suggested a non-inbreeding signature. The Mate Suitability Index showed detrimental values for the majority of potential pairs in the CCP. Nevertheless, some individuals evidenced high individual heterozygosity and allele representation, demonstrating good potential to be used as breeders. Thus, we propose the use of molecular data as a complementary parameter to evaluate mating pairs and to aid management decision-making.
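For readers unfamiliar with the statistic, expected heterozygosity at a locus is commonly computed as one minus the sum of squared allele frequencies, then averaged across loci. The sketch below uses that textbook formula without a small-sample correction; the abstract does not state which estimator the authors used. The locus names and allele sizes are hypothetical and are not taken from the study.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2), where p_i are allele frequencies at one locus."""
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())


# Hypothetical microsatellite allele sizes: one list of allele copies per locus.
loci = {
    "locus_1": [120, 120, 124, 124],  # two alleles at equal frequency
    "locus_2": [200, 200, 200, 200],  # monomorphic locus
}

per_locus = {name: expected_heterozygosity(a) for name, a in loci.items()}
mean_he = sum(per_locus.values()) / len(per_locus)
print(per_locus, mean_he)  # locus_1 -> 0.5, locus_2 -> 0.0, mean -> 0.25
```

A value near zero indicates that most allele copies are identical, which is why a drop from 0.69 to 0.45 is read as a loss of diversity.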


Subject(s)
Endangered Species , Genetic Variation , Population Genetics/methods , Leontopithecus/genetics , Animals , Zoo Animals , Brazil , Breeding , Conservation of Natural Resources/methods , Conservation of Natural Resources/statistics & numerical data , Female , Forests , Population Genetics/statistics & numerical data , Genotype , Heterozygote , Male , Pedigree , Population Dynamics