Results 1 - 7 of 7
1.
Elife ; 12, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37342968

ABSTRACT

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.


Subject(s)
Genome , Software , Computer Simulation , Genetics, Population , Genomics
2.
Nat Commun ; 14(1): 3377, 2023 06 08.
Article in English | MEDLINE | ID: mdl-37291107

ABSTRACT

The benefits of large-scale genetic studies for healthcare of the populations studied are well documented, but these genetic studies have traditionally ignored people from some parts of the world, such as South Asia. Here we describe whole genome sequence (WGS) data from 4806 individuals recruited from the healthcare delivery systems of Pakistan, India and Bangladesh, combined with WGS from 927 individuals from isolated South Asian populations. We characterize population structure in South Asia and describe a genotyping array (SARGAM) and imputation reference panel that are optimized for South Asian genomes. We find evidence for high rates of reproductive isolation, endogamy and consanguinity that vary across the subcontinent and that lead to levels of rare homozygotes that reach 100 times that seen in outbred populations. Founder effects increase the power to associate functional variants with disease processes and make South Asia a uniquely powerful place for population-scale genetic studies.


Subject(s)
Asian People , Founder Effect , Humans , Asian People/genetics , Bangladesh , Homozygote , India , Pakistan , South Asian People
3.
NPJ Genom Med ; 6(1): 10, 2021 Feb 11.
Article in English | MEDLINE | ID: mdl-33574314

ABSTRACT

Personalized medical care focuses on prediction of disease risk and response to medications. To build the risk models, access to both large-scale genomic resources and human genetic studies is required. The Taiwan Biobank (TWB) has generated high-coverage, whole-genome sequencing data from 1492 individuals and genome-wide SNP data from 103,106 individuals of Han Chinese ancestry using custom SNP arrays. Principal components analysis of the genotyping data showed that the full range of Han Chinese genetic variation was found in the cohort. The arrays also include thousands of known functional variants, allowing for simultaneous ascertainment of Mendelian disease-causing mutations and variants that affect drug metabolism. We found that 21.2% of the population are mutation carriers of autosomal recessive diseases, 3.1% have mutations in cancer-predisposing genes, and 87.3% carry variants that affect drug response. We highlight how TWB data provide insight into both population history and disease burden, while showing how widespread genetic testing can be used to improve clinical care.

4.
BMC Genomics ; 20(1): 620, 2019 Aug 16.
Article in English | MEDLINE | ID: mdl-31416423

ABSTRACT

BACKGROUND: Data from the 1000 Genomes project is quite often used as a reference for human genomic analysis. However, its accuracy needs to be assessed to understand the quality of predictions made using this reference. We present here an assessment of the genotyping, phasing, and imputation accuracy data in the 1000 Genomes project. We compare the phased haplotype calls from the 1000 Genomes project to experimentally phased haplotypes for 28 of the same individuals sequenced using the 10X Genomics platform. RESULTS: We observe that phasing and imputation for rare variants are unreliable, which likely reflects the limited sample size of the 1000 Genomes project data. Further, it appears that using a population specific reference panel does not improve the accuracy of imputation over using the entire 1000 Genomes data set as a reference panel. We also note that the error rates and trends depend on the choice of definition of error, and hence any error reporting needs to take these definitions into account. CONCLUSIONS: The quality of the 1000 Genomes data needs to be considered while using this database for further studies. This work presents an analysis that can be used for these assessments.


Subject(s)
Genome, Human/genetics , Haplotypes/genetics , Racial Groups/genetics , Gene Frequency/genetics , High-Throughput Nucleotide Sequencing , Human Genome Project , Humans , Polymorphism, Single Nucleotide , Racial Groups/ethnology , Scientific Experimental Error
5.
Genome Res ; 29(5): 848-856, 2019 05.
Article in English | MEDLINE | ID: mdl-30926611

ABSTRACT

Baboons (genus Papio) are broadly studied in the wild and in captivity. They are widely used as a nonhuman primate model for biomedical studies, and the Southwest National Primate Research Center (SNPRC) at Texas Biomedical Research Institute has maintained a large captive baboon colony for more than 50 yr. Unlike other model organisms, however, the genomic resources for baboons are severely lacking. This has hindered the progress of studies using baboons as a model for basic biology or human disease. Here, we describe a data set of 100 high-coverage whole-genome sequences obtained from the mixed colony of olive (P. anubis) and yellow (P. cynocephalus) baboons housed at the SNPRC. These data provide a comprehensive catalog of common genetic variation in baboons, as well as a fine-scale genetic map. We show how the data can be used to learn about ancestry and admixture and to correct errors in the colony records. Finally, we investigated the consequences of inbreeding within the SNPRC colony and found clear evidence for increased rates of infant mortality and increased homozygosity of putatively deleterious alleles in inbred individuals.


Subject(s)
Papio anubis/genetics , Papio cynocephalus/genetics , Alleles , Animals , Female , Genetic Variation , Genotype , Inbreeding , Male , Recombination, Genetic , Whole Genome Sequencing
6.
J Phys Chem B ; 122(21): 5300-5307, 2018 05 31.
Article in English | MEDLINE | ID: mdl-28899094

ABSTRACT

We analyze the role of solvation for enzymatic catalysis in two distinct, artificially designed Kemp Eliminases, KE07 and KE70, and mutated variants that were optimized by laboratory directed evolution. Using a spatially resolved analysis of hydration patterns, intermolecular vibrations, and local solvent entropies, we identify distinct classes of hydration water and follow their changes upon substrate binding and transition state formation for the designed KE07 and KE70 enzymes and their evolved variants. We observe that differences in hydration of the enzymatic systems are concentrated in the active site and undergo significant changes during substrate recruitment. For KE07, directed evolution reduces variations in the hydration of the polar catalytic center upon substrate binding, preserving strong protein-water interactions, while the evolved enzyme variant of KE70 features a more hydrophobic reaction center for which the expulsion of low-entropy water molecules upon substrate binding is substantially enhanced. While our analysis indicates a system-dependent role of solvation for the substrate binding process, we identify more subtle changes in solvation for the transition state formation, which are less affected by directed evolution.

7.
Phys Chem Chem Phys ; 19(7): 5579-5590, 2017 Feb 15.
Article in English | MEDLINE | ID: mdl-28165073

ABSTRACT

We have used the AMOEBA model to simulate the THz spectra of two zwitterionic amino acids in aqueous solution, which is compared to the results on these same systems using ab initio molecular dynamics (AIMD) simulations. Overall we find that the polarizable force field shows promising agreement with AIMD data for both glycine and valine in water. This includes the THz spectral assignments and the mode-specific spectral decomposition into intramolecular solute motions as well as distinct solute-water cross-correlation modes some of which cannot be captured by non-polarizable force fields that rely on fixed partial charges. This bodes well for future studies for simulating and decomposing the THz spectra for larger solutes such as proteins or polymers for which AIMD studies are presently intractable. Furthermore, we believe that the current study on rather simple aqueous solutions offers a way to systematically investigate the importance of charge transfer, nuclear quantum effects, and the validity of computationally practical density functionals, all of which are needed to fully quantitatively capture complex dynamical motions in the condensed phase.


Subject(s)
Amino Acids/chemistry , Terahertz Spectroscopy , Water/chemistry , Molecular Dynamics Simulation