1.
PLoS Genet ; 20(3): e1011144, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38507461

ABSTRACT

Across the human genome, there are large-scale fluctuations in genetic diversity caused by the indirect effects of selection. This "linked selection signal" reflects the impact of selection according to the physical placement of functional regions and recombination rates along chromosomes. Previous work has shown that purifying selection acting against the steady influx of new deleterious mutations at functional portions of the genome shapes patterns of genomic variation. To date, statistical efforts to estimate purifying selection parameters from linked selection models have relied on classic Background Selection theory, which is only applicable when new mutations are so deleterious that they cannot fix in the population. Here, we develop a statistical method based on a quantitative genetics view of linked selection that models how polygenic additive fitness variance distributed along the genome increases the rate of stochastic allele frequency change. By jointly predicting the equilibrium fitness variance and substitution rate due to both strongly and weakly deleterious mutations, we estimate the distribution of fitness effects (DFE) and mutation rate across three geographically distinct human samples. While our model can accommodate weaker selection, we find evidence of strong selection operating similarly across all human samples. Although our quantitative genetic model of linked selection fits better than previous models, substitution rates of the most constrained sites disagree with observed divergence levels. We find that a model incorporating selective interference better predicts observed divergence in conserved regions, but overall our results suggest uncertainty remains about the processes generating fitness variation in humans.


Subject(s)
Genetic Models, Genetic Selection, Humans, Molecular Evolution, Gene Frequency/genetics, Mutation, Human Genome/genetics, Genetic Variation, Genetic Fitness
2.
Bioinformatics ; 40(1), 2024 01 02.
Article in English | MEDLINE | ID: mdl-38180848

ABSTRACT

MOTIVATION: Managing data and code in open scientific research is complicated by two key problems: large datasets often cannot be stored alongside code in repository platforms like GitHub, and iterative analysis can lead to unnoticed changes to data, increasing the risk that analyses are based on older versions of data. RESULTS: SciDataFlow is a fast, concurrent command-line tool paired with a simple Data Manifest specification that streamlines tracking data changes, uploading data to remote repositories, and pulling in all data necessary to reproduce a computational analysis. AVAILABILITY AND IMPLEMENTATION: SciDataFlow is available at https://github.com/vsbuffalo/scidataflow.


Subject(s)
Data Management, Software
3.
Proc Natl Acad Sci U S A ; 119(38): e2201521119, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36095205

ABSTRACT

Metazoan adaptation to global change relies on selection of standing genetic variation. Determining the extent to which this variation exists in natural populations, particularly for responses to simultaneous stressors, is essential to make accurate predictions for persistence in future conditions. Here, we identified the genetic variation enabling the copepod Acartia tonsa to adapt to experimental ocean warming, acidification, and combined ocean warming and acidification (OWA) over 25 generations of continual selection. Replicate populations showed a consistent polygenic response to each condition, targeting an array of adaptive mechanisms including cellular homeostasis, development, and stress response. We used a genome-wide covariance approach to partition the allelic changes into three categories: selection, drift and replicate-specific selection, and laboratory adaptation responses. The majority of allele frequency change in warming (57%) and OWA (63%) was driven by shared selection pressures across replicates, but this effect was weaker under acidification alone (20%). OWA and warming shared 37% of their response to selection but OWA and acidification shared just 1%, indicating that warming is the dominant driver of selection in OWA. Despite the dominance of warming, the interaction with acidification was still critical as the OWA selection response was highly synergistic with 47% of the allelic selection response unique from either individual treatment. These results disentangle how genomic targets of selection differ between single and multiple stressors and demonstrate the complexity that nonadditive multiple stressors will contribute to predictions of adaptation to complex environmental shifts caused by global change.


Subject(s)
Physiological Adaptation, Copepoda, Acids/chemistry, Physiological Adaptation/genetics, Animals, Copepoda/genetics, Copepoda/physiology, Genomics, Global Warming, Homeostasis, Hydrogen-Ion Concentration, Oceans and Seas
4.
Elife ; 10, 2021 08 19.
Article in English | MEDLINE | ID: mdl-34409937

ABSTRACT

Neutral theory predicts that genetic diversity increases with population size, yet observed levels of diversity across metazoans vary only two orders of magnitude while population sizes vary over several. This unexpectedly narrow range of diversity is known as Lewontin's Paradox of Variation (1974). While some have suggested selection constrains diversity, tests of this hypothesis seem to fall short. Here, I revisit Lewontin's Paradox to assess whether current models of linked selection are capable of reducing diversity to this extent. To quantify the discrepancy between pairwise diversity and census population sizes across species, I combine previously-published estimates of pairwise diversity from 172 metazoan taxa with newly derived estimates of census sizes. Using phylogenetic comparative methods, I show this relationship is significant accounting for phylogeny, but with high phylogenetic signal and evidence that some lineages experience shifts in the evolutionary rate of diversity deep in the past. Additionally, I find a negative relationship between recombination map length and census size, suggesting abundant species have less recombination and experience greater reductions in diversity due to linked selection. However, I show that even assuming strong and abundant selection, models of linked selection are unlikely to explain the observed relationship between diversity and census sizes across species.
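
To make the neutral expectation behind this abstract concrete, here is a minimal sketch using the standard diploid neutral result that expected pairwise diversity is roughly π = 4Nμ (infinite-sites approximation). The function name, the mutation rate, and the population sizes are hypothetical values chosen for illustration; none of them come from the article.

```python
import math


def neutral_diversity(pop_size: float, mutation_rate: float) -> float:
    """Expected pairwise diversity under neutrality for diploids: pi = 4 * N * mu."""
    return 4.0 * pop_size * mutation_rate


# Hypothetical per-site mutation rate and population sizes (illustrative only).
mu = 1e-9
sizes = [1e4, 1e6, 1e8]

# Under strict neutrality, diversity scales linearly with population size.
pis = [neutral_diversity(n, mu) for n in sizes]

# Orders of magnitude spanned by the predicted diversity values.
span = math.log10(pis[-1] / pis[0])
```

In this toy setting the predicted diversity spans as many orders of magnitude as the population sizes do, which is the linear scaling the abstract contrasts with the much narrower range observed across metazoans.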


Subject(s)
Genetic Variation, Population Density, Animals, Biological Evolution, Population Genetics, Phylogeny, Genetic Selection
5.
Proc Natl Acad Sci U S A ; 117(34): 20672-20680, 2020 08 25.
Article in English | MEDLINE | ID: mdl-32817464

ABSTRACT

Rapid phenotypic adaptation is often observed in natural populations and selection experiments. However, detecting the genome-wide impact of this selection is difficult since adaptation often proceeds from standing variation and selection on polygenic traits, both of which may leave faint genomic signals indistinguishable from a noisy background of genetic drift. One promising signal comes from the genome-wide covariance between allele frequency changes observable from temporal genomic data (e.g., evolve-and-resequence studies). These temporal covariances reflect how heritable fitness variation in the population leads changes in allele frequencies at one time point to be predictive of the changes at later time points, as alleles are indirectly selected due to remaining associations with selected alleles. Since genetic drift does not lead to temporal covariance, we can use these covariances to estimate what fraction of the variation in allele frequency change through time is driven by linked selection. Here, we reanalyze three selection experiments to quantify the effects of linked selection over short timescales using covariance among time points and across replicates. We estimate that at least 17 to 37% of allele frequency change is driven by selection in these experiments. Against this background of positive genome-wide temporal covariances, we also identify signals of negative temporal covariance corresponding to reversals in the direction of selection for a reasonable proportion of loci over the time course of a selection experiment. Overall, we find that in the three studies we analyzed, linked selection has a large impact on short-term allele frequency dynamics that is readily distinguishable from genetic drift.
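
The temporal covariance described in this abstract can be illustrated with a small sketch. It assumes a matrix of allele frequencies with one row per sampled time point and one column per locus, takes the change between consecutive time points, and computes the covariance of those changes across loci. The function names and the toy frequencies are hypothetical, and the sketch omits the sampling-noise corrections that a real analysis would need. The off-diagonal fraction at the end is one simple summary in the spirit of the abstract; treating it as equivalent to the authors' estimator is an assumption.

```python
import numpy as np


def temporal_covariances(freqs: np.ndarray) -> np.ndarray:
    """Covariance matrix of allele frequency changes across loci.

    freqs has shape (timepoints, loci) and needs at least three time points.
    Entry (s, t) is the covariance, over loci, between the change in
    interval s and the change in interval t.
    """
    deltas = np.diff(freqs, axis=0)  # shape: (timepoints - 1, loci)
    return np.cov(deltas)            # rows (intervals) are the variables


def offdiagonal_fraction(cov: np.ndarray) -> float:
    """Share of the variance in total frequency change carried by covariances.

    The sum of all entries equals the variance of the summed changes, so the
    off-diagonal part is the sum minus the trace.
    """
    total = cov.sum()
    return float((total - np.trace(cov)) / total)


# Toy frequencies (hypothetical): three time points, four loci.
freqs = np.array([
    [0.10, 0.50, 0.30, 0.80],
    [0.15, 0.45, 0.35, 0.75],
    [0.20, 0.40, 0.40, 0.70],
])

cov = temporal_covariances(freqs)
fraction = offdiagonal_fraction(cov)
```

In the toy data every locus keeps moving in the same direction across both intervals, so the covariance between intervals is positive. Under pure drift the expected off-diagonal entries would be zero.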


Subject(s)
Biological Adaptation/genetics, Gene Frequency/genetics, Genetic Selection/genetics, Acclimatization/genetics, Physiological Adaptation/genetics, Alleles, Animals, Biological Evolution, Molecular Evolution, Gene Frequency/physiology, Genetic Drift, Population Genetics/methods, Genomics/methods, Humans, Genetic Models, Multifactorial Inheritance/genetics, Population Density
6.
Genetics ; 213(3): 1007-1045, 2019 11.
Article in English | MEDLINE | ID: mdl-31558582

ABSTRACT

The majority of empirical population genetic studies have tried to understand the evolutionary processes that have shaped genetic variation in a single sample taken from a present-day population. However, genomic data collected over tens of generations in both natural and laboratory populations are increasingly used to find selected loci underpinning adaptation over these short timescales. Although these studies have been quite successful in detecting selection on large-effect loci, the fitness differences between individuals are often polygenic, such that selection leads to allele frequency changes that are difficult to distinguish from genetic drift. However, one promising signal comes from polygenic selection's effect on neutral sites that become stochastically associated with the genetic backgrounds that lead to fitness differences between individuals. Previous theoretical work has established that the random associations between a neutral allele and heritable fitness backgrounds act to reduce the effective population size experienced by this neutral allele. These associations perturb neutral allele frequency trajectories, creating autocovariance in the allele frequency changes across generations. Here, we show how temporal genomic data allow us to measure the temporal autocovariance in allele frequency changes and characterize the genome-wide impact of polygenic selection. We develop expressions for these temporal autocovariances, showing that their magnitude is determined by the level of additive genetic variation, recombination, and linkage disequilibria in a region. Furthermore, by using analytic expressions for the temporal variances and autocovariances in allele frequency, we demonstrate that one can estimate the additive genetic variation for fitness and the drift-effective population size from temporal genomic data. We also show how the proportion of total variation in allele frequency change due to linked selection can be estimated from temporal data. Overall, we demonstrate that temporal genomic data offer opportunities to identify the role of linked selection on genome-wide diversity over short timescales, and can help bridge population genetic and quantitative genetic studies of adaptation.


Subject(s)
Molecular Evolution, Genetic Models, Multifactorial Inheritance, Genetic Selection
7.
Gigascience ; 7(4): 1-12, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29300887

ABSTRACT

Background: Characterization of genetic variations in maize has been challenging, mainly due to deterioration of collinearity between individual genomes in the species. An international consortium of maize research groups combined resources to develop the maize haplotype version 3 (HapMap 3), built from whole-genome sequencing data from 1218 maize lines, covering predomestication and domesticated Zea mays varieties across the world. Results: A new computational pipeline was set up to process more than 12 trillion bp of sequencing data, and a set of population genetics filters was applied to identify more than 83 million variant sites. Conclusions: We identified polymorphisms in regions where collinearity is largely preserved in the maize species. However, the fact that the B73 genome used as the reference only represents a fraction of all haplotypes is still an important limiting factor.


Subject(s)
Plant Genome, Haplotypes, Zea mays/genetics, Genetic Variation
8.
Genetics ; 204(1): 57-75, 2016 09.
Article in English | MEDLINE | ID: mdl-27356612

ABSTRACT

Close relatives can share large segments of their genome identical by descent (IBD) that can be identified in genome-wide polymorphism data sets. There are a range of methods to use these IBD segments to identify relatives and estimate their relationship. These methods have focused on sharing on the autosomes, as they provide a rich source of information about genealogical relationships. We hope to learn additional information about recent ancestry through shared IBD segments on the X chromosome, but currently lack the theoretical framework to use this information fully. Here, we fill this gap by developing probability distributions for the number and length of X chromosome segments shared IBD between an individual and an ancestor k generations back, as well as between half- and full-cousin relationships. Due to the inheritance pattern of the X and the fact that X homologous recombination occurs only in females (outside of the pseudoautosomal regions), the number of females along a genealogical lineage is a key quantity for understanding the number and length of the IBD segments shared among relatives. When inferring relationships among individuals, the number of female ancestors along a genealogical lineage will often be unknown. Therefore, our IBD segment length and number distributions marginalize over this unknown number of recombinational meioses through a distribution of recombinational meioses we derive. By using Bayes' theorem to invert these distributions, we can estimate the number of female ancestors between two relatives, giving us details about the genealogical relations between individuals not possible with autosomal data alone.


Subject(s)
Human X Chromosomes, Inheritance Patterns, Bayes Theorem, Human X Chromosomes/genetics, Female, Genealogy and Heraldry, Genetic Variation, Population Genetics/methods, Population Genetics/statistics & numerical data, Genome-Wide Association Study, Humans, Male, Genetic Models, Pedigree
9.
BMC Plant Biol ; 14: 368, 2014 Dec 19.
Article in English | MEDLINE | ID: mdl-25524236

ABSTRACT

BACKGROUND: During wheat senescence, leaf components are degraded in a coordinated manner, releasing amino acids and micronutrients which are subsequently transported to the developing grain. We have previously shown that the simultaneous downregulation of Grain Protein Content (GPC) transcription factors, GPC1 and GPC2, greatly delays senescence and disrupts nutrient remobilization, and therefore provides a valuable entry point to identify genes involved in micronutrient transport to the wheat grain. RESULTS: We generated loss-of-function mutations for GPC1 and GPC2 in tetraploid wheat and showed in field trials that gpc1 mutants exhibit significant delays in senescence and reductions in grain Zn and Fe content, but that mutations in GPC2 had no significant effect on these traits. An RNA-seq study of these mutants at different time points showed a larger proportion of senescence-regulated genes among the GPC1 (64%) than among the GPC2 (37%) regulated genes. Combined, the two GPC genes regulate a subset (21.2%) of the senescence-regulated genes, 76.1% of which are upregulated at 12 days after anthesis, before the appearance of any visible signs of senescence. Taken together, these results demonstrate that GPC1 is a key regulator of nutrient remobilization which acts predominantly during the early stages of senescence. Genes upregulated at this stage include transporters from the ZIP and YSL gene families, which facilitate Zn and Fe export from the cytoplasm to the phloem, and genes involved in the biosynthesis of chelators that facilitate the phloem-based transport of these nutrients to the grains. CONCLUSIONS: This study provides an overview of the transport mechanisms activated in the wheat flag leaf during monocarpic senescence. It also identifies promising targets to improve nutrient remobilization to the wheat grain, which can help mitigate Zn and Fe deficiencies that afflict many regions of the developing world.


Subject(s)
Developmental Gene Expression Regulation, Plant Gene Expression Regulation, Membrane Transport Proteins/genetics, Plant Leaves/genetics, Plant Proteins/genetics, Triticum/genetics, Base Sequence, Iron/metabolism, Membrane Transport Proteins/metabolism, Molecular Sequence Data, Phylogeny, Plant Leaves/growth & development, Plant Proteins/metabolism, Plant RNA/genetics, Plant RNA/metabolism, Triticum/growth & development, Triticum/metabolism, Zinc/metabolism
10.
Genome Biol ; 14(6): R66, 2013 Jun 25.
Article in English | MEDLINE | ID: mdl-23800085

ABSTRACT

BACKGROUND: The high level of identity among duplicated homoeologous genomes in tetraploid pasta wheat presents substantial challenges for de novo transcriptome assembly. To solve this problem, we develop a specialized bioinformatics workflow that optimizes transcriptome assembly and separation of merged homoeologs. To evaluate our strategy, we sequence and assemble the transcriptome of one of the diploid ancestors of pasta wheat, and compare both assemblies with a benchmark set of 13,472 full-length, non-redundant bread wheat cDNAs. RESULTS: A total of 489 million 100 bp paired-end reads from tetraploid wheat assemble into 140,118 contigs, including 96% of the benchmark cDNAs. We used a comparative genomics approach to annotate 66,633 open reading frames. The multiple k-mer assembly strategy increases the proportion of cDNAs assembled full-length in a single contig by 22% relative to the best single k-mer size. Homoeologs are separated using a post-assembly pipeline that includes polymorphism identification, phasing of SNPs, read sorting, and re-assembly of phased reads. Using a reference set of genes, we determine that 98.7% of SNPs analyzed are correctly separated by phasing. CONCLUSIONS: Our study shows that de novo transcriptome assembly of tetraploid wheat benefits from multiple k-mer assembly strategies more than diploid wheat. Our results also demonstrate that phasing approaches originally designed for heterozygous diploid organisms can be used to separate the close homoeologous genomes of tetraploid wheat. The predicted tetraploid wheat proteome and gene models provide a valuable tool for the wheat research community and for those interested in comparative genomic studies.


Subject(s)
Plant Chromosomes/chemistry, Contig Mapping/methods, Plant Genes, Plant Genome, Transcriptome, Triticum/genetics, Base Sequence, Genetic Markers, Genetic Models, Molecular Sequence Data, Open Reading Frames, Phylogeny, Single Nucleotide Polymorphism, Pseudogenes, Tetraploidy, Triticum/classification
11.
Genomics ; 101(1): 30-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22982528

ABSTRACT

We genotyped a Chinese and an Indian-origin rhesus macaque using the Affymetrix Genome-Wide Human SNP Array 6.0 and cataloged 85,473 uniquely mapping heterospecific SNPs. These SNPs were assigned to rhesus chromosomes according to their probe sequence alignments as displayed in the human and rhesus reference sequences. The conserved gene order (synteny) revealed by heterospecific SNP maps is in concordance with that of the published human and rhesus macaque genomes. Using these SNPs' original human rs numbers, we identified 12,328 genes annotated in humans that are associated with these SNPs, 3674 of which were found in at least one of the two rhesus macaques studied. Due to their density, the heterospecific SNPs allow fine-grained comparisons, including approximate boundaries of intra- and extra-chromosomal rearrangements involving gene orthologs, which can be used to distinguish rhesus macaque chromosomes from human chromosomes.


Subject(s)
Genes, Macaca/genetics, Single Nucleotide Polymorphism, Animals, Base Sequence, Chromosome Mapping/methods, DNA/chemistry, DNA/genetics, DNA Probes, Human Genome, Humans, Sequence Alignment, Synteny
12.
Genome Res ; 21(12): 2224-41, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21926179

ABSTRACT

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.


Subject(s)
Genome/physiology, Genomics/methods, DNA Sequence Analysis/methods