Results 1 - 20 of 37
1.
Theor Appl Genet ; 137(5): 108, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637355

ABSTRACT

KEY MESSAGE: The integration of genomic prediction with crop growth models enabled the estimation of missing environmental variables which improved the prediction accuracy of grain yield. Since the invention of whole-genome prediction (WGP) more than two decades ago, breeding programmes have established extensive reference populations that are cultivated under diverse environmental conditions. The introduction of the CGM-WGP model, which integrates crop growth models (CGM) with WGP, has expanded the applications of WGP to the prediction of unphenotyped traits in untested environments, including future climates. However, CGMs require multiple seasonal environmental records, unlike WGP, which makes CGM-WGP less accurate when applied to historical reference populations that lack crucial environmental inputs. Here, we investigated the ability of CGM-WGP to approximate missing environmental variables to improve prediction accuracy. Two environmental variables in a wheat CGM, initial soil water content (InitlSoilWCont) and initial nitrate profile, were sampled from different normal distributions separately or jointly in each iteration within the CGM-WGP algorithm. Our results showed that sampling InitlSoilWCont alone gave the best results and improved the prediction accuracy of grain number by 0.07, yield by 0.06 and protein content by 0.03. When using the sampled InitlSoilWCont values as an input for the traditional CGM, the average narrow-sense heritability of the genotype-specific parameters (GSPs) improved by 0.05, with GNSlope, PreAnthRes, and VernSen showing the greatest improvements. Moreover, the root mean square of errors for grain number and yield was reduced by about 7% for CGM and 31% for CGM-WGP when using the sampled InitlSoilWCont values. 
Our results demonstrate the advantage of sampling missing environmental variables in CGM-WGP to improve prediction accuracy and increase the size of the reference population by enabling the utilisation of historical data that are missing environmental records.


Subject(s)
Plant Breeding; Triticum; Triticum/genetics; Genome; Genomics/methods; Genotype; Phenotype; Edible Grain/genetics; Models, Genetic
2.
Genet Sel Evol ; 56(1): 42, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844868

ABSTRACT

BACKGROUND: Female fertility is an important trait in dairy cattle. Identifying putative causal variants associated with fertility may help to improve the accuracy of genomic prediction of fertility. Combining expression data (eQTL) of genes, exons, gene splicing and allele specific expression is a promising approach to fine map QTL to get closer to the causal mutations. Another approach is to identify genomic differences between cows selected for high and low fertility and a selection experiment in New Zealand has created exactly this resource. Our objective was to combine multiple types of expression data, fertility traits and allele frequency in high- (POS) and low-fertility (NEG) cows with a genome-wide association study (GWAS) on calving interval in Australian cows to fine-map QTL associated with fertility in both Australia and New Zealand dairy cattle populations. RESULTS: Variants that were significantly associated with calving interval (CI) were strongly enriched for variants associated with gene, exon, gene splicing and allele-specific expression, indicating that there is substantial overlap between QTL associated with CI and eQTL. We identified 671 genes with significant differential expression between POS and NEG cows, with the largest fold change detected for the CCDC196 gene on chromosome 10. Our results provide numerous candidate genes associated with female fertility in dairy cattle, including GYS2 and TIGAR on chromosome 5 and SYT3 and HSD17B14 on chromosome 18. Multiple QTL regions were located in regions with large numbers of copy number variants (CNV). To identify the causal mutations for these variants, long read sequencing may be useful. CONCLUSIONS: Variants that were significantly associated with CI were highly enriched for eQTL. We detected 671 genes that were differentially expressed between POS and NEG cows. Several QTL detected for CI overlapped with eQTL, providing candidate genes for fertility in dairy cattle.


Subject(s)
Fertility; Genome-Wide Association Study; Quantitative Trait Loci; Animals; Cattle/genetics; Fertility/genetics; Female; Genome-Wide Association Study/veterinary; Polymorphism, Single Nucleotide; Chromosome Mapping; Gene Frequency
3.
Genet Sel Evol ; 54(1): 37, 2022 Jun 02.
Article in English | MEDLINE | ID: mdl-35655152

ABSTRACT

BACKGROUND: Meta-analysis describes a category of statistical methods that aim at combining the results of multiple studies to increase statistical power by exploiting summary statistics. Different industries that use genomic prediction do not share their raw data due to logistic or privacy restrictions, which can limit the size of their reference populations and creates a need for a practical meta-analysis method. RESULTS: We developed a meta-analysis, named MetaGS, that duplicates the results of multi-trait best linear unbiased prediction (mBLUP) analysis without accessing raw data. MetaGS exploits the correlations among different populations to produce more accurate population-specific single nucleotide polymorphism (SNP) effects. The method improves SNP effect estimations for a given population depending on its relations to other populations. MetaGS was tested on milk, fat and protein yield data of Australian Holstein and Jersey cattle and it generated very similar genomic estimated breeding values to those produced using the mBLUP method for all traits in both breeds. One of the major difficulties when combining SNP effects across populations is the use of different variants for the populations, which limits the applications of meta-analysis in practice. We solved this issue by developing a method to impute missing summary statistics without using raw data. Our results showed that imputing summary statistics can be done with high accuracy (r > 0.9) even when more than 70% of the SNPs were missing with a minimal effect on prediction accuracy. CONCLUSIONS: We demonstrated that MetaGS can replace the mBLUP model when raw data cannot be shared, which can lead to more flexible collaborations compared to the single-trait BLUP model.


Subject(s)
Genomics; Polymorphism, Single Nucleotide; Animals; Australia; Cattle/genetics; Genome; Genomics/methods; Phenotype
4.
Genet Sel Evol ; 53(1): 19, 2021 Feb 26.
Article in English | MEDLINE | ID: mdl-33637049

ABSTRACT

BACKGROUND: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method, that could handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions and assess its quantitative trait locus (QTL) mapping precision. METHODS: The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis-Hastings sampling that directs computations to the SNPs, which are, a priori, most likely to be included into the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. RESULTS: The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. 
Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. CONCLUSIONS: Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.


Subject(s)
Breeding/methods; Cattle/genetics; Genome-Wide Association Study/methods; Quantitative Trait Loci; Whole Genome Sequencing/methods; Animals; Female; Male; Meat Products/standards; Quantitative Trait, Heritable
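
The abstract of entry 4 notes that memory requirements were reduced by storing the genotype matrix in binary form, without giving the layout. The following is a minimal sketch of one common approach, assuming genotypes are coded as allele counts 0, 1 or 2 and that code 3 marks a missing call; the 2-bit layout and the function names `pack_genotypes` and `unpack_genotypes` are illustrative assumptions, not the authors' implementation.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1, 2, or 3 for missing) at 2 bits each.

    Hypothetical layout: four genotypes per byte, lowest bits first.
    """
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2, 3):
            raise ValueError(f"genotype code out of range: {g}")
        # Shift the 2-bit code into its slot within the byte.
        packed[i // 4] |= g << (2 * (i % 4))
    return packed


def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed bytearray."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


example = [0, 1, 2, 2, 1, 0, 3]
packed_example = pack_genotypes(example)
restored_example = unpack_genotypes(packed_example, len(example))
```

At 2 bits per genotype, a matrix needs a quarter of the space of one byte per genotype, which is the kind of saving the abstract alludes to.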
5.
Genet Sel Evol ; 52(1): 37, 2020 Jul 07.
Article in English | MEDLINE | ID: mdl-32635893

ABSTRACT

BACKGROUND: Sequence-based genome-wide association studies (GWAS) provide high statistical power to identify candidate causal mutations when a large number of individuals with both sequence variant genotypes and phenotypes is available. A meta-analysis combines summary statistics from multiple GWAS and increases the power to detect trait-associated variants without requiring access to data at the individual level of the GWAS mapping cohorts. Because linkage disequilibrium between adjacent markers is conserved only over short distances across breeds, a multi-breed meta-analysis can improve mapping precision. RESULTS: To maximise the power to identify quantitative trait loci (QTL), we combined the results of nine within-population GWAS that used imputed sequence variant genotypes of 94,321 cattle from eight breeds, to perform a large-scale meta-analysis for fat and protein percentage in cattle. The meta-analysis detected (p ≤ 10⁻⁸) 138 QTL for fat percentage and 176 QTL for protein percentage. This was more than the number of QTL detected in all within-population GWAS together (124 QTL for fat percentage and 104 QTL for protein percentage). Among all the lead variants, 100 QTL for fat percentage and 114 QTL for protein percentage had the same direction of effect in all within-population GWAS. This indicates either persistence of the linkage phase between the causal variant and the lead variant across breeds or that some of the lead variants might indeed be causal or tightly linked with causal variants. The percentage of intergenic variants was substantially lower for significant variants than for non-significant variants, and significant variants had mostly moderate to high minor allele frequencies. Significant variants were also clustered in genes that are known to be relevant for fat and protein percentages in milk. CONCLUSIONS: Our study identified a large number of QTL associated with fat and protein percentage in dairy cattle. We demonstrated that large-scale multi-breed meta-analysis reveals more QTL at the nucleotide resolution than within-population GWAS. Significant variants were more often located in genic regions than non-significant variants and a large part of them was located in potentially regulatory regions.


Subject(s)
Cattle/genetics; Genotype; Linkage Disequilibrium; Lipids/genetics; Milk Proteins/genetics; Milk/standards; Animals; Gene Frequency; Milk/metabolism; Polymorphism, Genetic; Quantitative Trait Loci
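
The abstract of entry 5 describes combining summary statistics from several within-population GWAS and checking whether lead variants have the same direction of effect in every study. It does not say which meta-analysis estimator was used, so the sketch below assumes a standard inverse-variance-weighted fixed-effect model for a single variant; the function name `fixed_effect_meta` and the example numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled beta, pooled SE, z-score, two-sided p-value,
    flag for identical effect direction in all studies).
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    same_direction = all(b > 0 for b in betas) or all(b < 0 for b in betas)
    return beta, se, z, p, same_direction


# Hypothetical effects of one variant in two populations.
meta_beta, meta_se, meta_z, meta_p, meta_same = fixed_effect_meta(
    [0.10, 0.20], [0.1, 0.1]
)
```

A variant would then be declared significant by comparing the pooled p-value with the chosen threshold, for example the 10⁻⁸ level quoted in the abstract.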
6.
Genet Sel Evol ; 51(1): 61, 2019 Oct 29.
Article in English | MEDLINE | ID: mdl-31664896

ABSTRACT

BACKGROUND: Two distinct populations have been extensively studied in Atlantic cod (Gadus morhua L.): the Northeast Arctic cod (NEAC) population and the coastal cod (CC) population. The objectives of the current study were to identify genomic islands of divergence and to propose an approach to quantify the strength of selection pressures using whole-genome single nucleotide polymorphism (SNP) data. After applying filtering criteria, information on 93 animals (9 CC individuals, 50 NEAC animals and 34 CC × NEAC crossbred individuals) and 3,123,434 autosomal SNPs were used. RESULTS: Four genomic islands of divergence were identified on chromosomes 1, 2, 7 and 12, which were mapped accurately based on SNP data and which extended in size from 11 to 18 Mb. These regions differed considerably between the two populations although the differences in the rest of the genome were small due to considerable gene flow between the populations. The estimates of selection pressures showed that natural selection was substantially more important than genetic drift in shaping these genomic islands. Our data confirmed results from earlier publications that suggested that genomic islands are due to chromosomal rearrangements that are under strong selection and reduce recombination between rearranged and non-rearranged segments. CONCLUSIONS: Our findings further support the hypothesis that selection and reduced recombination in genomic islands may promote speciation between these two populations although their habitats overlap considerably and migrations occur between them.


Subject(s)
Gadus morhua/genetics; Genomic Islands; Polymorphism, Single Nucleotide; Selection, Genetic; Animals; Chromosomes/genetics; Gene Flow; Genetic Drift; Recombination, Genetic
7.
BMC Genomics ; 19(1): 395, 2018 May 24.
Article in English | MEDLINE | ID: mdl-29793448

ABSTRACT

BACKGROUND: Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. RESULTS: We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties of those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows' white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The linearly-furthermost, and most-significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value < 0.001). 
CONCLUSIONS: Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative regulatory variants in the bovine genome.


Subject(s)
CCCTC-Binding Factor/metabolism; Genomics; Nucleotide Motifs; Animals; Cattle; Protein Binding; Quantitative Trait Loci/genetics
8.
Genet Sel Evol ; 49(1): 70, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28934948

ABSTRACT

BACKGROUND: The increasing availability of whole-genome sequence data is expected to increase the accuracy of genomic prediction. However, results from simulation studies and analysis of real data do not always show an increase in accuracy from sequence data compared to high-density (HD) single nucleotide polymorphism (SNP) chip genotypes. In addition, the sheer number of variants makes analysis of all variants and accurate estimation of all effects computationally challenging. Our objective was to find a strategy to approximate the analysis of whole-sequence data with a Bayesian variable selection model. Using a simulated dataset, we applied a Bayes R hybrid model to analyse whole-sequence data, test the effect of dropping a proportion of variants during the analysis, and test how the analysis can be split into separate analyses per chromosome to reduce the elapsed computing time. We also investigated the effect of imputation errors on prediction accuracy. Subsequently, we applied the approach to a dataset that contained imputed sequences and records for production and fertility traits for 38,492 Holstein, Jersey, Australian Red and crossbred bulls and cows. RESULTS: With the simulated dataset, we found that prediction accuracy was highly increased for a breed that was not represented in the training population for sequence data compared to HD SNP data. Either dropping part of the variants during the analysis or splitting the analysis into separate analyses per chromosome decreased accuracy compared to analysing whole-sequence data. First, dropping variants from each chromosome and reanalysing the retained variants together resulted in an accuracy similar to that obtained when analysing whole-sequence data. Adding imputation errors decreased prediction accuracy, especially for errors in the validation population. With real data, using sequence variants resulted in accuracies that were similar to those obtained with the HD SNPs. 
CONCLUSIONS: We present an efficient approach to approximate analysis of whole-sequence data with a Bayesian variable selection model. The lack of increase in prediction accuracy when applied to real data could be due to imputation errors, which demonstrates the importance of developing more accurate methods of imputation or directly genotyping sequence variants that have a major effect in the prediction equation.


Subject(s)
Breeding; Cattle/genetics; Genomics/methods; Models, Genetic; Animals; Australia; Bayes Theorem; Databases, Genetic; Female; Genotype; Male; Polymorphism, Single Nucleotide
9.
Mol Biol Evol ; 30(9): 2209-23, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23842528

ABSTRACT

Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (Ne) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493-496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in Ne. The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the Ne around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with Ne of between 3,500 and 6,000. The most recent reduction of Ne to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals.


Subject(s)
Cattle; Genetics, Population; Genome; Homozygote; Phylogeny; Animals; Cattle/classification; Cattle/genetics; Female; Genetic Loci; Haplotypes; Linkage Disequilibrium; Male; Markov Chains; Models, Genetic; Polymorphism, Single Nucleotide; Population Density; Sequence Analysis, DNA; Time Factors
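
The abstract of entry 9 fits demography to the observed distribution of runs of homozygosity in a sequenced individual. As a rough illustration of the observed quantity only, the sketch below counts runs of consecutive homozygous sites from a per-site heterozygosity flag. Measuring runs in sites rather than base pairs, and the function name `runs_of_homozygosity`, are simplifying assumptions; the error correction and the LD predictor described in the abstract are not reproduced here.

```python
def runs_of_homozygosity(is_het):
    """Return the lengths of runs of consecutive homozygous sites.

    is_het: iterable of booleans, True where the site is heterozygous.
    A heterozygous site ends the current run and is not counted itself.
    """
    runs = []
    current = 0
    for het in is_het:
        if het:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


# Hypothetical stretch of nine sites.
example_sites = [False, False, True, False, True, True, False, False, False]
example_runs = runs_of_homozygosity(example_sites)
```

A histogram of these run lengths is the sort of summary that a demographic model would then be asked to match.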
10.
Hum Mol Genet ; 21(R1): R45-51, 2012 Oct 15.
Article in English | MEDLINE | ID: mdl-22899652

ABSTRACT

The genetic architecture of complex traits in cattle includes very large numbers of loci affecting any given trait. Most of these loci have small effects but occasionally there are loci with moderate-to-large effects segregating due to recent selection for the mutant allele. Genomic markers capture most but not all of the additive genetic variance for traits, probably because there are causal mutations with low allele frequency and therefore in incomplete linkage disequilibrium with the markers. The prediction of genetic value from genomic markers can achieve high accuracy by using statistical models that include all markers and assuming that marker effects are random variables drawn from a specified prior distribution. Recent effective population size is in the order of 100 within cattle breeds and ≈ 2500 animals with genotypes and phenotypes are sufficient to predict the genetic value of animals with an accuracy of 0.65. Recent effective population size for humans is much larger, in the order of 10,000-15,000, and more than 145,000 records would be required to reach a similar accuracy for people. However, our calculations assume that genomic markers capture all the genetic variance. This may be possible in the future as causal polymorphisms are genotyped using genome sequence data.


Subject(s)
Cattle/genetics; Chromosome Mapping; Inheritance Patterns; Quantitative Trait Loci; Quantitative Trait, Heritable; Animals; Genetic Markers; Genetic Variation; Genome; Genomics; Genotype; Linkage Disequilibrium; Mutation; Phenotype
11.
Genet Sel Evol ; 45: 43, 2013 Oct 29.
Article in English | MEDLINE | ID: mdl-24168700

ABSTRACT

BACKGROUND: The apparent effect of a single nucleotide polymorphism (SNP) on phenotype depends on the linkage disequilibrium (LD) between the SNP and a quantitative trait locus (QTL). However, the phase of LD between a SNP and a QTL may differ between Bos indicus and Bos taurus because they diverged at least one hundred thousand years ago. Here, we test the hypothesis that the apparent effect of a SNP on a quantitative trait depends on whether the SNP allele is inherited from a Bos taurus or Bos indicus ancestor. METHODS: Phenotype data on one or more traits and SNP genotype data for 10 181 cattle from Bos taurus, Bos indicus and composite breeds were used. All animals had genotypes for 729 068 SNPs (real or imputed). Chromosome segments were classified as originating from B. indicus or B. taurus on the basis of the haplotype of SNP alleles they contained. Consequently, SNP alleles were classified according to their sub-species origin. Three models were used for the association study: (1) conventional GWAS (genome-wide association study), fitting a single SNP effect regardless of subspecies origin, (2) interaction GWAS, fitting an interaction between SNP and subspecies-origin, and (3) best variable GWAS, fitting the most significant combination of SNP and sub-species origin. RESULTS: Fitting an interaction between SNP and subspecies origin resulted in more significant SNPs (i.e. more power) than a conventional GWAS. Thus, the effect of a SNP depends on the subspecies that the allele originates from. Also, most QTL segregated in only one subspecies, suggesting that many mutations that affect the traits studied occurred after divergence of the subspecies or the mutation became fixed or was lost in one of the subspecies. CONCLUSIONS: The results imply that GWAS and genomic selection could gain power by distinguishing SNP alleles based on their subspecies origin, and that only few QTL segregate in both B. indicus and B. taurus cattle. 
Thus, the QTL that segregate in current populations likely resulted from mutations that occurred in one of the subspecies and can have both positive and negative effects on the traits. There was no evidence that selection has increased the frequency of alleles that increase body weight.


Subject(s)
Cattle/classification; Cattle/genetics; Genome-Wide Association Study/methods; Quantitative Trait Loci; Alleles; Animals; Body Weight/genetics; Breeding; Chromosomes; Gene Frequency; Genetic Variation; Genome; Genotype; Growth/genetics; Haplotypes; Phenotype; Polymorphism, Single Nucleotide; Selection, Genetic; Species Specificity
12.
PLoS Genet ; 6(9): e1001139, 2010 Sep 23.
Article in English | MEDLINE | ID: mdl-20927186

ABSTRACT

Prediction of genetic merit using dense SNP genotypes can be used for estimation of breeding values for selection of livestock, crops, and forage species; for prediction of disease risk; and for forensics. The accuracy of these genomic predictions depends in part on the genetic architecture of the trait, in particular the number of loci affecting the trait and the distribution of their effects. Here we investigate the difference among three traits in distribution of effects and the consequences for the accuracy of genomic predictions. Proportion of black coat colour in Holstein cattle was used as one model complex trait. Three loci, KIT, MITF, and a locus on chromosome 8, together explain 24% of the variation of proportion of black. However, a surprisingly large number of loci of small effect are necessary to capture the remaining variation. A second trait, fat concentration in milk, had one locus of large effect and a host of loci with very small effects. Both these distributions of effects were in contrast to that for a third trait, an index of scores for a number of aspects of cow conformation ("overall type"), which had only loci of small effect. The differences in distribution of effects among the three traits were quantified by estimating the distribution of variance explained by chromosome segments containing 50 SNPs. This approach was taken to account for the imperfect linkage disequilibrium between the SNPs and the QTL affecting the traits. We also show that the accuracy of predicting genetic values is higher for traits with a proportion of large effects (proportion black and fat percentage) than for a trait with no loci of large effect (overall type), provided the method of analysis takes advantage of the distribution of loci effects.


Subject(s)
Cattle/genetics; Genome/genetics; Genomics/methods; Lipids/chemistry; Milk/chemistry; Quantitative Trait, Heritable; Skin Pigmentation/genetics; Animals; Breeding; Chromosomes, Mammalian/genetics; Genome-Wide Association Study; Male; Phenotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Reproducibility of Results
13.
PLoS Genet ; 6(6): e1000998, 2010 Jun 24.
Article in English | MEDLINE | ID: mdl-20585622

ABSTRACT

The increased transcription of the Cyp6g1 gene of Drosophila melanogaster, and consequent resistance to insecticides such as DDT, is a widely cited example of adaptation mediated by cis-regulatory change. A fragment of an Accord transposable element inserted upstream of the Cyp6g1 gene is causally associated with resistance and has spread to high frequencies in populations around the world since the 1940s. Here we report the existence of a natural allelic series at this locus of D. melanogaster, involving copy number variation of Cyp6g1, and two additional transposable element insertions (a P and an HMS-Beagle). We provide evidence that this genetic variation underpins phenotypic variation, as the more derived the allele, the greater the level of DDT resistance. Tracking the spatial and temporal patterns of allele frequency changes indicates that the multiple steps of the allelic series are adaptive. Further, a DDT association study shows that the most resistant allele, Cyp6g1-[BP], is greatly enriched in the top 5% of the phenotypic distribution and accounts for approximately 16% of the underlying phenotypic variation in resistance to DDT. In contrast, copy number variation for another candidate resistance gene, Cyp12d1, is not associated with resistance. Thus the Cyp6g1 locus is a major contributor to DDT resistance in field populations, and evolution at this locus features multiple adaptive steps occurring in rapid succession.


Subject(s)
Cytochrome P-450 Enzyme System/genetics; DNA Copy Number Variations; DNA Transposable Elements; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Adaptation, Biological; Alleles; Animals; Animals, Genetically Modified; Genetic Loci; Transcription, Genetic
14.
Genome ; 53(11): 876-83, 2010 Nov.
Article in English | MEDLINE | ID: mdl-21076503

ABSTRACT

Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effects of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.


Subject(s)
Breeding; Genome-Wide Association Study; Genome; Livestock/genetics; Selection, Genetic/genetics; Animals; Genotype; Linkage Disequilibrium; Quantitative Trait Loci
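
The abstract of entry 14 mentions implementing genomic selection with a genomic relationship matrix but does not give the formula. A widely used construction centres allele counts by twice the allele frequency and scales by 2Σp(1−p); the sketch below assumes that construction, genotypes coded 0/1/2, and allele frequencies estimated from the sample itself. The function name `genomic_relationship_matrix` is hypothetical, and the paper's exact method may differ.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix G from allele counts.

    genotypes: list of rows (one per individual) of 0/1/2 allele counts.
    Assumes G = Z Z' / (2 * sum_j p_j (1 - p_j)), with Z the counts
    centred by 2 * p_j and p_j estimated from the same sample.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic; G is undefined")
    centred = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Hypothetical genotypes: three individuals, three SNPs.
G_example = genomic_relationship_matrix([[0, 1, 2], [2, 1, 0], [1, 1, 1]])
```

In a GBLUP-style analysis, G would take the place of the pedigree relationship matrix in the mixed-model equations; that step is outside this sketch.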
15.
BMC Genomics ; 10: 177, 2009 Apr 24.
Article in English | MEDLINE | ID: mdl-19393045

ABSTRACT

BACKGROUND: The Bovinae subfamily incorporates an array of antelope, buffalo and cattle species. All of the members of this subfamily have diverged recently. Not surprisingly, a number of phylogenetic studies from molecular and morphological data have resulted in ambiguous trees and relationships amongst species, especially for Yak and Bison species. A partial phylogenetic reconstruction of 13 extant members of the Bovini tribe (Bovidae, Bovinae) from 15 complete or partially sequenced autosomal genes is presented. RESULTS: We identified 3 distinct lineages after the Bovini split from the Boselaphini and Tragelaphini tribes, which has led to the (1) Buffalo clade (Bubalus and Syncerus species) and a more recent divergence leading to the (2) Banteng, Gaur and Mithan and (3) Domestic cattle clades. A fourth lineage may also exist that leads to Bison and Yak. However, there was some ambiguity as to whether this was a divergence from the Banteng/Gaur/Mithan or the Domestic cattle clade. From an analysis of approximately 30,000 sites that were amplified in all species, 133 sites were identified with ambiguous inheritance, in that all trees implied more than one mutation at the same site. Closer examination of these sites has identified that they are the result of ancient polymorphisms that have subsequently undergone lineage sorting in the Bovini tribe, of which 53 have remained polymorphic since Bos and Bison species last shared a common ancestor with Bubalus between 5 and 8 million years ago (MYA). CONCLUSION: Uncertainty arises in our phylogenetic reconstructions because many species in the Bovini diverged over a short period of time. It appears that a number of sites with ambiguous inheritance have been maintained in subsequent populations by chance (lineage sorting) and that they have contributed to an association between Yak and Domestic cattle and an unreliable phylogenetic reconstruction for the Bison/Yak clade. Interestingly, a number of these aberrant sites are in coding sections of the genome and their identification may have important implications for studying the neutral rate of mutation at nonsynonymous sites. The presence of these sites could help account for the apparent contradiction between levels of polymorphism and effective population size in domesticated cattle.


Subject(s)
Evolution, Molecular , Phylogeny , Polymorphism, Genetic , Ruminants/genetics , Animals , Consensus Sequence , Ruminants/classification , Sequence Analysis, DNA
16.
BMC Genomics ; 10: 181, 2009 Apr 24.
Article in English | MEDLINE | ID: mdl-19393053

ABSTRACT

BACKGROUND: Identifying recent positive selection signatures in domesticated animals could provide information on genome response to strong directional selection from domestication and artificial selection. With the completion of the cattle genome, private companies are now providing large numbers of polymorphic markers for probing variation in domestic cattle (Bos taurus). We analysed over 7,500 polymorphic single nucleotide polymorphisms (SNP) in beef (Angus) and dairy (Holstein) cattle and outgroup species Bison, Yak and Banteng in an indirect test of inbreeding and positive selection in Domestic cattle. RESULTS: The outgroup species (Bison, Yak and Banteng) were genotyped with high levels of success (90%) and used to determine ancestral and derived allele states in domestic cattle. Frequency spectra of the derived alleles in Angus and Holstein were examined using Fay and Wu's H test. Significant divergences from the frequency spectra expected under neutrality were identified. This appeared to be the result of combined influences of positive selection, inbreeding and ascertainment bias for moderately frequent SNP. Approximately 10% of all polymorphisms identified as segregating in B. taurus were also segregating in Bison, Yak or Banteng, highlighting a large number of polymorphisms that are ancient in origin. CONCLUSION: These results suggest that a large effective population size (N(e)) of approximately 90,000 or more existed in B. taurus since they shared a common ancestor with Bison, Yak and Banteng ~1-2 million years ago (MYA). More recently, N(e) decreased sharply, probably in association with domestication. This may partially explain the paradox of high levels of polymorphism in Domestic cattle and the relatively small recent N(e) in this species. The period of inbreeding caused Fay and Wu's H statistic to depart from its expectation under neutrality, mimicking the effect of selection. However, there was also evidence for selection, because high frequency derived alleles tended to cluster near each other on the genome.
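
The abstract relies on Fay and Wu's H without defining it. As a rough sketch, assuming the standard definition H = π − θ_H computed from the number of copies of the derived allele at each segregating site in a sample of n chromosomes, the statistic could be calculated as below. The function name and the example counts are hypothetical, and the sketch omits the significance testing a real analysis would need.

```python
def fay_wu_h(derived_counts, n):
    """Fay and Wu's H = pi - theta_H from derived-allele counts.

    derived_counts: one entry per segregating site, giving how many of the
    n sampled chromosomes carry the derived allele (must satisfy 0 < i < n).
    """
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    denom = n * (n - 1)
    pi = 0.0       # nucleotide diversity, weights intermediate frequencies most
    theta_h = 0.0  # weights high-frequency derived alleles most
    for i in derived_counts:
        if not 0 < i < n:
            raise ValueError("each site must be segregating in the sample")
        pi += 2 * i * (n - i) / denom
        theta_h += 2 * i * i / denom
    return pi - theta_h

# Hypothetical example: three sites with high-frequency derived alleles.
print(fay_wu_h([9, 9, 8], n=10))
```

An excess of high-frequency derived alleles drives H negative, which is the pattern expected after a selective sweep; as the abstract notes, demographic events such as a period of inbreeding can produce similar departures.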


Subject(s)
Animals, Domestic/genetics , Cattle/genetics , Genetic Variation , Polymorphism, Single Nucleotide , Selection, Genetic , Animals , Animals, Domestic/classification , Animals, Wild/classification , Animals, Wild/genetics , Breeding , Cattle/classification , Computer Simulation , Female , Gene Frequency , Genetic Markers/genetics , Genome , Genomics/methods , Genotype , Male , Phylogeny , Population Density
17.
BMC Genomics ; 10: 179, 2009 Apr 24.
Article in English | MEDLINE | ID: mdl-19393048

ABSTRACT

BACKGROUND: If mutation within the coding region of the genome is largely not adaptive, the ratio of nonsynonymous (dN) to synonymous substitutions (dS) per site (dN/dS) should be approximately equal among closely related species. Furthermore, dN/dS in divergence between species should be equivalent to dN/dS in polymorphisms. This hypothesis is of particular interest in closely related members of the Bovini tribe, because domestication has promoted rapid phenotypic divergence through strong artificial selection of some species while others remain undomesticated. We examined a number of genes that may be involved in milk production in Domestic cattle and a number of their wild relatives for evidence that domestication had affected molecular evolution. Elevated rates of dN/dS were further queried to determine if they were the result of positive selection, low effective population size (N(e)) or reduced selective constraint. RESULTS: We have found that the domestication process has contributed to higher dN/dS ratios in cattle, especially in the lineages leading to the Domestic cow (Bos taurus) and Mithan (Bos frontalis) and within some breeds of Domestic cow. However, the high rates of dN/dS polymorphism within B. taurus when compared to species divergence suggest that positive selection has not elevated evolutionary rates in these genes. Likewise, the low rate of dN/dS in Bison, which has undergone a recent population bottleneck, indicates that a reduction in population size alone is not responsible for these observations. CONCLUSION: The effect of selection depends on effective population size and the selection coefficient (N(e)s). Typically, under domestication, both the selection pressure on traits important for fitness in the wild and N(e) are reduced. Therefore, reduced selective constraint could be responsible for the observed elevated evolutionary ratios in domesticated species, especially in B. taurus and B. frontalis, which have the highest dN/dS in the Bovini. This may have important implications for tests of selection such as the McDonald-Kreitman test. Surprisingly, we have also detected a significant difference in the supposed neutral substitution rate between synonymous and noncoding sites in the Bovine genome, with a 30% higher rate of substitution at synonymous sites. This is due, at least in part, to an excess of the highly mutable CpG dinucleotides at synonymous sites, which will have implications for time of divergence estimates from molecular data.
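
The comparison described here, nonsynonymous versus synonymous changes in polymorphism against the same ratio in divergence, is the logic behind the McDonald-Kreitman test. A minimal sketch of one common summary of that comparison, the neutrality index, is given below. The function name is an assumption and the counts are invented purely for illustration; they are not values from the study.

```python
def neutrality_index(dn, ds, pn, ps):
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds).

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if min(dn, ds, ps) <= 0:
        raise ValueError("dn, ds and ps must be positive for the ratio to be defined")
    return (pn / ps) / (dn / ds)

# Hypothetical counts for illustration only.
ni = neutrality_index(dn=10, ds=20, pn=10, ps=10)
print(ni)  # NI > 1: relatively more nonsynonymous polymorphism than divergence
```

Under strict neutrality NI is expected to be near 1. Values above 1 indicate an excess of nonsynonymous polymorphism, which is the direction the abstract reports for B. taurus, while values below 1 are the pattern usually read as evidence of positive selection.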


Subject(s)
Animals, Domestic/genetics , Cattle/genetics , Evolution, Molecular , Selection, Genetic , Animals , Animals, Domestic/classification , Animals, Wild/classification , Animals, Wild/genetics , Cattle/classification , Computational Biology/methods , Genetic Variation , Genomics/methods , Mutation , Phenotype , Phylogeny , Polymorphism, Genetic , Population Density
18.
Genetics ; 179(3): 1539-46, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18562653

ABSTRACT

Genotype-by-environment interactions for production traits in dairy cattle have often been observed, while QTL analyses have focused on detecting genes with general effects on production traits. In this study, a QTL search for genes with environmental interaction for the traits milk yield, protein yield, and fat yield was performed on Bos taurus autosome 6 (BTA6), also including information about the previously investigated candidate genes ABCG2 and OPN. The animals in the study were Norwegian Red. Eighteen grandsires and 716 sires were genotyped for 362 markers on BTA6. Every marker bracket was regarded as a putative QTL position. The effects of the candidate genes and the putative QTL were modeled as a regression on an environmental parameter (herd year), which is based on the predicted herd-year effect for the trait. Two QTL were found to have environmentally dependent effects on milk yield. These QTL were located 3.6 cM upstream and 9.1 cM downstream from ABCG2. No environmentally dependent QTL was found to significantly affect protein or fat yield.


Subject(s)
Cattle/genetics , Chromosomes/genetics , Environment , Milk/metabolism , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Animals , Databases, Genetic , Haplotypes , Likelihood Functions , Milk Proteins/metabolism , Models, Genetic
19.
Genetica ; 136(2): 245-57, 2009 Jun.
Article in English | MEDLINE | ID: mdl-18704696

ABSTRACT

Genomic selection refers to the use of dense markers covering the whole genome to estimate the breeding value of selection candidates for a quantitative trait. This paper considers prediction of breeding value based on a linear combination of the markers. In this case the best estimate of each marker's effect is the expectation of the effect conditional on the data. To calculate this requires a prior distribution of marker effects. If the marker effects are normally distributed with constant variance, BLUP can be used to calculate the estimated effects of the markers and hence the estimated breeding value (EBV). In this case the model is equivalent to a conventional animal model in which the relationship matrix among the animals is estimated from the markers instead of the pedigree. The accuracy of the EBV can approach 1.0 but a very large amount of data is required. An alternative model was investigated in which only some markers have non-zero effects and these effects follow a reflected exponential distribution. In this case the expected effect of a marker is a non-linear function of the data such that apparently small effects are regressed back almost to zero and consequently these markers can be deleted from the model. The accuracy in this case is considerably higher than when marker effects are normally distributed. If genomic selection is practiced for several generations the response declines in a manner that can be predicted from the marker allele frequencies. Genomic selection is likely to lead to a more rapid decline in the selection response than phenotypic selection unless new markers are continually added to the prediction of breeding value. A method to find the optimum index to maximise long term selection response is derived. This index varies the weight given to a marker according to its frequency such that markers where the favourable allele has low frequency receive more weight in the index.
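
The stated equivalence between BLUP of marker effects and an animal model with a marker-derived relationship matrix can be checked numerically. The sketch below assumes centred genotypes, a centred phenotype with no other fixed effects, and a single shrinkage parameter `lam` standing in for the ratio of residual to marker-effect variance; the simulated data and all variable names are illustrative and not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_animals, n_markers = 6, 20

# Simulated 0/1/2 genotypes, centred per marker (toy data).
Z = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)
Z -= Z.mean(axis=0)

# Simulated phenotype, centred so that no intercept is needed.
y = rng.normal(size=n_animals)
y -= y.mean()

lam = 1.0  # assumed ratio of residual variance to marker-effect variance

# Route 1: estimate every marker effect, then sum effects per animal.
marker_effects = np.linalg.solve(Z.T @ Z + lam * np.eye(n_markers), Z.T @ y)
gebv_from_markers = Z @ marker_effects

# Route 2: work with the animal-by-animal matrix built from the markers.
K = Z @ Z.T
gebv_from_relationships = K @ np.linalg.solve(K + lam * np.eye(n_animals), y)

print(np.allclose(gebv_from_markers, gebv_from_relationships))
```

Both routes give the same estimated breeding values. The second solves a system whose size is the number of animals rather than the number of markers, which is the practical appeal when markers greatly outnumber animals. Any constant scaling of K is absorbed into `lam`.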


Subject(s)
Genomics/methods , Animals , Breeding , Genetic Markers , Models, Genetic , Sensitivity and Specificity , Time Factors
20.
Genet Res (Camb) ; 91(6): 427-36, 2009 Dec.
Article in English | MEDLINE | ID: mdl-20122298

ABSTRACT

We used a least absolute shrinkage and selection operator (LASSO) approach to estimate marker effects for genomic selection. The least angle regression (LARS) algorithm and cross-validation were used to define the best subset of markers to include in the model. The LASSO-LARS approach was tested on two data sets: a simulated data set with 5865 individuals and 6000 single nucleotide polymorphisms (SNPs), and a mouse data set with 1885 individuals genotyped for 10 656 SNPs and phenotyped for a number of quantitative traits. In the simulated data, three approaches were used to split the reference population into training and validation subsets for cross-validation: random splitting across the whole population, and random sampling of the validation set from the last generation only, either within or across families. The highest accuracy was obtained by random splitting across the whole population. The accuracy of genomic estimated breeding values (GEBVs) in the candidate population obtained by LASSO-LARS was 0.89 with 156 explanatory SNPs. This value was higher than those obtained by Best Linear Unbiased Prediction (BLUP) and a Bayesian method (BayesA), which were 0.75 and 0.84, respectively. In the mouse data, 1600 individuals were randomly allocated to the reference population. The GEBVs for the remaining 285 individuals estimated by LASSO-LARS were more accurate than those obtained by BLUP and BayesA for weight at six weeks and slightly less accurate for growth rate and body length. It was concluded that the LASSO-LARS approach is a good alternative method to estimate marker effects for genomic selection, particularly when the cost of genotyping can be reduced by using a limited subset of markers.
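
To illustrate how a LASSO penalty selects a subset of markers, the sketch below fits the LASSO objective by cyclic coordinate descent with soft-thresholding. This is a deliberate substitution: the paper used the LARS algorithm with cross-validation, whereas coordinate descent is used here only because it is short enough to show in full. The data are simulated, and the function names, penalty values and effect sizes are assumptions.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning zero if it crosses."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimise (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1 over b."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    residual = y.copy()                        # equals y - X @ beta throughout
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0:
                continue
            residual += X[:, j] * beta[j]      # remove marker j's contribution
            rho = X[:, j] @ residual / n
            beta[j] = soft_threshold(rho, alpha) / col_scale[j]
            residual -= X[:, j] * beta[j]      # add the updated contribution
    return beta

# Simulated toy data: 10 markers, only markers 0 and 3 have real effects.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=50)

beta = lasso_coordinate_descent(X, y, alpha=0.1)
print(np.round(beta, 2))
```

In practice the penalty `alpha` would be chosen by cross-validation, as the abstract describes. Larger penalties set more marker effects exactly to zero, which is what allows a reduced marker panel.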


Subject(s)
Algorithms , Genome , Selection, Genetic , Animals , Genotype , Humans , Models, Genetic , Polymorphism, Single Nucleotide