Results 1 - 20 of 84
1.
G3 (Bethesda) ; 14(4), 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38366548

ABSTRACT

In species with large and complex genomes such as conifers, dense linkage maps are a useful resource for supporting genome assembly and laying the genomic groundwork at the structural, populational, and functional levels. However, most of the 600+ extant conifer species still lack extensive genotyping resources, which hampers the development of high-density linkage maps. In this study, we developed a linkage map relying on 21,570 single nucleotide polymorphism (SNP) markers in Sitka spruce (Picea sitchensis [Bong.] Carr.), a long-lived conifer from western North America that is widely planted for productive forestry in the British Isles. We used a single-step mapping approach to efficiently combine RAD-seq and genotyping array SNP data for 528 individuals from 2 full-sib families. As expected for spruce taxa, the saturated map contained 12 linkage groups with a total length of 2,142 cM. The positioning of 5,414 unique gene coding sequences allowed us to compare our map with that of other Pinaceae species, which provided evidence for high levels of synteny and gene order conservation in this family. We then developed an integrated map for P. sitchensis and Picea glauca based on 27,052 markers and 11,609 gene sequences. Altogether, these 2 linkage maps, the accompanying catalog of 286,159 SNPs, and the genotyping chip developed herein open new perspectives for a variety of fundamental and more applied research objectives, such as the improvement of spruce genome assemblies or marker-assisted sustainable management of genetic resources in Sitka spruce and related species.
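Map lengths such as the 2,142 cM reported above are obtained by converting recombination fractions between markers into additive map distances. The abstract does not say which mapping function was used, so the following is only a minimal sketch of the two classical choices (Haldane and Kosambi); the function names are illustrative.

```python
import math


def haldane_cm(r: float) -> float:
    """Map distance in centimorgans from recombination fraction r,
    assuming no crossover interference (Haldane's mapping function)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Map distance in centimorgans from recombination fraction r,
    allowing for some interference (Kosambi's mapping function)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.30):
        print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For small recombination fractions both functions are close to 100 × r; they diverge as r approaches 0.5.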


Subjects
Picea , Tracheophyta , Humans , Picea/genetics , Tracheophyta/genetics , Chromosome Mapping , Genome , Genomics , Single Nucleotide Polymorphism , Genetic Linkage , Plant Genome
2.
Plant Genome ; 17(1): e20392, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37986545

ABSTRACT

Advances in sequencing technologies mean that insights into crop diversification can now be explored in crops beyond major staples. We use a genome assembly of finger millet, an allotetraploid orphan crop, to analyze DArTseq single nucleotide polymorphisms (SNPs) at the whole and sub-genome level. A set of 8778 SNPs and 13 agronomic traits was used to characterize a diverse panel of 423 landraces from Africa and Asia. Through principal component analysis (PCA) and discriminant analysis of principal components, four distinct groups of accessions were identified that coincided with the primary geographic regions of finger millet cultivation. Notably, East Africa, presumed to be the crop's origin, exhibited the lowest genetic diversity. The PCA of phenotypic data also revealed geographic differentiation, albeit with relationships among geographic areas that differed from those indicated by genomic data. Further exploration of the sub-genomes A and B using neighbor-joining trees revealed distinct features that provide supporting evidence for the complex evolutionary history of finger millet. Although a genome-wide association study found only a limited number of significant marker-trait associations, a clustering approach based on the distribution of marker effects obtained from a ridge regression genomic model was employed to investigate trait complexity. This analysis uncovered two distinct clusters. Overall, the findings suggest that finger millet has undergone complex and context-specific diversification, indicative of a lengthy domestication history. These analyses provide insights for the future development of finger millet.


Subjects
Eleusine , Eleusine/genetics , Genome-Wide Association Study , Asia , Phenotype , Genomics
3.
bioRxiv ; 2023 Nov 04.
Article in English | MEDLINE | ID: mdl-37961279

ABSTRACT

As a result of recombination, adjacent nucleotides can have different paths of genetic inheritance and therefore the genealogical trees for a sample of DNA sequences vary along the genome. The structure capturing the details of these intricately interwoven paths of inheritance is referred to as an ancestral recombination graph (ARG). New developments have made it possible to infer ARGs at scale, enabling many new applications in population and statistical genetics. This rapid progress, however, has led to a substantial gap opening between theory and practice. Standard mathematical formalisms, based on exhaustively detailing the "events" that occur in the history of a sample, are insufficient to describe the outputs of current methods. Moreover, we argue that the underlying assumption that all events can be known and precisely estimated is fundamentally unsuited to the realities of modern, population-scale datasets. We propose an alternative mathematical formulation that encompasses the outputs of recent methods and can capture the full richness of modern large-scale datasets. By defining this ARG encoding in terms of specific genomes and their intervals of genetic inheritance, we avoid the need to exhaustively list (and estimate) all events. The effects of multiple events can be aggregated in different ways, providing a natural way to express many forms of approximate and partial knowledge about the recombinant ancestry of a sample.
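To make the idea of describing an ARG through genomes and their intervals of inheritance more concrete, here is a minimal sketch. It is not the authors' formalism or any library's API: the `Edge` record and the `local_parents` helper are hypothetical names, and genomes are simply integer IDs. Each edge states that a child genome inherited the interval [left, right) from a parent genome, without listing the recombination or coalescence events that produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    left: float    # inclusive start of the inherited genomic interval
    right: float   # exclusive end of the inherited genomic interval
    parent: int    # ID of the genome the interval was inherited from
    child: int     # ID of the genome that inherited the interval


def local_parents(edges, position):
    """Return {child: parent} for the genealogy at one genomic position.

    Assumes well-formed input in which each child inherits any given
    position from at most one parent.
    """
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


# Toy example: genome 1 inherits [0, 4) from genome 2 and [4, 10) from
# genome 3, so the local tree changes at position 4.
edges = [
    Edge(0, 10, parent=2, child=0),
    Edge(0, 4, parent=2, child=1),
    Edge(4, 10, parent=3, child=1),
    Edge(0, 10, parent=4, child=2),
    Edge(4, 10, parent=4, child=3),
]

if __name__ == "__main__":
    print(local_parents(edges, 2))
    print(local_parents(edges, 7))
```

Because only intervals of inheritance are stored, partial or approximate knowledge can be expressed by simply omitting or coarsening edges.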

5.
Genet Sel Evol ; 55(1): 36, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-37268883

ABSTRACT

BACKGROUND: In breeding programmes, the observed genetic change is a sum of the contributions of different selection paths represented by groups of individuals. Quantifying these sources of genetic change is essential for identifying the key breeding actions and optimizing breeding programmes. However, it is difficult to disentangle the contribution of individual paths due to the inherent complexity of breeding programmes. Here we extend the previously developed method for partitioning genetic mean by paths of selection to work both with the mean and variance of breeding values. METHODS: First, we extended the partitioning method to quantify the contribution of different paths to genetic variance assuming that the breeding values are known. Second, we combined the partitioning method with the Markov Chain Monte Carlo approach to draw samples from the posterior distribution of breeding values and use these samples for computing the point and interval estimates of partitions for the genetic mean and variance. We implemented the method in the R package AlphaPart. We demonstrated the method with a simulated cattle breeding programme. RESULTS: We show how to quantify the contribution of different groups of individuals to genetic mean and variance and that the contributions of different selection paths to genetic variance are not necessarily independent. Finally, we observed that the partitioning method under the pedigree-based model has some limitations, which suggests the need for a genomic extension. CONCLUSIONS: We presented a partitioning method to quantify sources of change in genetic mean and variance in breeding programmes. The method can help breeders and researchers understand the dynamics in genetic mean and variance in a breeding programme. The developed method for partitioning genetic mean and variance is a powerful method for understanding how different selection paths interact within a breeding programme and how they can be optimised.
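The partitioning of the genetic mean by selection path can be illustrated with the standard pedigree recursion, in which an individual's breeding value is the average of its parents' breeding values plus its own Mendelian sampling term. The sketch below is not the AlphaPart API (AlphaPart is an R package); the function name, the data layout, and the toy numbers are assumptions made for illustration, and breeding values are treated as known.

```python
def partition_breeding_values(pedigree, mendelian, paths):
    """Split each breeding value into contributions from selection paths.

    pedigree  : list of (sire, dam) index pairs, None for unknown parents,
                ordered so that parents appear before their offspring
    mendelian : Mendelian sampling term per individual (for founders,
                their whole breeding value)
    paths     : path label per individual (e.g. "male" or "female")

    Returns one dict per individual mapping path label -> contribution.
    """
    labels = sorted(set(paths))
    parts = []
    for i, (sire, dam) in enumerate(pedigree):
        contrib = {label: 0.0 for label in labels}
        # Half of each known parent's partitions flows to the offspring.
        for parent in (sire, dam):
            if parent is not None:
                for label in labels:
                    contrib[label] += 0.5 * parts[parent][label]
        # The individual's own Mendelian term is credited to its own path.
        contrib[paths[i]] += mendelian[i]
        parts.append(contrib)
    return parts


# Toy pedigree: individuals 0 and 1 are founders.
pedigree = [(None, None), (None, None), (0, 1), (0, 2)]
mendelian = [1.0, 0.0, 0.5, -0.25]
paths = ["male", "female", "male", "female"]
parts = partition_breeding_values(pedigree, mendelian, paths)

if __name__ == "__main__":
    for i, p in enumerate(parts):
        print(i, p, "breeding value:", sum(p.values()))
```

Summing the partitions of an individual recovers its breeding value, and averaging partitions over a group gives that group's genetic mean split by path.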


Subjects
Genome , Genomics , Animals , Cattle/genetics , Monte Carlo Method , Pedigree , Markov Chains , Genetic Models , Genetic Selection
6.
Genet Sel Evol ; 55(1): 42, 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37322449

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) aim at identifying genomic regions involved in phenotype expression, but identifying causative variants is difficult. Pig Combined Annotation Dependent Depletion (pCADD) scores provide a measure of the predicted consequences of genetic variants. Incorporating pCADD into the GWAS pipeline may help identify causative variants. Our objective was to identify genomic regions associated with loin depth and muscle pH, and identify regions of interest for fine-mapping and further experimental work. Genotypes for ~40,000 single nucleotide polymorphisms (SNPs) were used to perform GWAS for these two traits, using de-regressed breeding values (dEBV) for 329,964 pigs from four commercial lines. Imputed sequence data were used to identify the SNPs with the highest pCADD scores among those in strong ([Formula: see text] 0.80) linkage disequilibrium with lead GWAS SNPs. RESULTS: Fifteen distinct regions were associated with loin depth and one with loin pH at genome-wide significance. Regions on chromosomes 1, 2, 5, 7, and 16 explained between 0.06 and 3.55% of the additive genetic variance and were strongly associated with loin depth. Only a small part of the additive genetic variance in muscle pH was attributed to SNPs. The results of our pCADD analysis suggest that high-scoring pCADD variants are enriched for missense mutations. Two close but distinct regions on SSC1 were associated with loin depth, and pCADD identified the previously identified missense variant within the MC4R gene for one of the lines. For loin pH, pCADD identified a synonymous variant in the RNF25 gene (SSC15) as the most likely candidate for the muscle pH association. The missense mutation in the PRKAG3 gene known to affect glycogen content was not prioritised by pCADD for loin pH. CONCLUSIONS: For loin depth, we identified several strong candidate regions for further statistical fine-mapping that are supported in the literature, and two novel regions. For loin muscle pH, we identified one previously identified associated region. We found mixed evidence for the utility of pCADD as an extension of heuristic fine-mapping. The next step is to perform more sophisticated fine-mapping and expression quantitative trait loci (eQTL) analysis, and then interrogate candidate variants in vitro by perturbation-CRISPR assays.
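The prioritisation step described in this abstract (keep variants in strong linkage disequilibrium with a lead SNP, then pick the one with the highest functional score) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: linkage disequilibrium is approximated by the squared correlation of genotype dosages, the threshold is applied as "greater than or equal to 0.80", and the function names, SNP names, and scores are hypothetical.

```python
def ld_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors
    (0/1/2 copies of an allele per individual)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (sxy * sxy) / (sxx * syy)


def best_scoring_proxy(lead, candidates, scores, threshold=0.80):
    """Among candidates in LD with the lead SNP (r2 >= threshold),
    return the name of the one with the highest score, or None."""
    in_ld = [name for name, dosages in candidates.items()
             if ld_r2(lead, dosages) >= threshold]
    if not in_ld:
        return None
    return max(in_ld, key=lambda name: scores[name])


# Toy data: six animals, three candidate SNPs near a lead SNP.
lead = [0, 1, 2, 1, 0, 2]
candidates = {
    "snpA": [0, 1, 2, 1, 0, 2],   # identical to the lead SNP
    "snpB": [2, 1, 0, 1, 2, 0],   # same SNP coded for the other allele
    "snpC": [0, 0, 1, 2, 1, 0],   # uncorrelated with the lead SNP
}
scores = {"snpA": 3.0, "snpB": 12.5, "snpC": 20.0}  # made-up scores

if __name__ == "__main__":
    print(best_scoring_proxy(lead, candidates, scores))
```

Note that `snpC` has the highest score but is never considered, because it is not in linkage disequilibrium with the lead SNP.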


Subjects
Genome-Wide Association Study , Muscles , Swine/genetics , Animals , Genome-Wide Association Study/methods , Genotype , Quantitative Trait Loci , Phenotype , Hydrogen-Ion Concentration , Single Nucleotide Polymorphism
7.
Elife ; 12, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37342968

ABSTRACT

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.


Subjects
Genome , Software , Computer Simulation , Population Genetics , Genomics
8.
Genet Sel Evol ; 55(1): 31, 2023 May 09.
Article in English | MEDLINE | ID: mdl-37161307

ABSTRACT

BACKGROUND: The Western honeybee is an economically important species globally, but has been experiencing colony losses that lead to economic damage and decreased genetic variability. This situation is spurring additional interest in honeybee breeding and conservation programs. Stochastic simulators are essential tools for rapid and low-cost testing of breeding programs and methods, yet no existing simulator allows for a detailed simulation of honeybee populations. Here we describe SIMplyBee, a holistic simulator of honeybee populations and breeding programs. SIMplyBee is an R package, freely available for installation from CRAN (http://cran.r-project.org/package=SIMplyBee). IMPLEMENTATION: SIMplyBee builds upon the stochastic simulator AlphaSimR that simulates individuals with their corresponding genomes and quantitative genetic values. To enable honeybee-specific simulations, we extended AlphaSimR by developing classes for global simulation parameters, SimParamBee, for a honeybee colony, Colony, and multiple colonies, MultiColony. We also developed functions to address major honeybee specificities: honeybee genome, haplodiploid inheritance, social organisation, complementary sex determination, polyandry, colony events, and quantitative genetics at the individual- and colony-levels. RESULTS: We describe its implementation for simulating a honeybee genome, creating a honeybee colony and its members, addressing haplodiploid inheritance and complementary sex determination, simulating colony events, creating and managing multiple colonies at the same time, and obtaining genomic data and honeybee quantitative genetics. Further documentation (http://www.SIMplyBee.info) provides details on these operations and describes additional operations related to genomics, quantitative genetics, and other functionalities. DISCUSSION: SIMplyBee is a holistic simulator of honeybee populations and breeding programs. It simulates individual honeybees with their genomes, colonies with colony events, and individual- and colony-level genetic and breeding values. Regarding the latter, SIMplyBee takes a user-defined function to combine individual- into colony-level values and hence allows for modeling any type of interaction within a colony. SIMplyBee provides a research platform for testing breeding and conservation strategies and their effect on future genetic gain and genetic variability. Future developments of SIMplyBee will focus on improving the simulation of honeybee genomes, optimizing the simulator's performance, and including spatial awareness in mating functions and phenotype simulation. We invite the honeybee genetics and breeding community to join us in the future development of SIMplyBee.
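Two of the honeybee specificities named above, haplodiploid inheritance and complementary sex determination (csd), can be illustrated with a very small sketch. This is not SIMplyBee code (SIMplyBee is an R package built on AlphaSimR); it models only a single csd locus, and all function names are hypothetical.

```python
import random


def make_drone(queen, rng):
    """Drones develop from unfertilised eggs, so they are haploid:
    they carry one of the queen's two csd alleles."""
    return (rng.choice(queen),)


def make_fertilised_egg(queen, drone, rng):
    """A fertilised egg gets one allele from the queen and the single
    allele carried by the drone."""
    return (rng.choice(queen), drone[0])


def is_female(genotype):
    """Under complementary sex determination, diploids that are
    heterozygous at csd develop as females; homozygous diploids
    develop as diploid drones, which are normally lost to the colony."""
    return len(genotype) == 2 and genotype[0] != genotype[1]


def female_brood_fraction(queen, drone, n_eggs, rng):
    """Fraction of fertilised eggs that develop as females."""
    eggs = [make_fertilised_egg(queen, drone, rng) for _ in range(n_eggs)]
    return sum(is_female(e) for e in eggs) / n_eggs


if __name__ == "__main__":
    rng = random.Random(1)
    queen = ("A", "B")
    # A drone sharing a csd allele with the queen roughly halves the
    # fraction of female brood; an unrelated drone does not.
    print(female_brood_fraction(queen, ("A",), 1000, rng))
    print(female_brood_fraction(queen, ("C",), 1000, rng))
```

Polyandry could be added by letting the queen mate with several such drones and sampling the father of each egg among them.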


Subjects
Genomics , Inheritance Patterns , Bees/genetics , Animals , Computer Simulation , Phenotype , Reproduction
9.
Front Genet ; 14: 1194266, 2023.
Article in English | MEDLINE | ID: mdl-37252666

ABSTRACT

Genomic selection can accelerate genetic progress in aquaculture breeding programmes, particularly for traits measured on siblings of selection candidates. However, it is not widely implemented in most aquaculture species, and remains expensive due to high genotyping costs. Genotype imputation is a promising strategy that can reduce genotyping costs and facilitate the broader uptake of genomic selection in aquaculture breeding programmes. Genotype imputation can predict ungenotyped SNPs in populations genotyped at a low-density (LD), using a reference population genotyped at a high-density (HD). In this study, we used datasets of four aquaculture species (Atlantic salmon, turbot, common carp and Pacific oyster), phenotyped for different traits, to investigate the efficacy of genotype imputation for cost-effective genomic selection. The four datasets had been genotyped at HD, and eight LD panels (300-6,000 SNPs) were generated in silico. SNPs were selected to be: i) evenly distributed according to physical position, ii) selected to minimise the linkage disequilibrium between adjacent SNPs, or iii) randomly selected. Imputation was performed with three different software packages (AlphaImpute2, FImpute v.3 and findhap v.4). The results revealed that FImpute v.3 was faster and achieved higher imputation accuracies. Imputation accuracy increased with increasing panel density for both SNP selection methods, reaching correlations greater than 0.95 in the three fish species and 0.80 in Pacific oyster. In terms of genomic prediction accuracy, the LD and the imputed panels performed similarly, reaching values very close to the HD panels, except in the Pacific oyster dataset, where the LD panel performed better than the imputed panel. In the fish species, when LD panels were used for genomic prediction without imputation, selection of markers based on either physical or genetic distance (instead of randomly) resulted in a high prediction accuracy, whereas imputation achieved near maximal prediction accuracy independently of the LD panel, showing higher reliability. Our results suggest that, in fish species, well-selected LD panels may achieve near maximal genomic selection prediction accuracy, and that the addition of imputation will result in maximal accuracy independently of the LD panel. These strategies represent effective and affordable methods to incorporate genomic selection into most aquaculture settings.
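Two steps from this abstract lend themselves to a small illustration: building a low-density panel whose SNPs are evenly distributed by physical position, and scoring imputation accuracy as the correlation between true and imputed genotypes. The sketch below is an assumption-laden illustration, not the study's code; the greedy nearest-to-grid rule and the function names are hypothetical, and no imputation software is called.

```python
import math


def evenly_spaced_panel(positions, n_snps):
    """Return sorted indices of n_snps SNPs whose physical positions are
    closest to an even grid spanning the first to the last position."""
    if n_snps < 2 or n_snps > len(positions):
        raise ValueError("n_snps must be between 2 and len(positions)")
    lo, hi = min(positions), max(positions)
    step = (hi - lo) / (n_snps - 1)
    available = set(range(len(positions)))
    chosen = []
    for k in range(n_snps):
        target = lo + k * step
        # Nearest SNP not yet in the panel; ties broken by lower index.
        best = min(available, key=lambda i: (abs(positions[i] - target), i))
        chosen.append(best)
        available.remove(best)
    return sorted(chosen)


def imputation_accuracy(true_dosages, imputed_dosages):
    """Pearson correlation between true and imputed genotype dosages."""
    n = len(true_dosages)
    mt = sum(true_dosages) / n
    mi = sum(imputed_dosages) / n
    cov = sum((t - mt) * (i - mi) for t, i in zip(true_dosages, imputed_dosages))
    vt = sum((t - mt) ** 2 for t in true_dosages)
    vi = sum((i - mi) ** 2 for i in imputed_dosages)
    if vt == 0 or vi == 0:
        raise ValueError("correlation is undefined for constant vectors")
    return cov / math.sqrt(vt * vi)


if __name__ == "__main__":
    positions = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(evenly_spaced_panel(positions, 3))
    print(imputation_accuracy([0, 1, 2, 1], [0, 1, 2, 2]))
```

In practice the accuracy would be computed on SNPs that were masked to form the low-density panel and then imputed back.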

10.
Front Genet ; 14: 1164935, 2023.
Article in English | MEDLINE | ID: mdl-37229190

ABSTRACT

Genomic selection has recently become an established part of breeding strategies in cereals. However, a limitation of linear genomic prediction models for complex traits such as yield is that these are unable to accommodate Genotype by Environment effects, which are commonly observed over trials on multiple locations. In this study, we investigated how this environmental variation can be captured by the collection of a large number of phenomic markers using high-throughput field phenotyping and whether it can increase GS prediction accuracy. For this purpose, 44 winter wheat (Triticum aestivum L.) elite populations, comprising 2,994 lines, were grown on two sites over 2 years, to approximate the size of trials in a practical breeding programme. At various growth stages, remote sensing data from multi- and hyperspectral cameras, as well as traditional ground-based visual crop assessment scores, were collected with approximately 100 different data variables collected per plot. The predictive power for grain yield was tested for the various data types, with or without genome-wide marker data sets. Models using phenomic traits alone had a greater predictive value (R2 = 0.39-0.47) than genomic data (approximately R2 = 0.1). The average improvement in predictive power by combining trait and marker data was 6%-12% over the best phenomic-only model, and performed best when data from one full location was used to predict the yield on an entire second location. The results suggest that genetic gain in breeding programmes can be increased by utilisation of large numbers of phenotypic variables using remote sensing in field trials, although at what stage of the breeding cycle phenomic selection could be most profitably applied remains to be answered.

11.
Front Genet ; 14: 1168212, 2023.
Article in English | MEDLINE | ID: mdl-37234871

ABSTRACT

Nucleus-based breeding programs are characterized by intense selection that results in high genetic gain, which inevitably means reduction of genetic variation in the breeding population. Therefore, genetic variation in such breeding systems is typically managed systematically, for example, by avoiding mating the closest relatives to limit progeny inbreeding. However, intense selection requires maximum effort to make such breeding programs sustainable in the long-term. The objective of this study was to use simulation to evaluate the long-term impact of genomic selection on genetic mean and variance in an intense layer chicken breeding program. We developed a large-scale stochastic simulation of an intense layer chicken breeding program to compare conventional truncation selection to genomic truncation selection optimized with either minimization of progeny inbreeding or full-scale optimal contribution selection. We compared the programs in terms of genetic mean, genic variance, conversion efficiency, rate of inbreeding, effective population size, and accuracy of selection. Our results confirmed that genomic truncation selection has immediate benefits compared to conventional truncation selection in all specified metrics. A simple minimization of progeny inbreeding after genomic truncation selection did not provide any significant improvements. Optimal contribution selection was successful in having better conversion efficiency and effective population size compared to genomic truncation selection, but it must be fine-tuned for balance between loss of genetic variance and genetic gain. In our simulation, we measured this balance using trigonometric penalty degrees between truncation selection and a balanced solution and concluded that the best results were between 45° and 65°. This balance is specific to the breeding program and depends on how much immediate genetic gain a breeding program may risk vs. save for the future. Furthermore, our results show that the persistence of accuracy is better with optimal contribution selection compared to truncation selection. In general, our results show that optimal contribution selection can ensure long-term success in intensive breeding programs using genomic selection.
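Two of the metrics listed in this abstract, rate of inbreeding and effective population size, are linked by standard textbook relationships that are easy to show in code. The sketch below uses those general formulas and made-up numbers; it does not reproduce the study's simulation, and the function names are illustrative.

```python
def rate_of_inbreeding(f_prev, f_curr):
    """Rate of inbreeding between two successive generations:
    delta_F = (F_t - F_{t-1}) / (1 - F_{t-1})."""
    if not 0.0 <= f_prev < 1.0:
        raise ValueError("previous inbreeding coefficient must be in [0, 1)")
    return (f_curr - f_prev) / (1.0 - f_prev)


def effective_population_size(delta_f):
    """Effective population size implied by a rate of inbreeding:
    Ne = 1 / (2 * delta_F)."""
    if delta_f <= 0:
        raise ValueError("rate of inbreeding must be positive")
    return 1.0 / (2.0 * delta_f)


if __name__ == "__main__":
    # Hypothetical average inbreeding coefficients in two generations.
    delta_f = rate_of_inbreeding(0.050, 0.0595)
    print(round(delta_f, 4), round(effective_population_size(delta_f), 1))
```

A selection strategy that halves the rate of inbreeding therefore doubles the effective population size, which is one way to read the comparison between truncation and optimal contribution selection.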

12.
Theor Appl Genet ; 136(4): 74, 2023 Mar 23.
Article in English | MEDLINE | ID: mdl-36952013

ABSTRACT

KEY MESSAGE: For genomic selection in clonally propagated crops with diploid (-like) meiotic behavior to be effective, crossing parents should be selected based on genomic predicted cross-performance unless dominance is negligible. For genomic selection (GS) in clonal breeding programs to be effective, parents should be selected based on genomic predicted cross-performance unless dominance is negligible. Genomic prediction of cross-performance enables efficient exploitation of the additive and dominance value simultaneously. Here, we compared different GS strategies for clonally propagated crops with diploid (-like) meiotic behavior, using strawberry as an example. We used stochastic simulation to evaluate six combinations of three breeding programs and two parent selection methods. The three breeding programs included (1) a breeding program that introduced GS in the first clonal stage, and (2) two variations of a two-part breeding program with one and three crossing cycles per year, respectively. The two parent selection methods were (1) parent selection based on genomic estimated breeding values (GEBVs) and (2) parent selection based on genomic predicted cross-performance (GPCP). Selection of parents based on GPCP produced faster genetic gain than selection of parents based on GEBVs because it reduced inbreeding when the dominance degree increased. The two-part breeding programs with one and three crossing cycles per year using GPCP always produced the most genetic gain unless dominance was negligible. We conclude that (1) in clonal breeding programs with GS, parents should be selected based on GPCP, and (2) a two-part breeding program with parent selection based on GPCP to rapidly drive population improvement has great potential to improve breeding clonally propagated crops.
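The contrast between selecting parents on breeding values and selecting crosses on predicted cross-performance can be illustrated with a simple expectation. The sketch below assumes biallelic loci, diploid Mendelian segregation, no epistasis, and genotypic values of +a, d, and -a for the three genotypes; it is an illustration of the general idea, not necessarily the exact formula used in the study, and the function name and marker effects are hypothetical.

```python
def predicted_cross_mean(parent1, parent2, additive, dominance):
    """Expected genetic value of progeny from crossing two parents.

    parent1, parent2 : allele dosages (0, 1 or 2) per locus
    additive         : additive effect a per locus (AA = +a, aa = -a)
    dominance        : dominance deviation d per locus (Aa = d)
    """
    total = 0.0
    for g1, g2, a, d in zip(parent1, parent2, additive, dominance):
        p1, p2 = g1 / 2.0, g2 / 2.0                 # gamete allele frequencies
        p_het = p1 * (1.0 - p2) + (1.0 - p1) * p2   # P(heterozygous progeny)
        # P(AA) - P(aa) simplifies to p1 + p2 - 1.
        total += a * (p1 + p2 - 1.0) + d * p_het
    return total


if __name__ == "__main__":
    additive = [1.0, 1.0]
    dominance = [0.5, 0.5]
    line_x = [2, 0]
    line_y = [0, 2]
    # Both lines have the same additive value, but crossing them yields
    # fully heterozygous progeny that benefit from dominance.
    print(predicted_cross_mean(line_x, line_x, additive, dominance))
    print(predicted_cross_mean(line_x, line_y, additive, dominance))
```

When every dominance deviation is zero, the prediction reduces to the mid-parent additive value, which is consistent with the statement that the two selection methods only differ when dominance is not negligible.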


Subjects
Plant Breeding , Genetic Selection , Plant Breeding/methods , Genome , Genomics/methods , Inbreeding , Agricultural Crops/genetics , Genetic Models
13.
Sci Rep ; 13(1): 1640, 2023 01 30.
Article in English | MEDLINE | ID: mdl-36717606

ABSTRACT

Social insects are very successful invasive species, and the continued increase of global trade and transportation has exacerbated this problem. The yellow-legged hornet, Vespa velutina nigrithorax (henceforth Asian hornet), is drastically expanding its range in Western Europe. As an apex insect predator, this hornet poses a serious threat to the honey bee industry and endemic pollinators. Current suppression methods have proven too inefficient and expensive to limit its spread. Gene drives might be an effective tool to control this species, but their use has not yet been thoroughly investigated in social insects. Here, we built a model that matches the hornet's life history and modelled the effect of different gene drive scenarios on an established invasive population. To test the broader applicability and sensitivity of the model, we also incorporated the invasive European paper wasp Polistes dominula. We find that, due to the haplodiploidy of social hymenopterans, only a gene drive targeting female fertility is promising for population control. Our results show that although a gene drive can suppress a social wasp population, it can only do so under fairly stringent gene drive-specific conditions. This is due to a combination of two factors: first, the large number of surviving offspring that social wasp colonies produce makes it possible that, even with very limited formation of resistance alleles, such alleles can quickly spread and rescue the population. Second, due to social wasp life history, infertile individuals do not compete with fertile ones, allowing fertile individuals to maintain a large population size even when drive alleles are widespread. Nevertheless, continued improvements in gene drive technology may make it a promising method for the control of invasive social insects in the future.


Subjects
Gene Drive Technology , Wasps , Female , Bees/genetics , Animals , Wasps/genetics , Europe , Fertility , Introduced Species
14.
Plant Genome ; 16(1): e20282, 2023 03.
Article in English | MEDLINE | ID: mdl-36349831

ABSTRACT

Tea [Camellia sinensis (L.) O. Kuntze] is mainly grown in low- to middle-income countries (LMIC) and is a global commodity. Breeding programs in these countries face the challenge of increasing genetic gain because the accuracy of selecting superior genotypes is low and resources are limited. Phenotypic selection (PS) is traditionally the primary method of developing improved tea varieties and can take over 16 yr. Genomic selection (GS) can be used to improve the efficiency of tea breeding by increasing selection accuracy and shortening the generation interval and breeding cycle. Our main objective was to investigate the potential of implementing GS in tea-breeding programs to speed up genetic progress despite the low cost of PS in LMIC. We used stochastic simulations to compare three GS-breeding programs with a pedigree-based program and a PS program. The PS program mimicked a practical commercial tea-breeding program over a 40-yr breeding period. All the GS programs achieved at least 1.65 times higher genetic gains than the PS program and 1.4 times higher than the Seed-Ped program. Seed-GSc was the most cost-effective strategy for implementing GS in tea-breeding programs. It introduces GS at the seedling stage to increase selection accuracy early in the program and reduces the generation interval to 2 yr. The Seed-Ped program outperformed PS by 1.2 times and could be implemented where it is not possible to use GS. Our results indicate that GS could be used to improve genetic gain per unit time and cost even in cost-constrained tea-breeding programs.
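The argument that higher selection accuracy and a shorter generation interval raise genetic gain per unit time follows from the classical breeder's equation. The sketch below applies that general formula; it is not the stochastic simulation used in the study, and every input value is a hypothetical placeholder apart from the 16-yr and 2-yr intervals mentioned in the abstract.

```python
def gain_per_year(intensity, accuracy, sd_additive, generation_interval_years):
    """Expected genetic gain per year from the breeder's equation:
    delta_G = (i * r * sigma_a) / L."""
    if generation_interval_years <= 0:
        raise ValueError("generation interval must be positive")
    return intensity * accuracy * sd_additive / generation_interval_years


if __name__ == "__main__":
    # Hypothetical inputs: same selection intensity and additive standard
    # deviation, different accuracy and generation interval.
    phenotypic = gain_per_year(1.4, 0.30, 1.0, 16.0)
    genomic = gain_per_year(1.4, 0.45, 1.0, 2.0)
    print(round(phenotypic, 4), round(genomic, 4), round(genomic / phenotypic, 1))
```

The ratio printed here depends entirely on the placeholder inputs and should not be compared with the gains reported in the abstract, which come from full simulations of whole breeding programs.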


Subjects
Plant Breeding , Genetic Selection , Plant Breeding/methods , Genome , Genomics/methods , Tea
15.
Genet Sel Evol ; 54(1): 76, 2022 Nov 22.
Article in English | MEDLINE | ID: mdl-36418945

ABSTRACT

BACKGROUND: By entering the era of mega-scale genomics, we are facing many computational issues with standard genomic evaluation models due to their dense data structure and cubic computational complexity. Several scalable approaches have been proposed to address this challenge, such as the Algorithm for Proven and Young (APY). In APY, genotyped animals are partitioned into core and non-core subsets, which induces a sparser inverse of the genomic relationship matrix. This partitioning is often done at random. While APY is a good approximation of the full model, random partitioning can make results unstable, possibly affecting accuracy or even reranking animals. Here we present a stable optimisation of the core subset by choosing animals with the most informative genotype data. METHODS: We derived a novel algorithm for optimising the core subset based on a conditional genomic relationship matrix or a conditional single nucleotide polymorphism (SNP) genotype matrix. We compared the accuracy of genomic predictions with different core subsets for simulated and real pig data sets. The core subsets were constructed (1) at random, (2) based on the diagonal of the genomic relationship matrix, (3) at random with weights from (2), or (4) based on the novel conditional algorithm. To understand the different core subset constructions, we visualise the population structure of the genotyped animals with linear Principal Component Analysis and non-linear Uniform Manifold Approximation and Projection. RESULTS: All core subset constructions performed equally well when the number of core animals captured most of the variation in the genomic relationships, both in simulated and real data sets. When the number of core animals was not sufficiently large, there was substantial variability in the results with the random construction but no variability with the conditional construction. Visualisation of the population structure and chosen core animals showed that the conditional construction spreads core animals across the whole domain of genotyped animals in a repeatable manner. CONCLUSIONS: Our results confirm that the size of the core subset in APY is critical. Furthermore, the results show that the core subset can be optimised with the conditional algorithm that achieves an optimal and repeatable spread of core animals across the domain of genotyped animals.
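How a core/non-core partition yields a sparser inverse can be shown on a toy matrix. The sketch below follows the commonly published form of the APY inverse (invert the core block directly, then add one rank-one term per non-core animal); it does not implement the conditional core-selection algorithm proposed in the abstract. The function names are hypothetical, and the pure-Python matrix inverse is only suitable for tiny examples.

```python
def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def apy_inverse(g, core):
    """APY approximation to the inverse of relationship matrix g.

    g    : symmetric positive definite matrix as a list of lists
    core : indices of the core animals; all others are non-core
    """
    n = len(g)
    core = list(core)
    noncore = [i for i in range(n) if i not in core]
    gcc_inv = inverse([[g[i][j] for j in core] for i in core])
    out = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(core):
        for b, j in enumerate(core):
            out[i][j] = gcc_inv[a][b]
    for k in noncore:
        # Regression of non-core animal k on the core animals.
        w = [sum(gcc_inv[a][b] * g[core[b]][k] for b in range(len(core)))
             for a in range(len(core))]
        # Conditional variance of animal k given the core animals.
        m_kk = g[k][k] - sum(g[k][core[a]] * w[a] for a in range(len(core)))
        v = {core[a]: -w[a] for a in range(len(core))}
        v[k] = 1.0
        for i, vi in v.items():
            for j, vj in v.items():
                out[i][j] += vi * vj / m_kk
    return out


if __name__ == "__main__":
    g = [[1.0, 0.5, 0.25, 0.1],
         [0.5, 1.0, 0.5, 0.2],
         [0.25, 0.5, 1.0, 0.3],
         [0.1, 0.2, 0.3, 1.0]]
    approx = apy_inverse(g, core=[0, 1])
    # Non-core animals 2 and 3 get no off-diagonal element between them.
    print(approx[2][3], approx[3][2])
```

Only the core block needs a dense inversion, so the cost is driven by the number of core animals, which is why the choice and size of the core subset matter.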


Subjects
Genome , Genetic Models , Swine , Animals , Genomics/methods , Genotype , Algorithms
16.
Sci Rep ; 12(1): 18023, 2022 10 26.
Article in English | MEDLINE | ID: mdl-36289298

ABSTRACT

Rubber tree (Hevea brasiliensis) is the main feedstock for commercial rubber; however, its long vegetative cycle has hindered the development of more productive varieties via breeding programs. With the availability of H. brasiliensis genomic data, several linkage maps with associated quantitative trait loci have been constructed and suggested as a tool for marker-assisted selection. Nonetheless, novel genomic strategies are still needed, and genomic selection (GS) may facilitate rubber tree breeding programs aimed at reducing the required cycles for performance assessment. Even though such a methodology has already been shown to be a promising tool for rubber tree breeding, increased model predictive capabilities and practical application are still needed. Here, we developed a novel machine learning-based approach for predicting rubber tree stem circumference based on molecular markers. Through a divide-and-conquer strategy, we propose a neural network prediction system with two stages: (1) subpopulation prediction and (2) phenotype estimation. This approach yielded higher accuracies than traditional statistical models in a single-environment scenario. By delivering large accuracy improvements, our methodology represents a powerful tool for use in Hevea GS strategies. Therefore, the incorporation of machine learning techniques into rubber tree GS represents an opportunity to build more robust models and optimize Hevea breeding programs.


Subjects
Hevea; Hevea/genetics; Hevea/metabolism; Rubber/metabolism; Plant Breeding; Genomics; Machine Learning
17.
OMICS ; 26(11): 586-588, 2022 11.
Article in English | MEDLINE | ID: mdl-36315198

ABSTRACT

In this perspective analysis, we strive to answer the following question: how can we advance integrative biology research in the 21st century with lessons from animal science? From the University of Ljubljana, Biotechnical Faculty, Department of Animal Science, we share three lessons learned over the two decades from 2002 to 2022 that we believe could inform integrative biology, systems science, and animal science scholarship in other countries and geographies. Cultivating multiomics knowledge through a conceptual lens of integrative biology is crucial for life sciences research that can stand the test of diverse biological, clinical, and ecological contexts. Moreover, in the era of the COVID-19 pandemic, animal nutrition and animal science, and the study of their interactions with human health (and vice versa) through integrative biology approaches, hold enormous promise and significance for systems medicine and ecosystem health.


Subjects
Biological Science Disciplines; COVID-19; Animals; Humans; History, 21st Century; Ecosystem; Pandemics; COVID-19/epidemiology; Biology
18.
Genet Sel Evol ; 54(1): 65, 2022 Sep 24.
Article in English | MEDLINE | ID: mdl-36153511

ABSTRACT

BACKGROUND: Early simulations indicated that whole-genome sequence data (WGS) could improve the accuracy of genomic predictions within and across breeds. However, empirical results have been ambiguous so far. Large datasets that capture most of the genomic diversity in a population must be assembled so that allele substitution effects are estimated with high accuracy. The objectives of this study were to use a large pig dataset from seven intensely selected lines to assess the benefits of using WGS for genomic prediction compared to using commercial marker arrays and to identify scenarios in which WGS provides the largest advantage. METHODS: We sequenced 6931 individuals from seven commercial pig lines with different numerical sizes. Genotypes of 32.8 million variants were imputed for 396,100 individuals (17,224 to 104,661 per line). We used BayesR to perform genomic prediction for eight complex traits. Genomic predictions were performed using either data from a standard marker array or variants preselected from WGS based on association tests. RESULTS: The accuracies of genomic predictions based on preselected WGS variants were not robust across traits and lines and the improvements in prediction accuracy that we achieved so far with WGS compared to standard marker arrays were generally small. The most favourable results for WGS were obtained when the largest training sets were available and standard marker arrays were augmented with preselected variants with statistically significant associations to the trait. With this method and training sets of around 80k individuals, the accuracy of within-line genomic predictions was on average improved by 0.025. With multi-line training sets, improvements of 0.04 compared to marker arrays could be expected. CONCLUSIONS: Our results showed that WGS has limited potential to improve the accuracy of genomic predictions compared to marker arrays in intensely selected pig lines. Thus, although we expect that larger improvements in accuracy from the use of WGS are possible with a combination of larger training sets and optimised pipelines for generating and analysing such datasets, the use of WGS in the current implementations of genomic prediction should be carefully evaluated against the cost of large-scale WGS data on a case-by-case basis.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Alleles; Animals; Genomics/methods; Genotype; Swine/genetics
19.
Theor Appl Genet ; 135(10): 3393-3415, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36066596

ABSTRACT

KEY MESSAGE: The integration of known and latent environmental covariates within a single-stage genomic selection approach provides breeders with an informative and practical framework to utilise genotype by environment interaction for prediction into current and future environments. This paper develops a single-stage genomic selection approach which integrates known and latent environmental covariates within a special factor analytic framework. The factor analytic linear mixed model of Smith et al. (2001) is an effective method for analysing multi-environment trial (MET) datasets, but has limited practicality since the underlying factors are latent so the modelled genotype by environment interaction (GEI) is observable, rather than predictable. The advantage of using random regressions on known environmental covariates, such as soil moisture and daily temperature, is that the modelled GEI becomes predictable. The integrated factor analytic linear mixed model (IFA-LMM) developed in this paper includes a model for predictable and observable GEI in terms of a joint set of known and latent environmental covariates. The IFA-LMM is demonstrated on a late-stage cotton breeding MET dataset from Bayer CropScience. The results show that the known covariates predominately capture crossover GEI and explain 34.4% of the overall genetic variance. The most notable covariates are maximum downward solar radiation (10.1%), average cloud cover (4.5%) and maximum temperature (4.0%). The latent covariates predominately capture non-crossover GEI and explain 40.5% of the overall genetic variance. The results also show that the average prediction accuracy of the IFA-LMM is [Formula: see text] higher than conventional random regression models for current environments and [Formula: see text] higher for future environments. The IFA-LMM is therefore an effective method for analysing MET datasets which also utilises crossover and non-crossover GEI for genomic prediction into current and future environments. This is becoming increasingly important with the emergence of rapidly changing environments and climate change.
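As a rough sketch of the model structure this abstract describes, the genetic effect of genotype i in environment j can be written as a regression on known covariates plus a set of latent factors. The notation here is assumed for illustration and is not taken from the paper; it follows the generic form of random regression and factor analytic models.

```latex
u_{ij} \;=\; \underbrace{\sum_{k=1}^{K} s_{jk}\,\beta_{ik}}_{\text{known covariates (predictable GEI)}}
\;+\; \underbrace{\sum_{l=1}^{L} \lambda_{jl}\,f_{il}}_{\text{latent covariates (observable GEI)}}
\;+\; \delta_{ij}
```

Here s_{jk} is the value of known covariate k in environment j, \beta_{ik} is the random regression slope of genotype i on that covariate, \lambda_{jl} is the loading of environment j on latent factor l, f_{il} is the score of genotype i on that factor, and \delta_{ij} is a residual genetic effect. Under this reading, only the first sum can be evaluated for a future environment, because the s_{jk} can be measured or forecast there while the loadings \lambda_{jl} are estimated only for environments already in the trial data.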


Subjects
Gene-Environment Interaction; Models, Genetic; Genomics; Genotype; Plant Breeding; Soil
20.
Sci Rep ; 12(1): 12499, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35864135

ABSTRACT

Poaceae, among the most abundant plant families, includes many economically important polyploid species, such as forage grasses and sugarcane (Saccharum spp.). These species have elevated genomic complexities and limited genetic resources, hindering the application of marker-assisted selection strategies. Currently, the most promising approach for increasing genetic gains in plant breeding is genomic selection. However, due to the polyploid nature of these species, more accurate models for incorporating genomic selection into breeding schemes are needed. This study aims to develop a machine learning method by using a joint learning approach to predict complex traits from genotypic data. Biparental populations of sugarcane and two species of forage grasses (Urochloa decumbens, Megathyrsus maximus) were genotyped, and several quantitative traits were measured. High-quality markers were used to predict several traits in different cross-validation scenarios. By combining classification and regression strategies, we developed a predictive system with promising results. Compared with traditional genomic prediction methods, the proposed strategy achieved accuracy improvements exceeding 50%. Our results suggest that the developed methodology could be implemented in breeding programs, helping reduce breeding cycles and increase genetic gains.


Subjects
Poaceae; Saccharum; Genomics/methods; Phenotype; Plant Breeding; Poaceae/genetics; Polyploidy; Saccharum/genetics