Search | BVS IEC

1.

Phylogenomic discovery of deleterious mutations facilitates hybrid potato breeding.

Wu, Yaoyao; Li, Dawei; Hu, Yong; Li, Hongbo; Ramstein, Guillaume P; Zhou, Shaoqun; Zhang, Xinyan; Bao, Zhigui; Zhang, Yu; Song, Baoxing; Zhou, Yao; Zhou, Yongfeng; Gagnon, Edeline; Särkinen, Tiina; Knapp, Sandra; Zhang, Chunzhi; Städler, Thomas; Buckler, Edward S; Huang, Sanwen.

Cell ; 186(11): 2313-2328.e15, 2023 05 25.

Article in English | MEDLINE | ID: mdl-37146612

ABSTRACT

Hybrid potato breeding will transform the crop from a clonally propagated tetraploid to a seed-reproducing diploid. Historical accumulation of deleterious mutations in potato genomes has hindered the development of elite inbred lines and hybrids. Utilizing a whole-genome phylogeny of 92 Solanaceae and its sister clade species, we employ an evolutionary strategy to identify deleterious mutations. The deep phylogeny reveals the genome-wide landscape of highly constrained sites, comprising ~2.4% of the genome. Based on a diploid potato diversity panel, we infer 367,499 deleterious variants, of which 50% occur at non-coding and 15% at synonymous sites. Counterintuitively, diploid lines with relatively high homozygous deleterious burden can be better starting material for inbred-line development, despite showing less vigorous growth. Inclusion of inferred deleterious mutations increases genomic-prediction accuracy for yield by 24.7%. Our study generates insights into the genome-wide incidence and properties of deleterious mutations and their far-reaching consequences for breeding.

Subjects

Plant Breeding, Solanum tuberosum, Diploidy, Mutation, Phylogeny, Solanum tuberosum/genetics

2.

Leveraging microbiome information for animal genetic improvement.

Venegas, Lucas; López, Paulina; Derome, Nicolas; Yáñez, José M.

Trends Genet ; 39(10): 721-723, 2023 10.

Article in English | MEDLINE | ID: mdl-37516623

ABSTRACT

There is growing evidence that the microbiome influences host phenotypic variation. Incorporating information about the holobiont - the host and its microbiome - into genomic prediction models may accelerate genetic improvements in farmed animal populations. Importantly, these models must account for the indirect effects of the host genome on microbiome-mediated phenotypes.

Subjects

Microbiota, Animals, Microbiota/genetics, Genome/genetics, Genomics, Phenotype, Genetic Models

3.

polyGBLUP: a modified genomic best linear unbiased prediction improved the genomic prediction efficiency for autopolyploid species.

Song, Hailiang; Zhang, Qin; Hu, Hongxia.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38517695

ABSTRACT

Given the universality of autopolyploid species in nature, it is crucial to develop genomic selection methods that consider different allele dosages for autopolyploid breeding. However, no method has been developed to deal with autopolyploid data regardless of the ploidy level. In this study, we developed a modified genomic best linear unbiased prediction (GBLUP) model (polyGBLUP) through constructing additive and dominant genomic relationship matrices based on different allele dosages. polyGBLUP could carry out genomic prediction for autopolyploid species regardless of the ploidy level. Through comprehensive simulations and analysis of real data of autotetraploid blueberry and guinea grass and autohexaploid sweet potato, the results showed that polyGBLUP achieved higher prediction accuracy than GBLUP and its superiority was more obvious when the ploidy level of autopolyploids was high. Furthermore, when the dominant effect was added to polyGBLUP (polyGDBLUP), the greater the dominance degree, the more obvious the advantages of polyGDBLUP over the diploid models in terms of prediction accuracy, bias, mean squared error and mean absolute error. For real data, the superiority of polyGBLUP over GBLUP appeared in blueberry and sweet potato populations and a part of the traits in guinea grass population due to the high correlation coefficients between diploid and polyploidy genomic relationship matrices. In addition, polyGDBLUP did not produce higher prediction accuracy than polyGBLUP for most traits of real data as dominant genetic variance was not captured for these traits. Our study provides a promising method for genomic prediction of autopolyploid species.

Subjects

Genome, Genomics, Humans, Genomics/methods, Phenotype, Ploidies, Polyploidy, Genetic Models, Genotype, Single Nucleotide Polymorphism
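
The abstract of entry 3 describes building an additive genomic relationship matrix from allele dosages at any ploidy level. A minimal sketch of that idea follows. It assumes a VanRaden-style scaling generalised to ploidy v (centre each dosage by v·p, divide by v·Σp(1−p)); this is an illustrative assumption and is not claimed to be the exact polyGBLUP construction. The function name and the toy data are hypothetical.

```python
def additive_grm(dosages, ploidy):
    """Dosage-based additive genomic relationship matrix.

    dosages: list of individuals, each a list of allele counts in 0..ploidy.
    Assumption: VanRaden-style scaling generalised to any ploidy; this is
    not claimed to be the exact polyGBLUP construction.
    """
    n = len(dosages)
    m = len(dosages[0])
    # Allele frequency per marker, estimated from the sample itself.
    freqs = [sum(ind[j] for ind in dosages) / (ploidy * n) for j in range(m)]
    # Centre each dosage by its expected value ploidy * p.
    z = [[ind[j] - ploidy * freqs[j] for j in range(m)] for ind in dosages]
    # A dosage has binomial variance ploidy * p * (1 - p); sum over markers.
    scale = ploidy * sum(p * (1 - p) for p in freqs)
    return [[sum(z[i][k] * z[j][k] for k in range(m)) / scale
             for j in range(n)] for i in range(n)]


# Hypothetical autotetraploid example: 3 individuals, 4 markers.
tetraploid = [[0, 2, 4, 1], [1, 2, 3, 0], [4, 0, 1, 2]]
G = additive_grm(tetraploid, ploidy=4)
```

Setting ploidy=2 recovers the familiar diploid scaling, which is why a single function can serve any ploidy level in this sketch.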

4.

A perspective on genetic and polygenic risk scores-advances and limitations and overview of associated tools.

Schwarzerova, Jana; Hurta, Martin; Barton, Vojtech; Lexa, Matej; Walther, Dirk; Provaznik, Valentine; Weckwerth, Wolfram.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38770718

ABSTRACT

Polygenic Risk Scores are used to evaluate an individual's vulnerability to developing specific diseases or conditions based on their genetic composition, by taking into account numerous genetic variations. This article provides an overview of the concept of Polygenic Risk Scores (PRS). We elucidate the historical advancements of PRS, their advantages and shortcomings in comparison with other predictive methods, and discuss their conceptual limitations in light of the complexity of biological systems. Furthermore, we provide a survey of published tools for computing PRS and associated resources. The various tools and software packages are categorized based on their technical utility for users or prospective developers. Understanding the array of available tools and their limitations is crucial for accurately assessing and predicting disease risks, facilitating early interventions, and guiding personalized healthcare decisions. We also identify potential new avenues for future bioinformatic analyses and advancements related to PRS.

Subjects

Genetic Predisposition to Disease, Multifactorial Inheritance, Software, Humans, Computational Biology/methods, Genome-Wide Association Study/methods, Risk Factors, Risk Assessment/methods, Genetic Risk Score
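
The abstract of entry 4 describes a polygenic risk score as a quantity that combines many genetic variants. In its simplest textbook form this is a weighted sum of effect-allele dosages. The sketch below illustrates only that basic form; the function name, effect sizes, and genotype are hypothetical, and the tools surveyed in the article typically add steps (such as handling linkage disequilibrium) that are not shown here.

```python
def polygenic_score(dosages, effects):
    """Weighted sum of effect-allele dosages for one individual."""
    if len(dosages) != len(effects):
        raise ValueError("one effect size is needed per variant")
    return sum(d * b for d, b in zip(dosages, effects))


# Hypothetical per-variant effect sizes (e.g. from GWAS summary statistics).
effects = [0.5, -0.25, 1.0]
# Hypothetical diploid genotype: copies of the effect allele at each variant.
person = [2, 1, 0]
score = polygenic_score(person, effects)  # 2*0.5 + 1*(-0.25) + 0*1.0
```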

5.

Improving multi-population genomic prediction accuracy using multi-trait GBLUP models which incorporate global or local genetic correlation information.

Teng, Jun; Zhai, Tingting; Zhang, Xinyi; Zhao, Changheng; Wang, Wenwen; Tang, Hui; Wang, Dan; Shang, Yingli; Ning, Chao; Zhang, Qin.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38856170

ABSTRACT

In the application of genomic prediction, a situation often faced is that there are multiple populations in which genomic prediction (GP) needs to be conducted. A common way to handle the multi-population GP is simply to combine the multiple populations into a single population. However, since these populations may be subject to different environments, there may exist genotype-environment interactions which may affect the accuracy of genomic prediction. In this study, we demonstrated that multi-trait genomic best linear unbiased prediction (MTGBLUP) can be used for multi-population genomic prediction, whereby the performances of a trait in different populations are regarded as different traits, and thus multi-population prediction is regarded as multi-trait prediction by employing the between-population genetic correlation. Using real datasets, we showed that MTGBLUP outperformed the conventional multi-population model that simply combines different populations together. We further proposed that MTGBLUP can be improved by partitioning the global between-population genetic correlation into local genetic correlations (LGC). We suggested two LGC models, LGC-model-1 and LGC-model-2, which partition the genome into regions with and without significant LGC (LGC-model-1) or regions with and without strong LGC (LGC-model-2). In analysis of real datasets, we demonstrated that the LGC models could universally increase the prediction accuracy and the relative improvement over MTGBLUP reached up to 163.86% (25.64% on average).

Subjects

Genomics, Genetic Models, Genomics/methods, Population Genetics/methods, Quantitative Trait Loci, Humans, Algorithms, Genotype

6.

MAK: a machine learning framework improved genomic prediction via multi-target ensemble regressor chains and automatic selection of assistant traits.

Liang, Mang; Cao, Sheng; Deng, Tianyu; Du, Lili; Li, Keanning; An, Bingxing; Du, Yueying; Xu, Lingyang; Zhang, Lupei; Gao, Xue; Li, Junya; Guo, Peng; Gao, Huijiang.

Brief Bioinform ; 24(2), 2023 03 19.

Article in English | MEDLINE | ID: mdl-36752363

ABSTRACT

Incorporating the genotypic and phenotypic information of correlated traits into the multi-trait model can significantly improve the prediction accuracy of the target trait in animal and plant breeding, as well as human genetics. However, in most cases, the phenotypic information of both the correlated and target traits of the individual to be evaluated is missing, particularly for newborns. Therefore, we propose a machine learning framework, MAK, to improve the prediction accuracy of the target trait by constructing multi-target ensemble regression chains and selecting the assistant trait automatically, which predicts the genomic estimated breeding values of the target trait using genotypic information only. The prediction ability of MAK was significantly more robust than the genomic best linear unbiased prediction, BayesB, BayesRR and the multi-trait Bayesian method in the four real animal and plant datasets, and MAK ran roughly 100 times faster than BayesB and BayesRR.

Subjects

Genetic Models, Plant Breeding, Animals, Humans, Newborn Infant, Bayes Theorem, Phenotype, Genomics/methods, Genotype, Machine Learning

7.

A transformer-based genomic prediction method fused with knowledge-guided module.

Wu, Cuiling; Zhang, Yiyi; Ying, Zhiwen; Li, Ling; Wang, Jun; Yu, Hui; Zhang, Mengchen; Feng, Xianzhong; Wei, Xinghua; Xu, Xiaogang.

Brief Bioinform ; 25(1), 2023 11 22.

Article in English | MEDLINE | ID: mdl-38058185

ABSTRACT

Genomic prediction (GP) uses single nucleotide polymorphisms (SNPs) to establish associations between markers and phenotypes. Selection of early individuals by genomic estimated breeding value shortens the generation interval and speeds up the breeding process. Recently, methods based on deep learning (DL) have gained great attention in the field of GP. In this study, we explore the application of Transformer-based structures to GP and develop a novel deep-learning model named GPformer. GPformer obtains a global view by gleaning beneficial information from all relevant SNPs regardless of the physical distance between SNPs. Comprehensive experimental results on five different crop datasets show that GPformer outperforms ridge regression-based linear unbiased prediction (RR-BLUP), support vector regression (SVR), light gradient boosting machine (LightGBM) and deep neural network genomic prediction (DNNGP) in terms of mean absolute error, Pearson's correlation coefficient and the proposed metric consistent index. Furthermore, we introduce a knowledge-guided module (KGM) to extract genome-wide association studies-based information, which is fused into GPformer as prior knowledge. KGM is very flexible and can be plugged into any DL network. Ablation studies of KGM on three datasets illustrate the efficiency of KGM adequately. Moreover, GPformer is robust and stable to hyperparameters and can generalize to each phenotype of every dataset, which is suitable for practical application scenarios.

Subjects

Genome-Wide Association Study, Genetic Models, Humans, Genotype, Bayes Theorem, Genomics/methods, Phenotype, Single Nucleotide Polymorphism

8.

Additive genetic effects in interacting species jointly determine the outcome of caterpillar herbivory.

Gompert, Zachariah; Saley, Tara; Philbin, Casey; Yoon, Su'ad A; Perry, Eva; Sneck, Michelle E; Harrison, Joshua G; Buerkle, C Alex; Fordyce, James A; Nice, Chris C; Dodson, Craig D; Lebeis, Sarah L; Lucas, Lauren K; Forister, Matthew L.

Proc Natl Acad Sci U S A ; 119(36): e2206052119, 2022 09 06.

Article in English | MEDLINE | ID: mdl-36037349

ABSTRACT

Plant-insect interactions are common and important in basic and applied biology. Trait and genetic variation can affect the outcome and evolution of these interactions, but the relative contributions of plant and insect genetic variation and how these interact remain unclear and are rarely subject to assessment in the same experimental context. Here, we address this knowledge gap using a recent host-range expansion onto alfalfa by the Melissa blue butterfly. Common garden rearing experiments and genomic data show that caterpillar performance depends on plant and insect genetic variation, with insect genetics contributing to performance earlier in development and plant genetics later. Our models of performance based on caterpillar genetics retained predictive power when applied to a second common garden. Much of the plant genetic effect could be explained by heritable variation in plant phytochemicals, especially saponins, peptides, and phosphatidyl cholines, providing a possible mechanistic understanding of variation in the species interaction. We find evidence of polygenic, mostly additive effects within and between species, with consistent effects of plant genotype on growth and development across multiple butterfly species. Our results inform theories of plant-insect coevolution and the evolution of diet breadth in herbivorous insects and other host-specific parasites.

Subjects

Butterflies, Herbivory, Plants, Animals, Butterflies/genetics, Genotype, Herbivory/genetics, Larva, Plants/genetics

9.

Improving GWAS discovery and genomic prediction accuracy in biobank data.

Orliac, Etienne J; Trejo Banos, Daniel; Ojavee, Sven E; Läll, Kristi; Mägi, Reedik; Visscher, Peter M; Robinson, Matthew R.

Proc Natl Acad Sci U S A ; 119(31): e2121279119, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35905320

ABSTRACT

Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R2 was 47% in a UK Biobank holdout sample, which was 76% of the estimated [Formula: see text]. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average [Formula: see text] value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.

Subjects

Genetic Databases, Genome-Wide Association Study, Precision Medicine, Heritable Quantitative Trait, Bayes Theorem, England, Estonia, Genomics, Genotype, Humans, Phenotype, Single Nucleotide Polymorphism

10.

Cost-effective genomic prediction of critical economic traits in sturgeons through low-coverage sequencing.

Song, Hailiang; Dong, Tian; Wang, Wei; Jiang, Boyun; Yan, Xiaoyu; Geng, Chenfan; Bai, Song; Xu, Shijian; Hu, Hongxia.

Genomics ; 116(4): 110874, 2024 07.

Article in English | MEDLINE | ID: mdl-38839024

ABSTRACT

Low-coverage whole-genome sequencing (LCS) offers a cost-effective alternative for sturgeon breeding, especially given the lack of SNP chips and the high costs associated with whole-genome sequencing. In this study, the efficiency of LCS for genotype imputation and genomic prediction was assessed in 643 sequenced Russian sturgeons (~13.68×). The results showed that using BaseVar+STITCH at a sequencing depth of 2× with a sample size larger than 300 resulted in the highest genotyping accuracy. In addition, when the sequencing depth reached 0.5× and SNP density was reduced to 50 K through linkage disequilibrium pruning, the prediction accuracy was comparable to that of whole sequencing depth. Furthermore, an incremental feature selection method has the potential to improve prediction accuracy. This study suggests that the combination of LCS and imputation can be a cost-effective strategy, contributing to the genetic improvement of economic traits and promoting genetic gains in aquaculture species.

Subjects

Fishes, Single Nucleotide Polymorphism, Fishes/genetics, Animals, Whole Genome Sequencing/economics, Whole Genome Sequencing/methods, Genomics/methods, Genomics/economics, Cost-Benefit Analysis, Linkage Disequilibrium

11.

Ensemble learning for integrative prediction of genetic values with genomic variants.

Gu, Lin-Lin; Yang, Run-Qing; Wang, Zhi-Yong; Jiang, Dan; Fang, Ming.

BMC Bioinformatics ; 25(1): 120, 2024 Mar 21.

Article in English | MEDLINE | ID: mdl-38515026

ABSTRACT

BACKGROUND: Whole genome variants offer sufficient information for genetic prediction of human disease risk, and prediction of animal and plant breeding values. Many sophisticated statistical methods have been developed for enhancing the predictive ability. However, each method has its own advantages and disadvantages, and so far no single method beats all others. RESULTS: We herein propose an Ensemble Learning method for Prediction of Genetic Values (ELPGV), which assembles predictions from several basic methods such as GBLUP, BayesA, BayesB and BayesCπ, to produce more accurate predictions. We validated ELPGV with a variety of well-known datasets and a series of simulated datasets. All showed that ELPGV achieved significantly higher predictive ability than any of the basic methods; for instance, the comparison p-values of ELPGV against the basic methods ranged from 4.853E-118 to 9.640E-20 for the WTCCC dataset. CONCLUSIONS: ELPGV is able to integrate the merits of each method to produce significantly higher predictive ability than any of the basic methods, and it is simple to implement and fast to run, without using genotype data. It is promising for wide application in genetic predictions.

Subjects

Genome, Plant Breeding, Animals, Humans, Genotype, Genomics, Machine Learning, Genetic Models, Phenotype, Single Nucleotide Polymorphism, Bayes Theorem
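
The abstract of entry 11 describes assembling the predictions of several base methods into one prediction, using only predictions and phenotypes rather than genotypes. The sketch below illustrates that general idea with two base predictors and a plain grid search over a single blending weight. The grid search is a deliberately simple stand-in and is not the optimisation used by ELPGV; all function names and numbers are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_blend(pred_a, pred_b, observed, steps=20):
    """Grid-search w in [0, 1] for the blend w*pred_a + (1-w)*pred_b."""
    best_w, best_r = 0.0, float("-inf")
    for s in range(steps + 1):
        w = s / steps
        blend = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]
        r = pearson(blend, observed)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


# Hypothetical validation phenotypes and two base-method predictions
# whose errors point in opposite directions.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_a = [1.5, 1.5, 3.5, 3.5, 5.5]
pred_b = [0.5, 2.5, 2.5, 4.5, 4.5]
weight, accuracy = best_blend(pred_a, pred_b, observed)
```

In practice the weight would be tuned on a validation fold and then applied to a separate test fold, since tuning and evaluating on the same phenotypes overstates accuracy.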

12.

Genetic analysis of resistance to bean leaf crumple virus identifies a candidate LRR-RLK gene.

Ariza-Suarez, Daniel; Keller, Beat; Spescha, Anna; Aparicio, Johan Steven; Mayor, Victor; Portilla-Benavides, Ana Elizabeth; Buendia, Hector Fabio; Bueno, Juan Miguel; Studer, Bruno; Raatz, Bodo.

Plant J ; 114(1): 23-38, 2023 04.

Article in English | MEDLINE | ID: mdl-35574650

ABSTRACT

Bean leaf crumple virus (BLCrV) is a novel begomovirus (family Geminiviridae, genus Begomovirus) infecting common bean (Phaseolus vulgaris L.), threatening bean production in Latin America. Genetic resistance is required to ensure yield stability and reduce the use of insecticides, yet the available resistance sources are limited. In this study, three common bean populations containing a total of 558 genotypes were evaluated in different yield and BLCrV resistance trials under natural infection in the field. A genome-wide association study identified the locus BLC7.1 on chromosome Pv07 at 3.31 Mbp, explaining 8 to 16% of the phenotypic variation for BLCrV resistance. In comparison, whole-genome regression models explained 51 to 78% of the variation and identified the same region on Pv07 to confer resistance. The most significantly associated markers were located within the gene model Phvul.007G040400, which encodes a leucine-rich repeat receptor-like kinase subfamily III member and is likely to be involved in the innate immune response against the virus. The allelic diversity within this gene revealed five different haplotype groups, one of which was significantly associated with BLCrV resistance. As the same genome region was previously reported to be associated with resistance against other geminiviruses affecting common bean, our study highlights the role of previous breeding efforts for virus resistance in the accumulation of positive alleles against newly emerging viruses. In addition, we provide novel diagnostic single-nucleotide polymorphism markers for marker-assisted selection to exploit BLC7.1 for breeding against geminivirus diseases in one of the most important food crops worldwide.

Subjects

Genome-Wide Association Study, Phaseolus, Disease Resistance/genetics, Plant Breeding, Genotype, Phaseolus/genetics, Plant Leaves, Plant Diseases/genetics

13.

Genomic dissection of additive and non-additive genetic effects and genomic prediction in an open-pollinated family test of Japanese larch.

Dong, Leiming; Xie, Yunhui; Zhang, Yalin; Wang, Ruizhen; Sun, Xiaomei.

BMC Genomics ; 25(1): 11, 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38166605

ABSTRACT

Genomic dissection of genetic effects on desirable traits and the subsequent use of genomic selection hold great promise for accelerating the rate of genetic improvement of forest tree species. In this study, a total of 661 offspring trees from 66 open-pollinated families of Japanese larch (Larix kaempferi (Lam.) Carrière) were sampled at a test site. The contributions of additive and non-additive effects (dominance, imprinting and epistasis) were evaluated for nine valuable traits related to growth, wood physical and chemical properties, and competitive ability using three pedigree-based and four Genomics-based Best Linear Unbiased Predictions (GBLUP) models and used to determine the genetic model. The predictive ability (PA) of two genomic prediction methods, GBLUP and Reproducing Kernel Hilbert Spaces (RKHS), was compared. The traits could be classified into two types based on different quantitative genetic architectures: for type I, including wood chemical properties and Pilodyn penetration, additive effect is the main source of variation (38.20-67.46%); for type II, including growth, competitive ability and acoustic velocity, epistasis plays a significant role (50.76-91.26%). Dominance and imprinting showed low to moderate contributions (< 36.26%). GBLUP was more suitable for traits of type I (PAs = 0.37-0.39 vs. 0.14-0.25), and RKHS was more suitable for traits of type II (PAs = 0.23-0.37 vs. 0.07-0.23). Non-additive effects make no meaningful contribution to the enhancement of PA of GBLUP method for all traits. These findings enhance our current understanding of the architecture of quantitative traits and lay the foundation for the development of genomic selection strategies in Japanese larch.

Subjects

Larix, Larix/genetics, Genotype, Japan, Genome, Genomics/methods, Phenotype, Genetic Models, Single Nucleotide Polymorphism

14.

Enhancing winter wheat prediction with genomics, phenomics and environmental data.

Montesinos-López, Osval A; Herr, Andrew W; Crossa, José; Montesinos-López, Abelardo; Carter, Arron H.

BMC Genomics ; 25(1): 544, 2024 May 31.

Article in English | MEDLINE | ID: mdl-38822262

ABSTRACT

In the realm of multi-environment prediction, when the goal is to predict a complete environment using the others as a training set, the efficiency of genomic selection (GS) falls short of expectations. Genotype by environment interaction poses a challenge in achieving high prediction accuracies. Consequently, current efforts are focused on enhancing efficiency by integrating various types of inputs, such as phenomics data, environmental information, and other omics data. In this study, we sought to evaluate the impact of incorporating environmental information into the modeling process, in addition to genomic and phenomics information. Our evaluation encompassed five data sets of soft white winter wheat, and the results revealed a significant improvement in prediction accuracy, as measured by the normalized root mean square error (NRMSE), through the integration of environmental information. Notably, there was an average gain in prediction accuracy of 49.19% in terms of NRMSE across the data sets. Moreover, the observed prediction accuracy ranged from 5.68% (data set 3) to 60.36% (data set 4), underscoring the substantial effect of integrating environmental information. By including genomic, phenomic, and environmental data in prediction models, plant breeding programs can improve selection efficiency across locations.

Subjects

Genomics, Phenomics, Triticum, Triticum/genetics, Genomics/methods, Gene-Environment Interaction, Phenotype, Genotype, Plant Breeding, Environment, Plant Genome

15.

Genomic prediction using machine learning: a comparison of the performance of regularized regression, ensemble, instance-based and deep learning methods on synthetic and empirical data.

Lourenço, Vanda M; Ogutu, Joseph O; Rodrigues, Rui A P; Posekany, Alexandra; Piepho, Hans-Peter.

BMC Genomics ; 25(1): 152, 2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38326768

ABSTRACT

BACKGROUND: The accurate prediction of genomic breeding values is central to genomic selection in both plant and animal breeding studies. Genomic prediction involves the use of thousands of molecular markers spanning the entire genome and therefore requires methods able to efficiently handle high dimensional data. Not surprisingly, machine learning methods are becoming widely advocated for and used in genomic prediction studies. These methods encompass different groups of supervised and unsupervised learning methods. Although several studies have compared the predictive performances of individual methods, studies comparing the predictive performance of different groups of methods are rare. However, such studies are crucial for identifying (i) groups of methods with superior genomic predictive performance and assessing (ii) the merits and demerits of such groups of methods relative to each other and to the established classical methods. Here, we comparatively evaluate the genomic predictive performance and informally assess the computational cost of several groups of supervised machine learning methods, specifically, regularized regression methods, deep, ensemble and instance-based learning algorithms, using one simulated animal breeding dataset and three empirical maize breeding datasets obtained from a commercial breeding program. RESULTS: Our results show that the relative predictive performance and computational expense of the groups of machine learning methods depend upon both the data and target traits and that for classical regularized methods, increasing model complexity can incur huge computational costs but does not necessarily always improve predictive accuracy. Thus, despite their greater complexity and computational burden, neither the adaptive nor the group regularized methods clearly improved upon the results of their simple regularized counterparts. 
This rules out selection of one procedure among machine learning methods for routine use in genomic prediction. The results also show that, because of their competitive predictive performance, computational efficiency, simplicity and therefore relatively few tuning parameters, the classical linear mixed model and regularized regression methods are likely to remain strong contenders for genomic prediction. CONCLUSIONS: The dependence of predictive performance and computational burden on target datasets and traits call for increasing investments in enhancing the computational efficiency of machine learning algorithms and computing resources.

Subjects

Deep Learning, Animals, Plant Breeding, Genome, Genomics/methods, Machine Learning

16.

Genomic dissection of the correlation between milk yield and various health traits using functional and evolutionary information about imputed sequence variants of 34,497 German Holstein cows.

Schneider, Helen; Krizanac, Ana-Marija; Falker-Gieske, Clemens; Heise, Johannes; Tetens, Jens; Thaller, Georg; Bennewitz, Jörn.

BMC Genomics ; 25(1): 265, 2024 Mar 09.

Article in English | MEDLINE | ID: mdl-38461236

ABSTRACT

BACKGROUND: Over the last decades, many studies have investigated the genomic connection of milk production and health traits in dairy cattle. Thereby, incorporating functional information in genomic analyses has been shown to improve the understanding of biological and molecular mechanisms shaping complex traits and the accuracies of genomic prediction, especially in small populations and across-breed settings. Still, little is known about the contribution of different functional and evolutionary genome partitioning subsets to milk production and dairy health. Thus, we performed a uni- and a bivariate analysis of milk yield (MY) and eight health traits using a set of ~34,497 German Holstein cows with 50K chip genotypes and ~17 million imputed sequence variants divided into 27 subsets depending on their functional and evolutionary annotation. In the bivariate analysis, eight trait-combinations were observed that contrasted MY with each health trait. Two genomic relationship matrices (GRM) were included, one consisting of the 50K chip variants and one consisting of each set of subset variants, to obtain subset heritabilities and genetic correlations. In addition, 50K chip heritabilities and genetic correlations were estimated applying merely the 50K GRM. RESULTS: In general, 50K chip heritabilities were larger than the subset heritabilities. The largest heritabilities were found for MY, which was 0.4358 for the 50K and 0.2757 for the subset heritabilities. Whereas all 50K genetic correlations were negative, subset genetic correlations were both positive and negative (ranging from -0.9324 between MY and mastitis to 0.6662 between MY and digital dermatitis). The subsets containing variants which were annotated as noncoding related, splice sites, untranslated regions, metabolic quantitative trait loci, and young variants ranked highest in terms of their contribution to the traits' genetic variance. We were able to show that linkage disequilibrium between subset variants and adjacent variants did not cause these subsets' high effect. CONCLUSION: Our results confirm the connection of milk production and health traits in dairy cattle via the animals' metabolic state. In addition, they highlight the potential of including functional information in genomic analyses, which helps to dissect the extent and direction of the observed traits' connection in more detail.

Subjects

Milk, Single Nucleotide Polymorphism, Animals, Female, Cattle/genetics, Phenotype, Genotype, Genomics/methods, Quantitative Trait Loci, Lactation/genetics

17.

A review of SNP heritability estimation methods.

Tang, Mingsheng; Wang, Tong; Zhang, Xuefen.

Brief Bioinform ; 23(3), 2022 05 13.

Article in English | MEDLINE | ID: mdl-35289357

ABSTRACT

Over the past decade, statistical methods have been developed to estimate single nucleotide polymorphism (SNP) heritability, which measures the proportion of phenotypic variance explained by all measured SNPs in the data. Estimates of SNP heritability measure the degree to which the available genetic variants influence phenotypes and improve our understanding of the genetic architecture of complex phenotypes. In this article, we review the recently developed and commonly used SNP heritability estimation methods for continuous and binary phenotypes from the perspective of model assumptions and parameter optimization. We primarily focus on their capacity to handle multiple phenotypes and longitudinal measurements, their ability to partition SNP heritability, and their use of individual-level data versus summary statistics. State-of-the-art statistical methods that are scalable to the UK Biobank dataset are also elucidated in detail.
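
To make "proportion of phenotypic variance explained by all measured SNPs" concrete, here is a minimal sketch of one simple moment-based estimator, Haseman-Elston regression, in its through-the-origin form. The abstract does not list the methods it reviews, so this is only an example of the kind of individual-level method discussed. The function name `he_regression` and the toy inputs are hypothetical.

```python
from itertools import combinations
from math import sqrt


def he_regression(phenotypes, grm):
    """Estimate SNP heritability by regressing phenotype cross-products
    on genomic relationships over all pairs of distinct individuals."""
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    sd = sqrt(sum((y - mean) ** 2 for y in phenotypes) / n)
    # Standardize phenotypes so that E[z_i * z_j] = h2 * G_ij for i != j.
    z = [(y - mean) / sd for y in phenotypes]
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(n), 2):
        numerator += grm[i][j] * z[i] * z[j]
        denominator += grm[i][j] ** 2
    # Least-squares slope through the origin.
    return numerator / denominator
```

With real data the estimate is noisy for small samples and is not constrained to lie between 0 and 1. Likelihood-based methods such as REML are generally more efficient but more expensive to compute.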

Subjects

Genome-Wide Association Study, Single Nucleotide Polymorphism, Genome-Wide Association Study/methods, Genetic Models, Phenotype

18.

High-throughput field phenotyping reveals that selection in breeding has affected the phenology and temperature response of wheat in the stem elongation phase.

Roth, Lukas; Kronenberg, Lukas; Aasen, Helge; Walter, Achim; Hartung, Jens; van Eeuwijk, Fred; Piepho, Hans-Peter; Hund, Andreas.

J Exp Bot ; 75(7): 2084-2099, 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38134290

ABSTRACT

Crop growth and phenology are driven by seasonal changes in environmental variables, with temperature as one important factor. However, knowledge about genotype-specific temperature response and its influence on phenology is limited. Such information is fundamental to improve crop models and adapt selection strategies. We measured the increase in height of 352 European winter wheat varieties in 4 years to quantify phenology, and fitted an asymptotic temperature response model. The model used hourly fluctuations in temperature to parameterize the base temperature (Tmin), the temperature optimum (rmax), and the steepness (lrc) of growth responses. Our results show that higher Tmin and lrc relate to an earlier start and end of stem elongation. A higher rmax relates to an increased final height. Both final height and rmax decreased for varieties originating from the continental east of Europe towards the maritime west. A genome-wide association study (GWAS) indicated a quantitative inheritance and a large degree of independence among loci. Nevertheless, genomic prediction accuracies (GBLUPs) for Tmin and lrc were low (r≤0.32) compared with other traits (r≥0.59). In addition to known major genes related to vernalization, photoperiod, or dwarfing, the GWAS indicated additional, as yet unknown loci that dominate the temperature response.
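
The abstract names the three parameters but does not give the model equation. The sketch below assumes a common asymptotic-regression form in which growth is zero at or below Tmin and approaches rmax as temperature rises, with exp(lrc) as the rate constant. The functional form, the clamping at zero, the reading of rmax as the asymptotic growth rate, and the function names are all assumptions made for illustration.

```python
from math import exp


def growth_rate(temp, t_min, r_max, lrc):
    """Assumed asymptotic response: 0 at or below t_min, rising towards r_max."""
    if temp <= t_min:
        return 0.0
    return r_max * (1.0 - exp(-exp(lrc) * (temp - t_min)))


def height_gain(hourly_temps, t_min, r_max, lrc):
    """Accumulate hourly growth over a series of hourly temperatures."""
    return sum(growth_rate(t, t_min, r_max, lrc) for t in hourly_temps)
```

Under this form, a larger lrc makes the curve reach its plateau at lower temperatures, which matches the abstract's description of lrc as the steepness of the response.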

Subjects

Genome-Wide Association Study, Triticum, Triticum/genetics, Temperature, Quantitative Trait Loci, Plant Breeding, Phenotype

19.

Environmental context of phenotypic plasticity in flowering time in sorghum and rice.

Guo, Tingting; Wei, Jialu; Li, Xianran; Yu, Jianming.

J Exp Bot ; 75(3): 1004-1015, 2024 Feb 02.

Article in English | MEDLINE | ID: mdl-37819624

ABSTRACT

Phenotypic plasticity is an important topic in biology and evolution. However, how to generate broadly applicable insights from individual studies remains a challenge. Here, with flowering time observed from a large geographical region for sorghum and rice genetic populations, we examined the consistency of parameter estimation for reaction norms of genotypes across different subsets of environments and searched for potential strategies to inform the study design. Both sample size and environmental mean range of the subset affected the consistency. The subset with either a large range of environmental mean or a large sample size resulted in genetic parameters consistent with the overall pattern. Furthermore, high accuracy through genomic prediction was obtained for reaction norm parameters of untested genotypes using models built from tested genotypes under the subsets of environments with either a large range or a large sample size. With 1428 and 1674 simulated settings, our analyses suggested that the distribution of environmental index values of a site should be considered in designing experiments. Overall, we showed that environmental context was critical, and considerations should be given to better cover the intended range of the environmental variable. Our findings have implications for the genetic architecture of complex traits, plant-environment interaction, and climate adaptation.
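
A reaction norm of the kind discussed here is often summarized by a straight line relating a genotype's phenotype to an environmental value. The sketch below fits that line by ordinary least squares for one genotype. The linear form, the function name `reaction_norm`, and the toy values are assumptions for illustration and are not taken from the paper.

```python
def reaction_norm(env_values, phenotypes):
    """Fit phenotype = intercept + slope * environment for one genotype.

    Returns (intercept, slope). Requires at least two distinct
    environmental values.
    """
    n = len(env_values)
    mean_x = sum(env_values) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in env_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(env_values, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical flowering times (days) of one genotype in three environments,
# indexed by the mean flowering time of all genotypes in each environment.
intercept, slope = reaction_norm([60.0, 70.0, 80.0], [58.0, 70.0, 82.0])
```

Because the slope estimate divides by the spread of the environmental values, a subset of environments covering a narrow range gives an unstable slope. This is consistent with the abstract's finding that the environmental mean range of the subset affects consistency.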

Subjects

Oryza, Sorghum, Phenotype, Oryza/genetics, Sorghum/genetics, Genotype, Physiological Adaptation

20.

Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs.

Xiang, Tao; Li, Tao; Li, Jielin; Li, Xin; Wang, Jia.

FASEB J ; 37(6): e22961, 2023 06.

Article in English | MEDLINE | ID: mdl-37178007

ABSTRACT

Genomic prediction, which is based on solving linear mixed-model (LMM) equations, is the most popular method for predicting breeding values or phenotypic performance for economic traits in livestock. With the need to further improve the performance of genomic prediction, nonlinear methods have been considered as an alternative and promising approach. Machine learning (ML) approaches, which have developed rapidly, have demonstrated an excellent ability to predict phenotypes in animal husbandry. To investigate the feasibility and reliability of implementing genomic prediction using nonlinear models, the performances of genomic predictions for pig productive traits using the linear genomic selection model and nonlinear machine learning models were compared. Then, to reduce the high-dimensional features of genome sequence data, different machine learning algorithms, including the random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and convolutional neural network (CNN) algorithms, were used to perform genomic feature selection as well as genomic prediction on reduced feature genome data. All of the analyses were performed on two real pig datasets: the published PIC pig dataset and a dataset comprising data from a national pig nucleus herd in Chifeng, North China. Overall, the accuracies of predicted phenotypic performance for traits T1, T2, T3 and T5 in the PIC dataset and average daily gain (ADG) in the Chifeng dataset were higher using the ML methods than the LMM method, while those for trait T4 in the PIC dataset and total number of piglets born (TNB) in the Chifeng dataset were slightly lower using the ML methods than the LMM method. Among all the different ML algorithms, SVM was the most appropriate for genomic prediction. For the genomic feature selection experiment, the most stable and most accurate results across different algorithms were achieved using XGBoost in combination with the SVM algorithm. Through feature selection, the number of genomic markers can be reduced to 1 in 20, while the predictive performance on some traits can even be improved compared to using the full genome data. Finally, we developed a new tool that can be used to execute combined XGBoost and SVM algorithms to realize genomic feature selection and phenotypic prediction.

Subjects

Genomics, Machine Learning, Animals, Swine/genetics, Reproducibility of Results, Genome/genetics, Phenotype, Algorithms
