ABSTRACT
Statistical machine learning (ML) extracts patterns from extensive genomic, phenotypic, and environmental data. ML algorithms automatically identify relevant features and use cross-validation to ensure robust models and improve prediction reliability in new lines. Furthermore, ML analyses of genotype-by-environment (G×E) interactions can offer insights into the genetic factors that affect performance in specific environments. By leveraging historical breeding data, ML streamlines strategies and automates analyses to reveal genomic patterns. In this review we examine the transformative impact of big data, including multi-trait genomics, phenomics, and environmental covariables, on genomic-enabled prediction in plant breeding. We discuss how big data and ML are revolutionizing the field by enhancing prediction accuracy, deepening our understanding of G×E interactions, and optimizing breeding strategies through the analysis of extensive and diverse datasets.
ABSTRACT
Sesame seeds and their edible oil are highly nutritious and rich in mono- and polyunsaturated fatty acids. Bioactive compounds such as sterols, tocopherols, and sesamol provide significant medicinal benefits. The high oil content (50%), the favorable balance of mono- and polyunsaturated fatty acids, and resilience to water stress make sesame a promising candidate crop for global agricultural expansion. However, sesame production faces challenges such as low yields, poor response to agricultural inputs, and losses due to capsule dehiscence. To enhance yield, traits like determinate growth, dwarfism, a high harvest index, non-shattering capsules, disease resistance, and photoperiod sensitivity are needed. These traits can be obtained from existing variation or through induced mutation breeding, whereas crossbreeding methods often result in unwanted genetic changes. CRISPR/Cas9 gene editing has the potential to suppress detrimental alleles and improve the fatty acid profile by inhibiting polyunsaturated fatty acid biosynthesis. Even though sesame is an orphan crop, it has entered the genomic era, with available sequences assisting molecular breeding efforts. This progress aids in associating single-nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) with key economic traits, as well as identifying genes related to adaptability, oil production, fatty acid synthesis, and photosynthesis. Additionally, transcriptomic research can reveal genes involved in abiotic stress responses and adaptation to diverse climates. The mapping of quantitative trait loci (QTL) can identify loci linked to key traits such as capsule size, seed count per capsule, and capsule number per plant. This article reviews recent advances in sesame breeding, discusses ongoing challenges, and explores potential strategies for future improvement. Integrating advanced genomic tools and breeding strategies offers promising ways to enhance sesame production to meet global demands.
ABSTRACT
Bananas (Musa spp.) are among the most widely consumed fruits globally and are grown in tropical and subtropical regions. We evaluated 856 Musa accessions from the breeding programs of the International Institute of Tropical Agriculture in Nigeria, Tanzania, and Uganda; the National Agricultural Research Organization of Uganda; the Brazilian Agricultural Research Corporation (Embrapa); and the National Research Centre for Banana of India. Accessions from the in vitro gene bank at the International Transit Centre in Belgium were included to provide a baseline of available global diversity. A total of 16,903 informative single nucleotide polymorphism markers were used to estimate and characterize the genetic diversity and population structure and to identify overlaps and unique material among the breeding programs. Analysis of molecular variance showed low genetic variation among accessions and diploids and higher variation among tetraploids (p < 0.001). Structure analysis revealed two major clusters corresponding to genomic composition. The results indicate that the banana breeding programs have the potential to increase the diversity of their breeding materials and should exploit this potential for parental improvement and to enhance genetic gains in future breeding efforts.
ABSTRACT
Epistasis refers to nonallelic interaction between genes; when two or more interacting genes affect the same trait, it biases estimates of genetic parameters for the phenotype. Partitioning epistatic effects allows unbiased estimation of the genetic parameters affecting phenotypes. Multigenic variation plays a central role in the evolution of complex characteristics, and pleiotropy, where a single gene affects several phenotypic characters, has a large influence on it. While pleiotropic interactions provide functional specificity, they increase the challenge of gene discovery and functional analysis. Overcoming pleiotropy-based phenotypic trade-offs offers potential for assisting breeding for complex traits. Modelling higher-order nonallelic epistatic interaction, pleiotropy- and non-pleiotropy-induced variation, and genotype × environment interaction in genomic selection may provide new paths to increase the productivity and stress tolerance of the next generation of crop cultivars. Advances in statistical models, software and algorithm development, and genomic research have facilitated dissecting the nature and extent of pleiotropy and epistasis. We review emerging approaches to exploit positive (and avoid negative) epistatic and pleiotropic interactions in a plant breeding context, including developing avenues of artificial intelligence, novel exploitation of large-scale genomics and phenomics data, and involvement of genes with minor effects to analyse epistatic interactions and pleiotropic quantitative trait loci, including missing heritability.
ABSTRACT
The challenges of climate change and population growth are exacerbated by noticeable environmental changes, which can increase the range of plant diseases such as net blotch (NB), a foliar disease that significantly decreases barley (Hordeum vulgare L.) grain yield and quality. Resistant germplasm is usually identified through visual observation and the scoring of disease symptoms; however, this is subjective and time-consuming. Thus, automated, non-destructive, and low-cost disease-scoring approaches are highly relevant to barley breeding. This study presents a novel screening method for evaluating NB severity in barley. The proposed method uses an automated RGB imaging system, together with machine learning, to evaluate different symptoms and the severity of NB. The study was performed on three barley cultivars with distinct levels of resistance to NB (resistant, moderately resistant, and susceptible). The tested approach showed a mean precision of 99% across the categories of NB severity (chlorotic, necrotic, and fungal lesions, along with leaf tip necrosis). The results demonstrate that the proposed method could be effective in assessing NB from barley leaves and specifying the level of NB severity; this type of information could be pivotal to precise selection for NB resistance in barley breeding.
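As an illustration of how a mean precision over several symptom categories can be computed, the following is a minimal sketch. It assumes precision is averaged with equal weight per category (macro-averaging); the abstract does not say how the 99% figure was aggregated, and the labels and predictions below are hypothetical.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: TP / (TP + FP)} for every label that was predicted."""
    true_pos = Counter()
    predicted = Counter()
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_pos[guess] += 1
    return {label: true_pos[label] / predicted[label] for label in predicted}


def mean_precision(y_true, y_pred):
    """Unweighted (macro) mean of the per-class precisions."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical lesion labels, not data from the study.
y_true = ["chlorotic", "chlorotic", "necrotic", "necrotic", "fungal", "tip_necrosis"]
y_pred = ["chlorotic", "necrotic", "necrotic", "necrotic", "fungal", "tip_necrosis"]

print(per_class_precision(y_true, y_pred))
print(mean_precision(y_true, y_pred))
```

In this toy case one chlorotic lesion is misclassified as necrotic, so only the necrotic class loses precision (2 of 3 predictions correct).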
ABSTRACT
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has advanced significantly in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining GP accuracy. Special emphasis was placed on optimizing training population size: we explored its benefits and the diminishing returns beyond an optimum size, while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single-nucleotide polymorphisms, level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine-learning methods, and non-additive effects are the other vital factors. Using wheat, maize, and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP (theoretically reaching one when Pearson's correlation is used as the metric) is an active research area, and accuracy is as yet far from optimal for various traits. We hypothesize that very large genotypic and phenotypic datasets, effective training population optimization methods, and support from other omics approaches (transcriptomics, metabolomics, and proteomics), coupled with deep-learning algorithms, could overcome current limitations to achieve the highest possible prediction accuracy, making genomic selection an effective tool in plant breeding.
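The accuracy metric mentioned above, Pearson's correlation between observed and predicted values, can be sketched as follows. This is a minimal standard-library illustration; the function name and the phenotype values are hypothetical and do not come from the article.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation between observed phenotypes and model predictions."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_o * var_p)


# Hypothetical grain yields (t/ha) for five validation lines.
observed = [4.1, 5.0, 3.6, 6.2, 4.8]
predicted = [4.4, 4.7, 3.9, 5.5, 5.0]
print(round(pearson_r(observed, predicted), 3))
```

A value of one is reached only when predictions are a perfect increasing linear function of the observations, which is why it serves as the theoretical ceiling.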
Subjects
Plant Genome , Plant Breeding , Humans , Plant Genome/genetics , Genetic Selection , Genomics , Phenotype , Genotype , Plants , Single Nucleotide Polymorphism/genetics
ABSTRACT
MAIN CONCLUSION: Molecular mechanisms of biological rhythms provide opportunities to harness functional allelic diversity in core (and trait- or stress-responsive) oscillator networks to develop more climate-resilient and productive germplasm. The circadian clock senses light and temperature in day-night cycles to drive biological rhythms. The clock integrates endogenous signals and exogenous stimuli to coordinate diverse physiological processes. Advances in high-throughput non-invasive assays, use of forward- and inverse-genetic approaches, and powerful algorithms are allowing quantitation of variation and detection of genes associated with circadian dynamics. Circadian rhythms and phytohormone pathways in response to endogenous and exogenous cues have been well documented in the model plant Arabidopsis. Novel allelic variation associated with circadian rhythms facilitates adaptation and range expansion, and may provide additional opportunity to tailor climate-resilient crops. The circadian phase and period can determine adaptation to environments, while robustness in the circadian amplitude can enhance resilience to environmental changes. Circadian rhythms in plants are tightly controlled by multiple interlocked transcriptional-translational feedback loops involving morning (CCA1, LHY), mid-day (PRR9, PRR7, PRR5), and evening (TOC1, ELF3, ELF4, LUX) genes that keep the plant circadian clock ticking. Significant progress has been made in unravelling the functions of circadian rhythms and clock genes that regulate traits, via interaction with phytohormones and trait-responsive genes, in diverse crops. Altered circadian rhythms and clock genes may contribute to hybrid vigor, as shown in Arabidopsis, maize, and rice. Modifying circadian rhythms via transgenesis or genome editing may provide additional opportunities to develop crops with better buffering capacity to environmental stresses.
Models that involve clock gene–phytohormone–trait interactions can provide novel insights to orchestrate circadian rhythms and modulate clock genes to facilitate breeding of all-season crops.
Subjects
Arabidopsis Proteins , Arabidopsis , Circadian Clocks , Circadian Clocks/genetics , Arabidopsis/genetics , Plant Growth Regulators , Plant Breeding , Alleles , Agricultural Crops/genetics , Transcription Factors/genetics
ABSTRACT
At the turn of 2000, many authors envisioned the future of plant breeding. Twenty years later, which of those authors' visions have become reality, which have not, and which may do so in the years to come? After two decades of debate, climate change is a "certainty," food systems have shifted from maximizing farm production to reducing environmental impact, and the hopes placed in GMOs are tempered by their low acceptance among consumers. We review herein how plant breeding may raise or reduce genetic gains based on the breeder's equation. "Accuracy of Selection" has significantly improved through many experimental-scale field and laboratory tools, but also through the popularization of statistical models and the integration of DNA markers into selection. Pre-breeding has promoted the increase of useful "Genetic Variance." Shortening "Recycling Time" has progressed greatly, to the point that achieving a denominator equal to "1" is becoming a possibility. Maintaining high "Selection Intensity" remains the biggest challenge, since adding any technology results in a higher cost per progeny, despite the steady reduction in cost per datapoint. Furthermore, the concepts of variety and seed enterprise might change with the advent of cheaper genomic tools to monitor their use and the promotion of participatory or citizen science. These technological and societal changes influence the new generation of plant breeders, moving them further away from field work and emphasizing instead the use of genomic-based selection methods relying on big data. We envisage what skills the plant breeders of tomorrow might need to address challenges, and whether their time in the field may dwindle.
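The breeder's equation referred to above can be written, in its commonly used form, as gain per year = (selection intensity × accuracy × additive genetic standard deviation) / cycle length. The sketch below assumes that form, with "Genetic Variance" entering through its square root and "Recycling Time" as the denominator; the parameter names and numbers are hypothetical and only show why a denominator of 1 matters.

```python
from math import sqrt


def genetic_gain_per_year(selection_intensity, accuracy, genetic_variance, cycle_years):
    """Expected genetic gain per year under the breeder's equation.

    selection_intensity: standardized selection differential (i)
    accuracy: correlation between selection criterion and true breeding value (r)
    genetic_variance: additive genetic variance; its square root is used
    cycle_years: years needed to recycle selected parents (L)
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    if genetic_variance < 0:
        raise ValueError("genetic_variance cannot be negative")
    return selection_intensity * accuracy * sqrt(genetic_variance) / cycle_years


# Hypothetical values: same program, four-year versus one-year recycling.
slow = genetic_gain_per_year(2.0, 0.5, 4.0, cycle_years=4)
fast = genetic_gain_per_year(2.0, 0.5, 4.0, cycle_years=1)
print(slow, fast)  # 0.5 2.0
```

With the other three terms held constant, shortening the cycle from four years to one multiplies the expected annual gain by four.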
Subjects
Genome , Plant Breeding , Plant Breeding/methods , Genomics , Seeds , Genetic Markers
ABSTRACT
The global production of durum wheat (Triticum durum Desf.) is hindered by a constant rise in the frequency of severe heat stress events. To identify heat-tolerant germplasm, three different germplasm panels ("discovery," "investigation," and "validation") were studied under a range of heat-stressed conditions. Grain yield (GY) and its components were recorded at each site and a heat stress susceptibility index was calculated, confirming that each 1°C temperature rise corresponds to a GY reduction in durum wheat of 4.6%-6.3%. A total of 2552 polymorphic single nucleotide polymorphisms (SNPs) defined the diversity of the discovery panel, while 5642 SNPs were polymorphic in the investigation panel. Genome-wide association studies revealed that 36 quantitative trait loci were associated with the target traits in the discovery panel, of which five were confirmed both in a "subset" tested under heat stress imposed by plastic tunnels and in the investigation panel. A study of allelic combinations confirmed that Q.icd.Heat.003-1A, Q.icd.Heat.007-1B, and Q.icd.Heat.016-3B are additive in nature and that the positive alleles at all three loci resulted in a 16% higher GY under heat stress. The underlying SNPs were converted into Kompetitive allele-specific PCR (KASP) markers and tested on the validation panel, confirming that each explained up to 9% of the phenotypic variation for GY under heat stress. These markers can now be used for breeding to improve resilience to climate change and increase productivity in heat-stressed areas.
Subjects
Thermotolerance , Triticum , Triticum/genetics , Genome-Wide Association Study , Thermotolerance/genetics , Plant Breeding , Quantitative Trait Loci , Edible Grain
ABSTRACT
Characterization of major resistance (R) genes to late blight (LB), caused by the oomycete Phytophthora infestans, is very important for potato breeding. The objective of this study was to identify novel genes for resistance to LB from diploid Solanum tuberosum L. Andigenum Group (StAG) cultivar accessions. Two of these accessions with contrasting levels of resistance to LB were compared using digital gene expression data, with differential expression analysis of their transcriptomes carried out in the edgeR Bioconductor package. As a result, various differentially expressed genes (P ≤ 0.0001, Log2FC ≥ 2, FDR < 0.001) were noted. The combined transcriptomic analysis provided 303 candidate genes that are overexpressed or underexpressed in association with high resistance to LB. The functional analysis showed differential expression of R genes and their corresponding proteins related to disease resistance, NBS-LRR domain proteins, and specific disease resistance proteins. Comparative analysis of specific tissue transcriptomes in resistant and susceptible genotypes can be used for rapidly identifying candidate R genes, thus adding novel genes from diploid StAG cultivar accessions for host plant resistance to P. infestans in potato.
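The significance thresholds quoted above (P ≤ 0.0001, Log2FC ≥ 2, FDR < 0.001) amount to a simple filter on a differential expression results table. The sketch below is not the edgeR workflow itself; it only shows the filtering step on hypothetical rows. Applying the fold-change cut-off to the absolute value is an assumption, made because the abstract reports both overexpressed and underexpressed genes.

```python
def filter_degs(rows, p_max=1e-4, lfc_min=2.0, fdr_max=1e-3):
    """Keep rows passing all three cut-offs; |log2FC| is used (an assumption)."""
    selected = []
    for row in rows:
        passes_p = row["p_value"] <= p_max
        passes_lfc = abs(row["log2fc"]) >= lfc_min
        passes_fdr = row["fdr"] < fdr_max
        if passes_p and passes_lfc and passes_fdr:
            selected.append(row)
    return selected


# Hypothetical results table; gene identifiers and values are made up.
results = [
    {"gene": "geneA", "log2fc": 3.1, "p_value": 2e-6, "fdr": 4e-5},   # up, kept
    {"gene": "geneB", "log2fc": -2.4, "p_value": 5e-5, "fdr": 8e-4},  # down, kept
    {"gene": "geneC", "log2fc": 1.2, "p_value": 1e-6, "fdr": 2e-5},   # small effect
    {"gene": "geneD", "log2fc": 4.0, "p_value": 3e-3, "fdr": 2e-2},   # not significant
]

degs = filter_degs(results)
print([row["gene"] for row in degs])  # ['geneA', 'geneB']
```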
ABSTRACT
Alternative splicing (AS) is a gene regulatory mechanism modulating gene expression in multiple ways. AS is prevalent in all eukaryotes, including plants. AS generates two or more mRNAs from the precursor mRNA (pre-mRNA) to regulate transcriptome complexity and proteome diversity. Advances in next-generation sequencing, omics technology, bioinformatics tools, and computational methods provide new opportunities to quantify and visualize AS-based quantitative trait variation associated with plant growth, development, reproduction, and stress tolerance. Domestication, polyploidization, and environmental perturbation may give rise to novel splicing variants associated with agronomically beneficial traits. To date, pre-mRNAs from many genes are known to be spliced into multiple transcripts that cause phenotypic variation for complex traits, both in the model plant Arabidopsis and in field crops. Cataloguing and exploiting such variation may provide new paths to enhance climate resilience, resource-use efficiency, productivity, and nutritional quality of staple food crops. This review provides insights into AS variation alongside gene expression analysis to select for novel phenotypic diversity for use in breeding programs. AS contributes to heterosis, enhances plant symbiosis (mycorrhiza and rhizobium), and provides a mechanistic link between the core clock genes and diverse environmental cues.
Subjects
Alternative Splicing , Arabidopsis , Plant Breeding , RNA Splicing , Arabidopsis/genetics , Agricultural Crops/genetics , Agricultural Crops/metabolism , RNA Precursors/genetics
ABSTRACT
The genetic improvement of crops faces the significant challenge of feeding an ever-increasing population amidst a changing climate, at a time when governments are adopting a 'more with less' approach to reduce input use. Plant breeding has the potential to contribute to the United Nations Agenda 2030 by addressing various sustainable development goals (SDGs), with its most profound impact expected on SDG2 Zero Hunger. To expedite the time-consuming crossbreeding process, a genomic-led approach for predicting breeding values, targeted mutagenesis through gene editing, high-throughput phenomics for trait evaluation, enviromics for characterizing the testing environments, machine learning for effective management of large datasets, and speed breeding techniques promoting early flowering and seed production are being incorporated into the plant breeding toolbox. These advancements are poised to enhance genetic gains through selection in the cultigen pools of various crops. Consequently, these knowledge-based breeding methods are pursued for trait introgression, population improvement, and cultivar development. This article uses the potato crop as an example to showcase the progress being made in both genomic-led approaches and gene editing for accelerating the delivery of genetic gains through the utilization of genetically enhanced elite germplasm. It also underscores that access to technological advances in plant breeding may be influenced by regulations and intellectual property rights.
Subjects
Agricultural Crops , Plant Breeding , Plant Breeding/methods , Agricultural Crops/genetics , Phenotype , Gene Editing , Phenomics
ABSTRACT
The use of biocontrol agents with plant growth-promoting activity has emerged as an approach to support sustainable agriculture. During our field evaluation of potato plants treated with biocontrol rhizobacteria, four bacteria were associated with increased plant height. Using two important solanaceous crop plants, tomato and potato, we carried out a comparative analysis of the growth-promoting activity of the four bacterial strains: Pseudomonas fluorescens SLU99, Serratia plymuthica S412, S. rubidaea AV10, and S. rubidaea EV23. Greenhouse and in vitro experiments showed that P. fluorescens SLU99 promoted plant height, biomass accumulation, and yield of potato and tomato plants, while EV23 promoted growth in potato but not in tomato plants. SLU99 induced the expression of plant hormone-related genes in potato and tomato, especially those involved in maintaining homeostasis of auxin, cytokinin, gibberellic acid and ethylene. Our results reveal potential mechanisms underlying the growth promotion and biocontrol effects of these rhizobacteria and suggest which strains may be best deployed for sustainably improving crop yield.
ABSTRACT
Yanyang Liu, Henan Academy of Agricultural Sciences (HNAAS), China
Landraces are an important genetic source for transferring valuable novel genes and alleles required to enhance genetic variation. Therefore, information on the gene pool's genetic diversity and population structure is essential for the conservation and sustainable use of durum wheat genetic resources. The aim of this study was to assess genetic diversity, population structure, and linkage disequilibrium, as well as to identify regions with selection signatures. Five hundred individuals representing 46 landraces, along with 28 cultivars, were evaluated using the Illumina Infinium 25K wheat SNP array, resulting in 8,178 SNPs for further analysis. Gene diversity (GD) and the polymorphic information content (PIC) ranged from 0.13 to 0.50 and from 0.12 to 0.38, with mean GD and PIC values of 0.34 and 0.27, respectively. Linkage disequilibrium (LD) analysis revealed 353,600 significant SNP pairs at a cut-off of r² > 0.20 and P < 0.01, with an average r² of 0.21 for marker pairs. The nucleotide diversity (π) and Tajima's D (TD) per chromosome for the populations ranged from 0.29 to 0.36 and from 3.46 to 5.06, respectively, with genome-level mean π and TD values of 0.33 and 4.43. A genomic scan using the Fst outlier test revealed 85 loci with selection signatures, with 65 loci under balancing selection and 17 under directional selection. Putative candidate genes co-localized with regions exhibiting strong selection signatures were associated with grain yield, plant height, host plant resistance to pathogens, heading date, grain quality, and phenolic content. The Bayesian model-based (STRUCTURE) and distance-based (principal coordinate analysis, PCoA, and unweighted pair group method with arithmetic mean, UPGMA) methods grouped the genotypes into five subpopulations, where landraces from geographically non-adjoining environments were grouped in the same cluster.
This research provides further insights into population structure and genetic relationships in a diverse set of durum wheat germplasm, which could be further used in wheat breeding programs to address production challenges sustainably.
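For readers unfamiliar with the two marker statistics reported above, the following minimal sketch computes gene diversity (expected heterozygosity) and PIC from allele frequencies at a single locus, using the standard definitions GD = 1 - Σp² and PIC = 1 - Σp² - ΣΣ 2pᵢ²pⱼ². Whether the study used exactly these estimators is an assumption; the allele frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC: gene diversity minus the pairwise term sum_{i<j} 2 * p_i^2 * p_j^2."""
    pairwise = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            pairwise += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return gene_diversity(freqs) - pairwise


# A biallelic SNP with equal allele frequencies gives the maximum values.
balanced = [0.5, 0.5]
print(gene_diversity(balanced))                   # 0.5
print(polymorphic_information_content(balanced))  # 0.375

# A hypothetical SNP with a rarer minor allele is less informative.
skewed = [0.9, 0.1]
print(round(gene_diversity(skewed), 4))
print(round(polymorphic_information_content(skewed), 4))
```

For biallelic markers these statistics cannot exceed 0.5 and 0.375, which is consistent with the upper ends of the ranges reported in the abstract (0.50 and 0.38).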
ABSTRACT
Mutation breeding based on various chemical and physical mutagens induces mutations at, and disrupts, non-target loci. Hence, large populations were required for visual screening, desired plants were rare, and identifying desirable mutants was a further laborious task. The mutants generated carried many defects caused by non-targeted mutations and showed poor agronomic performance. Mutation techniques were augmented by targeting induced local lesions in genomes (TILLING), which facilitates the selection of desirable germplasm. On the other hand, gene editing through CRISPR/Cas9 allows genes to be knocked down by site-directed mutation. This handy technique has been exploited to modify the fatty acid profile, and high oleic acid genetic stocks have been obtained in a broad range of crops. Moreover, genes involved in the accumulation of undesirable seed components such as starch, polysaccharides, and flavors were knocked down to enhance seed quality, which helps to improve oil content and reduces anti-nutritional components.
Subjects
Fatty Acids , Gene Editing , Gene Editing/methods , Plant Breeding , Oleic Acid , Climate Change
ABSTRACT
Underutilized pulses and their wild relatives are typically stress tolerant, and their seeds are packed with protein, fibers, minerals, vitamins, and phytochemicals. The consumption of such nutritionally dense legumes together with cereal-based food may promote global food and nutritional security. However, such species are deficient in a few or several desirable domestication traits, which reduces their agronomic value and calls for further genetic enhancement to develop productive, nutritionally dense, and climate-resilient cultivars. This review article considers 13 underutilized pulses and focuses on their germplasm holdings, diversity, crop-wild-crop gene flow, genome sequencing, syntenic relationships, the potential for breeding and transgenic manipulation, and the genetics of agronomic and stress tolerance traits. Recent progress has shown the potential for crop improvement and food security; for example, the genetic basis of stem determinacy and fragrance in moth bean and rice bean, multiple abiotic stress tolerance traits in horse gram and tepary bean, bruchid resistance in lima bean, low neurotoxin content in grass pea, and photoperiod-induced flowering and anthocyanin accumulation in adzuki bean have been investigated. Advances have been made in introgression breeding to develop elite genetic stocks of grass pea with low β-ODAP (a neurotoxic compound), resistance to Mungbean yellow mosaic India virus in black gram using rice bean, and abiotic stress adaptation in common bean using genes from tepary bean. This highlights their potential in wider breeding programs to introduce such traits into locally adapted cultivars. The potential of de-domestication or feralization in the evolution of new variants in these crops is also highlighted.
ABSTRACT
It is of paramount importance in plant breeding to have methods that deal with large numbers of predictor variables and few sample observations, as well as efficient methods for dealing with high correlation among predictors and measured traits. This paper explores the prediction performance of the partial least squares (PLS) method under single-trait (ST) and multi-trait (MT) prediction of potato traits. The first prediction was for tested lines in tested environments under a five-fold cross-validation (5FCV) strategy, and the second was for tested lines in untested environments (herein denoted leave-one-environment-out cross-validation, LOEO). Prediction performance was good (accuracy mostly > 0.5 in terms of Pearson's correlation), and the accuracy under 5FCV was better than under LOEO. Hence, we have empirical evidence that the ST and MT PLS framework is a very valuable tool for prediction in the context of potato breeding data.
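The difference between the two validation schemes named above lies in how records are partitioned, not in the model. The sketch below illustrates only that partitioning, using the standard library; it does not fit a PLS model, and the function names, seed, and environment labels are hypothetical.

```python
import random


def five_fold_splits(n_records, n_folds=5, seed=0):
    """Yield (train, test) index lists; records are shuffled, then dealt into folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def leave_one_environment_out(environments):
    """Yield (env, train, test); every record of one environment is held out."""
    for env in sorted(set(environments)):
        test = [i for i, e in enumerate(environments) if e == env]
        train = [i for i, e in enumerate(environments) if e != env]
        yield env, train, test


# Hypothetical layout: six records, two per environment.
environments = ["E1", "E1", "E2", "E2", "E3", "E3"]

for env, train, test in leave_one_environment_out(environments):
    print(env, "train:", train, "test:", test)

for train, test in five_fold_splits(10):
    print("train size:", len(train), "test size:", len(test))
```

Under 5FCV, the held-out records share environments with the training data, whereas under LOEO the model never sees the target environment. That makes LOEO the harder task and is consistent with the lower accuracy the abstract reports for it.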
Subjects
Solanum tuberosum , Solanum tuberosum/genetics , Least-Squares Analysis , Genetic Models , Plant Breeding , Phenotype , Genomics/methods , Genotype
ABSTRACT
Ethiopia is considered a center of origin and diversity for durum wheat and is endowed with many diverse landraces. This research aimed to estimate the extent and pattern of genetic diversity in Ethiopian durum wheat germplasm. Thus, 104 durum wheat genotypes representing thirteen populations, three regions, and four altitudinal classes were investigated for their genetic diversity, using 10 grain quality- and grain yield-related phenotypic traits and 14 simple sequence repeat (SSR) markers. The analysis of the phenotypic traits revealed a high mean Shannon diversity index (H' = 0.78) among the genotypes and indicated a high level of phenotypic variation. The principal component analysis (PCA) classified the genotypes into three groups. The SSR markers showed a high mean value of polymorphic information content (PIC = 0.50) and gene diversity (h = 0.56), and a moderate number of alleles per locus (Na = 4). Analysis of molecular variance (AMOVA) revealed a high level of variation within populations, regions, and altitudinal classes, accounting for 88%, 97%, and 97% of the total variation, respectively. Pairwise genetic differentiation and Nei's genetic distance analyses showed that the cultivars are distinct from the landrace populations. The distance-based (Discriminant Analysis of Principal Components (DAPC) and Minimum Spanning Network (MSN)) and model-based population stratification (STRUCTURE) methods of clustering grouped the genotypes into two clusters. Both the phenotypic data-based PCA and the molecular data-based DAPC and MSN analyses defined distinct groupings of cultivars and landraces. The phenotypic and molecular diversity analyses highlighted the high genetic variation in the Ethiopian durum wheat gene pool. The investigated SSRs showed significant associations with one or more target phenotypic traits. The markers identify landraces with high grain yield and quality traits.
This study highlights the usefulness of Ethiopian landraces for cultivar development, contributing to food security in the region and beyond.
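The Shannon diversity index cited above can be illustrated with a short sketch. It computes H' = -Σ pᵢ ln pᵢ over the phenotypic classes of one trait and, optionally, divides by ln(k) for k classes so that values fall between 0 and 1. That the study used this normalized form is an assumption, and the class counts are hypothetical.

```python
from math import log


def shannon_index(class_counts, normalize=True):
    """Shannon diversity of one trait from the number of genotypes per class."""
    total = sum(class_counts)
    if total <= 0:
        raise ValueError("class_counts must contain at least one observation")
    proportions = [count / total for count in class_counts if count > 0]
    h = -sum(p * log(p) for p in proportions)
    if normalize and len(class_counts) > 1:
        h /= log(len(class_counts))  # scale by the maximum possible value, ln(k)
    return h


# Hypothetical counts of genotypes in three phenotypic classes of a trait.
even = [35, 35, 34]
uneven = [90, 10, 4]
print(round(shannon_index(even), 3))
print(round(shannon_index(uneven), 3))
```

Evenly populated classes push the normalized index towards 1, while a trait dominated by a single class scores close to 0.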