Search | BVS - MINISTÉRIO DA SAÚDE

1.

A glimpse into the fungal metabolomic abyss: Novel network analysis reveals relationships between exogenous compounds and their outputs.

Gopalakrishnan Meena, Muralikrishnan; Lane, Matthew J; Tannous, Joanna; Carrell, Alyssa A; Abraham, Paul E; Giannone, Richard J; Ané, Jean-Michel; Keller, Nancy P; Labbé, Jesse L; Geiger, Armin G; Kainer, David; Jacobson, Daniel A; Rush, Tomás A.

PNAS Nexus ; 2(10): pgad322, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37854706

ABSTRACT

Fungal specialized metabolites are a major source of beneficial compounds that are routinely isolated, characterized, and manufactured as pharmaceuticals, agrochemical agents, and industrial chemicals. The production of these metabolites is encoded by biosynthetic gene clusters that are often silent under standard growth conditions. There are limited resources for characterizing the direct link between abiotic stimuli and metabolite production. Herein, we introduce a network analysis-based, data-driven algorithm comprising two routes to characterize the production of specialized fungal metabolites triggered by different exogenous compounds: the direct route and the auxiliary route. Both routes elucidate the influence of treatments on the production of specialized metabolites from experimental data. The direct route determines known and putative metabolites induced by treatments and provides additional insight over traditional comparison methods. The auxiliary route is specific for discovering unknown analytes, and further identification can be curated through online bioinformatic resources. We validated our algorithm by applying chitooligosaccharides and lipids at two different temperatures to the fungal pathogen Aspergillus fumigatus. After liquid chromatography-mass spectrometry quantification of significantly produced analytes, we used network centrality measures to rank the treatments' ability to elucidate these analytes and confirmed their identity through fragmentation patterns or in silico spiking with commercially available standards. Later, we examined the transcriptional regulation of these metabolites through real-time quantitative polymerase chain reaction. Our data-driven techniques can complement existing metabolomic network analysis by providing an approach to track the influence of any exogenous stimuli on metabolite production. Our experimental-based algorithm can overcome the bottlenecks in elucidating novel fungal compounds used in drug discovery.
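The idea of ranking treatments by a network centrality measure can be illustrated with a minimal sketch. It assumes a bipartite treatment-to-analyte network and uses plain degree centrality; the abstract does not say which centrality measures the authors used, and the function name and the toy edge list are hypothetical.

```python
def rank_treatments_by_degree(edges):
    """Rank treatments by degree centrality in a bipartite network.

    `edges` is an iterable of (treatment, analyte) pairs, one pair per
    analyte that a treatment significantly induced (hypothetical input).
    The score is the fraction of all observed analytes linked to a treatment.
    """
    analytes_by_treatment = {}
    all_analytes = set()
    for treatment, analyte in edges:
        analytes_by_treatment.setdefault(treatment, set()).add(analyte)
        all_analytes.add(analyte)
    n_analytes = len(all_analytes)
    scores = {t: len(a) / n_analytes for t, a in analytes_by_treatment.items()}
    # Highest score first; ties are broken alphabetically by treatment name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy, made-up data: three treatments and three analytes.
toy_edges = [
    ("T1", "a"), ("T1", "b"),
    ("T2", "b"),
    ("T3", "a"), ("T3", "b"), ("T3", "c"),
]
for treatment, score in rank_treatments_by_degree(toy_edges):
    print(treatment, round(score, 3))
```

Degree centrality is only the simplest choice; a weighted or eigenvector-based measure could be substituted without changing the overall shape of the ranking step.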

2.

Quantum biological insights into CRISPR-Cas9 sgRNA efficiency from explainable-AI driven feature engineering.

Noshay, Jaclyn M; Walker, Tyler; Alexander, William G; Klingeman, Dawn M; Romero, Jonathon; Walker, Angelica M; Prates, Erica; Eckert, Carrie; Irle, Stephan; Kainer, David; Jacobson, Daniel A.

Nucleic Acids Res ; 51(19): 10147-10161, 2023 10 27.

Article in English | MEDLINE | ID: mdl-37738140

ABSTRACT

CRISPR-Cas9 tools have transformed genetic manipulation capabilities in the laboratory. Empirical rules-of-thumb have been developed for only a narrow range of model organisms, and mechanistic underpinnings for sgRNA efficiency remain poorly understood. This work establishes a novel feature set and new public resource, produced with quantum chemical tensors, for interpreting and predicting sgRNA efficiency. Feature engineering for sgRNA efficiency is performed using an explainable-artificial intelligence model: iterative Random Forest (iRF). By encoding quantitative attributes of position-specific sequences for Escherichia coli sgRNAs, we identify important traits for sgRNA design in bacterial species. Additionally, we show that expanding positional encoding to quantum descriptors of base-pair, dimer, trimer, and tetramer sequences captures intricate interactions in local and neighboring nucleotides of the target DNA. These features highlight variation in CRISPR-Cas9 sgRNA dynamics between E. coli and H. sapiens genomes. These novel encodings of sgRNAs enhance our understanding of the elaborate quantum biological processes involved in CRISPR-Cas9 machinery.

Subjects

CRISPR-Cas Systems , RNA, Guide, CRISPR-Cas Systems , Artificial Intelligence , DNA , Escherichia coli/genetics , Gene Editing , Humans

3.

Validation of a metabolite-GWAS network for Populus trichocarpa family 1 UDP-glycosyltransferases.

Saint-Vincent, Patricia M B; Furches, Anna; Galanie, Stephanie; Teixeira Prates, Erica; Aldridge, Jessa L; Labbe, Audrey; Zhao, Nan; Martin, Madhavi Z; Ranjan, Priya; Jones, Piet; Kainer, David; Kalluri, Udaya C; Chen, Jin-Gui; Muchero, Wellington; Jacobson, Daniel A; Tschaplinski, Timothy J.

Front Plant Sci ; 14: 1210146, 2023.

Article in English | MEDLINE | ID: mdl-37546246

ABSTRACT

Metabolite genome-wide association studies (mGWASs) are increasingly used to discover the genetic basis of target phenotypes in plants such as Populus trichocarpa, a biofuel feedstock and model woody plant species. Despite their growing importance in plant genetics and metabolomics, few mGWASs are experimentally validated. Here, we present a functional genomics workflow for validating mGWAS-predicted enzyme-substrate relationships. We focus on uridine diphosphate-glycosyltransferases (UGTs), a large family of enzymes that catalyze sugar transfer to a variety of plant secondary metabolites involved in defense, signaling, and lignification. Glycosylation influences physiological roles, localization within cells and tissues, and metabolic fates of these metabolites. UGTs have substantially expanded in P. trichocarpa, presenting a challenge for large-scale characterization. Using a high-throughput assay, we produced substrate acceptance profiles for 40 previously uncharacterized candidate enzymes. Assays confirmed 10 of 13 leaf mGWAS associations, and a focused metabolite screen demonstrated varying levels of substrate specificity among UGTs. A substrate binding model case study of UGT-23 rationalized observed enzyme activities and mGWAS associations, including glycosylation of trichocarpinene to produce trichocarpin, a major higher-order salicylate in P. trichocarpa. We identified UGTs putatively involved in lignan, flavonoid, salicylate, and phytohormone metabolism, with potential implications for cell wall biosynthesis, nitrogen uptake, and biotic and abiotic stress response that determine sustainable biomass crop production. Our results provide new support for in silico analyses and evidence-based guidance for in vivo functional characterization.

4.

Few-Shot Learning Enables Population-Scale Analysis of Leaf Traits in Populus trichocarpa.

Lagergren, John; Pavicic, Mirko; Chhetri, Hari B; York, Larry M; Hyatt, Doug; Kainer, David; Rutter, Erica M; Flores, Kevin; Bailey-Bale, Jack; Klein, Marie; Taylor, Gail; Jacobson, Daniel; Streich, Jared.

Plant Phenomics ; 5: 0072, 2023.

Article in English | MEDLINE | ID: mdl-37519935

ABSTRACT

Plant phenotyping is typically a time-consuming and expensive endeavor, requiring large groups of researchers to meticulously measure biologically relevant plant traits, and is the main bottleneck in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. In this work, we address these challenges by leveraging few-shot learning with convolutional neural networks to segment the leaf body and visible venation of 2,906 Populus trichocarpa leaf images obtained in the field. In contrast to previous methods, our approach (a) does not require experimental or image preprocessing, (b) uses the raw RGB images at full resolution, and (c) requires very few samples for training (e.g., just 8 images for vein segmentation). Traits relating to leaf morphology and vein topology are extracted from the resulting segmentations using traditional open-source image-processing tools, validated using real-world physical measurements, and used to conduct a genome-wide association study to identify genes controlling the traits. In this way, the current work is designed to provide the plant phenotyping community with (a) methods for fast and accurate image-based feature extraction that require minimal training data and (b) a new population-scale dataset, including 68 different leaf phenotypes, for domain scientists and machine learning researchers. All of the few-shot learning code, data, and results are made publicly available.

5.

Exploring the role of plant lysin motif receptor-like kinases in regulating plant-microbe interactions in the bioenergy crop Populus.

Cope, Kevin R; Prates, Erica T; Miller, John I; Demerdash, Omar N A; Shah, Manesh; Kainer, David; Cliff, Ashley; Sullivan, Kyle A; Cashman, Mikaela; Lane, Matthew; Matthiadis, Anna; Labbé, Jesse; Tschaplinski, Timothy J; Jacobson, Daniel A; Kalluri, Udaya C.

Comput Struct Biotechnol J ; 21: 1122-1139, 2023.

Article in English | MEDLINE | ID: mdl-36789259

ABSTRACT

For plants, distinguishing between mutualistic and pathogenic microbes is a matter of survival. All microbes contain microbe-associated molecular patterns (MAMPs) that are perceived by plant pattern recognition receptors (PRRs). Lysin motif receptor-like kinases (LysM-RLKs) are PRRs attuned for binding and triggering a response to specific MAMPs, including chitin oligomers (COs) in fungi, lipo-chitooligosaccharides (LCOs), which are produced by mycorrhizal fungi and nitrogen-fixing rhizobial bacteria, and peptidoglycan in bacteria. The identification and characterization of LysM-RLKs in candidate bioenergy crops including Populus are limited compared to other model plant species, thus inhibiting our ability to both understand and engineer microbe-mediated gains in plant productivity. As such, we performed a sequence analysis of LysM-RLKs in the Populus genome and predicted their function based on phylogenetic analysis with known LysM-RLKs. Then, using predictive models, molecular dynamics simulations, and comparative structural analysis with previously characterized CO and LCO plant receptors, we identified probable ligand-binding sites in Populus LysM-RLKs. Using several machine learning models, we predicted remarkably consistent binding affinity rankings of Populus proteins to CO. In addition, we used a modified Random Walk with Restart network-topology based approach to identify a subset of Populus LysM-RLKs that are functionally related and propose a corresponding signal transduction cascade. Our findings provide the first look into the role of LysM-RLKs in Populus-microbe interactions and establish a crucial jumping-off point for future research efforts to understand specificity and redundancy in microbial perception mechanisms.
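The Random Walk with Restart step can be sketched as follows. This is the standard, unmodified algorithm on an unweighted, undirected network, not the modified variant the abstract refers to; the function name, the restart probability of 0.3, and the toy network are assumptions for illustration only.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by their steady-state visiting probability.

    `adjacency` maps each node to a list of its neighbours (every node is
    assumed to have at least one). At each step the walker returns to a
    seed node with probability `restart`, otherwise it moves to a
    uniformly chosen neighbour.
    """
    nodes = sorted(adjacency)
    for node in nodes:
        if not adjacency[node]:
            raise ValueError(f"node {node!r} has no neighbours")
    seeds = set(seeds)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(start)
    for _ in range(max_iter):
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * current[n] / len(adjacency[n])
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        change = sum(abs(updated[n] - current[n]) for n in nodes)
        current = updated
        if change < tol:
            break
    return current


# Toy path network A - B - C - D with a single seed node (made-up data).
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_network, seeds=["A"])
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Nodes that score highly are topologically close to the seed set, which is how such an approach can nominate functionally related genes from a set of known ones.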

6.

Structural variants identified using non-Mendelian inheritance patterns advance the mechanistic understanding of autism spectrum disorder.

Kainer, David; Templeton, Alan R; Prates, Erica T; Jacobson, Daniel; Allan, Euan R O; Climer, Sharlee; Garvin, Michael R.

HGG Adv ; 4(1): 100150, 2023 01 12.

Article in English | MEDLINE | ID: mdl-36340933

ABSTRACT

The heritability of autism spectrum disorder (ASD), based on 680,000 families and five countries, is estimated to be nearly 80%, yet heritability estimates reported from SNP-based studies are consistently lower, and few significant loci have been identified with genome-wide association studies. This gap in genomic information may reside in rare variants, interaction among variants (epistasis), or cryptic structural variation (SV) and may provide mechanisms that underlie ASD. Here we use a method to identify potential SVs based on non-Mendelian inheritance patterns in pedigrees using parent-child genotypes from ASD families and demonstrate that they are enriched in ASD-risk genes. Most are in non-coding genic space and are over-represented in expression quantitative trait loci, suggesting that they affect gene regulation, which we confirm with their overlap of differentially expressed genes in postmortem brain tissue of ASD individuals. We then identify an SV in the GRIK2 gene that alters RNA splicing and a regulatory region of the ACMSD gene in the kynurenine pathway as significantly associated with a non-verbal ASD phenotype, supporting our hypothesis that these currently excluded loci can provide a clearer mechanistic understanding of ASD. Finally, we use an explainable artificial intelligence approach to define subgroups, demonstrating their use in the context of precision medicine.

Subjects

Autism Spectrum Disorder , Humans , Autism Spectrum Disorder/genetics , Genome-Wide Association Study/methods , Artificial Intelligence , Quantitative Trait Loci/genetics , Inheritance Patterns/genetics

7.

Lipo-Chitooligosaccharides Induce Specialized Fungal Metabolite Profiles That Modulate Bacterial Growth.

Rush, Tomás A; Tannous, Joanna; Lane, Matthew J; Gopalakrishnan Meena, Muralikrishnan; Carrell, Alyssa A; Golan, Jacob J; Drott, Milton T; Cottaz, Sylvain; Fort, Sébastien; Ané, Jean-Michel; Keller, Nancy P; Pelletier, Dale A; Jacobson, Daniel A; Kainer, David; Abraham, Paul E; Giannone, Richard J; Labbé, Jesse L.

mSystems ; 7(6): e0105222, 2022 12 20.

Article in English | MEDLINE | ID: mdl-36453934

ABSTRACT

Lipo-chitooligosaccharides (LCOs) are historically known for their role as microbial-derived signaling molecules that shape plant symbiosis with beneficial rhizobia or mycorrhizal fungi. Recent studies showing that LCOs are widespread across the fungal kingdom have raised questions about the ecological function of these compounds in organisms that do not form symbiotic relationships with plants. To elucidate the ecological function of these compounds, we investigate the metabolomic response of the ubiquitous human pathogen Aspergillus fumigatus to LCOs. Our metabolomics data revealed that exogenous application of various types of LCOs to A. fumigatus resulted in significant shifts in the fungal metabolic profile, with marked changes in the production of specialized metabolites known to mediate ecological interactions. Using network analyses, we identify specific types of LCOs with the most significant effect on the abundance of known metabolites. Extracts of several LCO-induced metabolic profiles significantly impact the growth rates of diverse bacterial species. These findings suggest that LCOs may play an important role in the competitive dynamics of non-plant-symbiotic fungi and bacteria. This study identifies specific metabolomic profiles induced by these ubiquitously produced chemicals and creates a foundation for future studies into the potential roles of LCOs as modulators of interkingdom competition. IMPORTANCE The activation of silent biosynthetic gene clusters (BGC) for the identification and characterization of novel fungal secondary metabolites is a perpetual motion in natural product discoveries. Here, we demonstrated that one of the best-studied symbiosis signaling compounds, lipo-chitooligosaccharides (LCOs), play a role in activating some of these BGCs, resulting in the production of known, putative, and unknown metabolites with biological activities. 
This collection of metabolites induced by LCOs differentially modulates bacterial growth, while the LCO standards do not convey the same effect. These findings create a paradigm shift showing that LCOs have a more prominent role outside of host recognition of symbiotic microbes. Importantly, our work demonstrates that fungi use LCOs to produce a variety of metabolites with biological activity, which can be a potential source of bio-stimulants, pesticides, or pharmaceuticals.

Subjects

Chitosan , Mycorrhizae , Humans , Chitin , Chitosan/pharmacology , Oligosaccharides/pharmacology

8.

Evaluating the performance of random forest and iterative random forest based methods when applied to gene expression data.

Walker, Angelica M; Cliff, Ashley; Romero, Jonathon; Shah, Manesh B; Jones, Piet; Felipe Machado Gazolla, Joao Gabriel; Jacobson, Daniel A; Kainer, David.

Comput Struct Biotechnol J ; 20: 3372-3386, 2022.

Article in English | MEDLINE | ID: mdl-35832622

ABSTRACT

Gene-to-gene networks, such as Gene Regulatory Networks (GRN) and Predictive Expression Networks (PEN), capture relationships between genes and are beneficial for use in downstream biological analyses. There exist multiple network inference tools to produce these gene-to-gene networks from matrices of gene expression data. Random Forest-Leave One Out Prediction (RF-LOOP), frequently known as GEne Network Inference with Ensemble of trees (GENIE3), is a method that has been shown to be efficient at producing these gene-to-gene networks. Random Forest can be replaced in this process by iterative Random Forest (iRF), which performs variable selection and boosting. Here we validate that iterative Random Forest-Leave One Out Prediction (iRF-LOOP) produces higher quality networks than GENIE3 (RF-LOOP). We use both synthetic and empirical networks from the Dialogue for Reverse Engineering Assessment and Methods (DREAM) Challenges by Sage Bionetworks, as well as two additional empirical networks created from Arabidopsis thaliana and Populus trichocarpa expression data.

9.

The Genetic Architecture of Nitrogen Use Efficiency in Switchgrass (Panicum virgatum L.).

Shrestha, Vivek; Chhetri, Hari B; Kainer, David; Xu, Yaping; Hamilton, Lance; Piasecki, Cristiano; Wolfe, Ben; Wang, Xueyan; Saha, Malay; Jacobson, Daniel; Millwood, Reginald J; Mazarei, Mitra; Stewart, C Neal.

Front Plant Sci ; 13: 893610, 2022.

Article in English | MEDLINE | ID: mdl-35586220

ABSTRACT

Switchgrass (Panicum virgatum L.) has immense potential as a bioenergy crop with the aim of producing biofuel as an end goal. Nitrogen (N)-related sustainability traits, such as nitrogen use efficiency (NUE) and nitrogen remobilization efficiency (NRE), are important factors affecting switchgrass quality and productivity. Hence, it is imperative to develop nitrogen use-efficient switchgrass accessions by exploring the genetic basis of NUE in switchgrass. For that, we used 331 diverse field-grown switchgrass accessions planted under low and moderate N fertility treatments. We performed a genome wide association study (GWAS) in a holistic manner where we not only considered NUE as a single trait but also used its related phenotypic traits, such as total dry biomass at low N and moderate N, and nitrogen use index, such as NRE. We have evaluated the phenotypic characterization of the NUE and the related traits, highlighted their relationship using correlation analysis, and identified the top ten nitrogen use-efficient switchgrass accessions. Our GWAS analysis identified 19 unique single nucleotide polymorphisms (SNPs) and 32 candidate genes. Two promising GWAS candidate genes, caffeoyl-CoA O-methyltransferase (CCoAOMT) and alfin-like 6 (AL6), were further supported by linkage disequilibrium (LD) analysis. Finally, we discussed the potential role of nitrogen in modulating the expression of these two genes. Our findings have opened avenues for the development of improved nitrogen use-efficient switchgrass lines.

10.

Characterization of terpene biosynthesis in Melaleuca quinquenervia and ecological consequences of terpene accumulation during myrtle rust infection.

Hsieh, Ji-Fan; Krause, Sandra T; Kainer, David; Degenhardt, Jörg; Foley, William J; Külheim, Carsten.

Plant Environ Interact ; 2(4): 177-193, 2021 Aug.

Article in English | MEDLINE | ID: mdl-37283700

ABSTRACT

Plants use a wide array of secondary metabolites including terpenes as defense against herbivore and pathogen attack, which can be constitutively expressed or induced. Here, we investigated aspects of the chemical and molecular basis of resistance against the exotic rust fungus Austropuccinia psidii in Melaleuca quinquenervia, with a focus on terpenes. Foliar terpenes of resistant and susceptible plants were quantified, and we assessed whether chemotypic variation contributed to resistance to infection by A. psidii. We found that chemotypes did not contribute to the resistance and susceptibility of M. quinquenervia. However, in one of the chemotypes (Chemotype 2), susceptible plants showed higher concentrations of several terpenes including α-pinene, limonene, 1,8-cineole, and viridiflorol compared with resistant plants. Transcriptome profiling of these plants showed that several TPS genes were strongly induced in response to infection by A. psidii. Functional characterization of these TPS showed them to be mono- and sesquiterpene synthases producing compounds including 1,8-cineole, β-caryophyllene, viridiflorol and nerolidol. The expression of these TPS genes correlated with metabolite data in a susceptible plant. These results suggest the complexity of resistance mechanism regulated by M. quinquenervia and that modulation of terpenes may be one of the components that contribute to resistance against A. psidii.

11.

Potentially adaptive SARS-CoV-2 mutations discovered with novel spatiotemporal and explainable AI models.

Garvin, Michael R; Prates, Erica T; Pavicic, Mirko; Jones, Piet; Amos, B Kirtley; Geiger, Armin; Shah, Manesh B; Streich, Jared; Felipe Machado Gazolla, Joao Gabriel; Kainer, David; Cliff, Ashley; Romero, Jonathon; Keith, Nathan; Brown, James B; Jacobson, Daniel.

Genome Biol ; 21(1): 304, 2020 12 23.

Article in English | MEDLINE | ID: mdl-33357233

ABSTRACT

BACKGROUND: A mechanistic understanding of the spread of SARS-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission. Large numbers of available sequences and their dates of transmission provide an unprecedented opportunity to analyze evolutionary adaptation in novel ways. Addition of high-resolution structural information can reveal the functional basis of these processes at the molecular level. Integrated systems biology-directed analyses of these data layers afford valuable insights to build a global understanding of the COVID-19 pandemic. RESULTS: Here we identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp614Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro323Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model also suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Finally, our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that also displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus. CONCLUSIONS: These results provide valuable insights for the development of drugs and surveillance strategies to combat the current and future pandemics.

Subjects

Adaptation, Biological , Evolution, Molecular , Models, Genetic , SARS-CoV-2/genetics , Viral Proteins/genetics , Artificial Intelligence , Genome, Viral , Haplotypes , Mutation , Selection, Genetic

12.

Genome-Wide Association Study of Wood Anatomical and Morphological Traits in Populus trichocarpa.

Chhetri, Hari B; Furches, Anna; Macaya-Sanz, David; Walker, Alejandro R; Kainer, David; Jones, Piet; Harman-Ware, Anne E; Tschaplinski, Timothy J; Jacobson, Daniel; Tuskan, Gerald A; DiFazio, Stephen P.

Front Plant Sci ; 11: 545748, 2020.

Article in English | MEDLINE | ID: mdl-33013968

ABSTRACT

To understand the genetic mechanisms underlying wood anatomical and morphological traits in Populus trichocarpa, we used 869 unrelated genotypes from a common garden in Clatskanie, Oregon that were previously collected from across the distribution range in western North America. Using GEMMA mixed model analysis, we tested for the association of 25 phenotypic traits and nine multitrait combinations with 6.741 million SNPs covering the entire genome. Broad-sense trait heritabilities ranged from 0.117 to 0.477. Most traits were significantly correlated with geoclimatic variables suggesting a role of climate and geography in shaping the variation of this species. Fifty-seven SNPs from single trait GWAS and 11 SNPs from multitrait GWAS passed an FDR threshold of 0.05, leading to the identification of eight and seven nearby candidate genes, respectively. The percentage of phenotypic variance explained (PVE) by the significant SNPs for both single and multitrait GWAS ranged from 0.01% to 6.18%. To further evaluate the potential roles of candidate genes, we used a multi-omic network containing five additional data sets, including leaf and wood metabolite GWAS layers and coexpression and comethylation networks. We also performed a functional enrichment analysis on coexpression nearest neighbors for each gene model identified by the wood anatomical and morphological trait GWAS analyses. Genes affecting cell wall composition and transport related genes were enriched in wood anatomy and stomatal density trait networks. Signaling and metabolism related genes were also common in networks for stomatal density. For leaf morphology traits (leaf dry and wet weight) the networks were significantly enriched for GO terms related to photosynthetic processes as well as cellular homeostasis. 
The identified genes provide further insights into the genetic control of these traits, which are important determinants of the suitability and sustainability of improved genotypes for lignocellulosic biofuel production.
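Filtering SNP associations at an FDR threshold of 0.05 can be sketched with the Benjamini-Hochberg step-up procedure. The abstract does not name the FDR method used, so Benjamini-Hochberg is an assumption here, and the function name and p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests pass the FDR threshold.

    Benjamini-Hochberg step-up: sort the p-values, find the largest rank k
    such that p_(k) <= (k / m) * alpha, and accept every test ranked at or
    below k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            cutoff_rank = rank
    passed = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[index] = True
    return passed


# Made-up p-values for four hypothetical SNP-trait tests.
toy_p_values = [0.039, 0.001, 0.2, 0.008]
print(benjamini_hochberg(toy_p_values))
```

Note that the third-smallest p-value (0.039) is below 0.05 yet fails, because its rank-adjusted threshold is 3/4 x 0.05 = 0.0375; this is the multiple-testing correction at work.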

13.

A phylogenomic approach reveals a low somatic mutation rate in a long-lived plant.

Orr, Adam J; Padovan, Amanda; Kainer, David; Külheim, Carsten; Bromham, Lindell; Bustos-Segura, Carlos; Foley, William; Haff, Tonya; Hsieh, Ji-Fan; Morales-Suarez, Alejandro; Cartwright, Reed A; Lanfear, Robert.

Proc Biol Sci ; 287(1922): 20192364, 2020 03 11.

Article in English | MEDLINE | ID: mdl-32156194

ABSTRACT

Somatic mutations can have important effects on the life history, ecology, and evolution of plants, but the rate at which they accumulate is poorly understood and difficult to measure directly. Here, we develop a method to measure somatic mutations in individual plants and use it to estimate the somatic mutation rate in a large, long-lived, phenotypically mosaic Eucalyptus melliodora tree. Despite being 100 times larger than Arabidopsis, this tree has a per-generation mutation rate only ten times greater, which suggests that this species may have evolved mechanisms to reduce the mutation rate per unit of growth. This adds to a growing body of evidence that illuminates the correlated evolutionary shifts in mutation rate and life history in plants.

Subjects

Arabidopsis/physiology , Mutation Rate , Phylogeny , Plant Physiological Phenomena

14.

Can exascale computing and explainable artificial intelligence applied to plant biology deliver on the United Nations sustainable development goals?

Streich, Jared; Romero, Jonathon; Gazolla, João Gabriel Felipe Machado; Kainer, David; Cliff, Ashley; Prates, Erica Teixeira; Brown, James B; Khoury, Sacha; Tuskan, Gerald A; Garvin, Michael; Jacobson, Daniel; Harfouche, Antoine L.

Curr Opin Biotechnol ; 61: 217-225, 2020 02.

Article in English | MEDLINE | ID: mdl-32086132

ABSTRACT

Human population growth and accelerated climate change necessitate agricultural improvements using designer crop ideotypes (idealized plants that can grow in niche environments). Diverse and highly skilled research groups must integrate efforts to bridge the gaps needed to achieve international goals toward sustainable agriculture. Given the scale of global agricultural needs and the breadth of multiple types of omics data needed to optimize these efforts, explainable artificial intelligence (AI with a decipherable decision making process that provides a meaningful explanation to humans) and exascale computing (computers that can perform 10^18 floating-point operations per second, or exaflops) are crucial. Accurate phenotyping and daily-resolution climatype associations are equally important for refining ideotype production to specific environments at various levels of granularity. We review advances toward tackling technological hurdles to solve multiple United Nations Sustainable Development Goals and discuss a vision to overcome gaps between research and policy.

Subjects

Artificial Intelligence , Sustainable Development , Agriculture , Goals , Humans , United Nations

15.

The draft nuclear genome assembly of Eucalyptus pauciflora: a pipeline for comparing de novo assemblies.

Wang, Weiwen; Das, Ashutosh; Kainer, David; Schalamun, Miriam; Morales-Suarez, Alejandro; Schwessinger, Benjamin; Lanfear, Robert.

Gigascience ; 9(1), 2020 01 01.

Article in English | MEDLINE | ID: mdl-31895413

ABSTRACT

BACKGROUND: Eucalyptus pauciflora (the snow gum) is a long-lived tree with high economic and ecological importance. Currently, little genomic information for E. pauciflora is available. Here, we sequentially assemble the genome of Eucalyptus pauciflora with different methods, and combine multiple existing and novel approaches to help to select the best genome assembly. FINDINGS: We generated high coverage of long- (Nanopore, 174×) and short- (Illumina, 228×) read data from a single E. pauciflora individual and compared assemblies from 5 assemblers (Canu, SMARTdenovo, Flye, Marvel, and MaSuRCA) with different read lengths (1 and 35 kb minimum read length). A key component of our approach is to keep a randomly selected collection of ~10% of both long and short reads separated from the assemblies to use as a validation set for assessing assemblies. Using this validation set along with a range of existing tools, we compared the assemblies in 8 ways: contig N50, BUSCO scores, LAI (long terminal repeat assembly index) scores, assembly ploidy, base-level error rate, CGAL (computing genome assembly likelihoods) scores, structural variation, and genome sequence similarity. Our result showed that MaSuRCA generated the best assembly, which is 594.87 Mb in size, with a contig N50 of 3.23 Mb, and an estimated error rate of ~0.006 errors per base. CONCLUSIONS: We report a draft genome of E. pauciflora, which will be a valuable resource for further genomic studies of eucalypts. The approaches for assessing and comparing genomes should help in assessing and choosing among many potential genome assemblies from a single dataset.
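Contig N50, the first of the eight comparison metrics listed above, can be computed with a short sketch like the one below. It uses the standard definition of N50; the function name and the toy contig lengths are hypothetical and are not taken from the paper's pipeline.

```python
def contig_n50(lengths):
    """Return the contig N50 of an assembly.

    N50 is the length L of the contig at which, walking from the longest
    contig downward, the cumulative length first reaches at least half of
    the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units, made-up data); total size is 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_contigs))  # prints 70
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is presumably why the abstract pairs it with error-rate and completeness metrics.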

Subjects

Computational Biology , Eucalyptus/genetics , Genome, Plant , Genomics , Computational Biology/methods , DNA Contamination , Genome Size , Genomics/methods

16.

A High-Performance Computing Implementation of Iterative Random Forest for the Creation of Predictive Expression Networks.

Cliff, Ashley; Romero, Jonathon; Kainer, David; Walker, Angelica; Furches, Anna; Jacobson, Daniel.

Genes (Basel) ; 10(12), 2019 12 02.

Article in English | MEDLINE | ID: mdl-31810264

ABSTRACT

As time progresses and technology improves, biological data sets are continuously increasing in size. New methods and new implementations of existing methods are needed to keep pace with this increase. In this paper, we present a high-performance computing (HPC)-capable implementation of Iterative Random Forest (iRF). This new implementation enables the explainable-AI eQTL analysis of SNP sets with over a million SNPs. Using this implementation, we also present a new method, iRF Leave One Out Prediction (iRF-LOOP), for the creation of Predictive Expression Networks on the order of 40,000 genes or more. We compare the new implementation of iRF with the previous R version and analyze its time to completion on two of the world's fastest supercomputers, Summit and Titan. We also show iRF-LOOP's ability to capture biologically significant results when creating Predictive Expression Networks. This new implementation of iRF will enable the analysis of biological data sets at scales that were previously not possible.

Subjects

Algorithms; Computer Simulation; Models, Genetic; Quantitative Trait Loci; Computational Biology

17.

Finding New Cell Wall Regulatory Genes in Populus trichocarpa Using Multiple Lines of Evidence.

Furches, Anna; Kainer, David; Weighill, Deborah; Large, Annabel; Jones, Piet; Walker, Angelica M; Romero, Jonathon; Gazolla, Joao Gabriel Felipe Machado; Joubert, Wayne; Shah, Manesh; Streich, Jared; Ranjan, Priya; Schmutz, Jeremy; Sreedasyam, Avinash; Macaya-Sanz, David; Zhao, Nan; Martin, Madhavi Z; Rao, Xiaolan; Dixon, Richard A; DiFazio, Stephen; Tschaplinski, Timothy J; Chen, Jin-Gui; Tuskan, Gerald A; Jacobson, Daniel.

Front Plant Sci ; 10: 1249, 2019.

Article in English | MEDLINE | ID: mdl-31649710

ABSTRACT

Understanding the regulatory network controlling cell wall biosynthesis is of great interest in Populus trichocarpa, both because of its status as a model woody perennial and its importance for lignocellulosic products. We searched for genes with putatively unknown roles in regulating cell wall biosynthesis using an extended network-based Lines of Evidence (LOE) pipeline to combine multiple omics data sets in P. trichocarpa, including gene coexpression, gene comethylation, population level pairwise SNP correlations, and two distinct SNP-metabolite Genome Wide Association Study (GWAS) layers. By incorporating validation, ranking, and filtering approaches we produced a list of nine high priority gene candidates for involvement in the regulation of cell wall biosynthesis. We subsequently performed a detailed investigation of candidate gene GROWTH-REGULATING FACTOR 9 (PtGRF9). To investigate the role of PtGRF9 in regulating cell wall biosynthesis, we assessed the genome-wide connections of PtGRF9 and a paralog across data layers with functional enrichment analyses, predictive transcription factor binding site analysis, and an independent comparison to eQTN data. Our findings indicate that PtGRF9 likely affects the cell wall by directly repressing genes involved in cell wall biosynthesis, such as PtCCoAOMT and PtMYB.41, and indirectly by regulating homeobox genes. Furthermore, evidence suggests that PtGRF9 paralogs may act as transcriptional co-regulators that direct the global energy usage of the plant. Using our extended pipeline, we show multiple lines of evidence implicating the involvement of these genes in cell wall regulatory functions and demonstrate the value of this method for prioritizing candidate genes for experimental validation.

18.

Accelerating Climate Resilient Plant Breeding by Applying Next-Generation Artificial Intelligence.

Harfouche, Antoine L; Jacobson, Daniel A; Kainer, David; Romero, Jonathon C; Harfouche, Antoine H; Scarascia Mugnozza, Giuseppe; Moshelion, Menachem; Tuskan, Gerald A; Keurentjes, Joost J B; Altman, Arie.

Trends Biotechnol ; 37(11): 1217-1235, 2019 11.

Article in English | MEDLINE | ID: mdl-31235329

ABSTRACT

Breeding crops for high yield and superior adaptability to new and variable climates is imperative to ensure continued food security, biomass production, and ecosystem services. Advances in genomics and phenomics are delivering insights into the complex biological mechanisms that underlie plant functions in response to environmental perturbations. However, linking genotype to phenotype remains a huge challenge and is hampering the optimal application of high-throughput genomics and phenomics to advanced breeding. Critical to success is the need to assimilate large amounts of data into biologically meaningful interpretations. Here, we present the current state of genomics and field phenomics, explore emerging approaches and challenges for multiomics big data integration by means of next-generation (Next-Gen) artificial intelligence (AI), and propose a workable path to improvement.

Subjects

Crops, Agricultural/genetics; Plant Breeding/methods; Artificial Intelligence; Biomass; Climate; Climate Change; Ecosystem; Genomics/methods; Genotype; Humans; Phenomics/methods; Phenotype

19.

High marker density GWAS provides novel insights into the genomic architecture of terpene oil yield in Eucalyptus.

Kainer, David; Padovan, Amanda; Degenhardt, Joerg; Krause, Sandra; Mondal, Prodyut; Foley, William J; Külheim, Carsten.

New Phytol ; 223(3): 1489-1504, 2019 08.

Article in English | MEDLINE | ID: mdl-31066055

ABSTRACT

Terpenoid-based essential oils are economically important commodities, yet beyond their biosynthetic pathways, little is known about the genetic architecture of terpene oil yield from plants. Transport, storage, evaporative loss, transcriptional regulation and precursor competition may be important contributors to this complex trait. Here, we associate 2.39 million single nucleotide polymorphisms derived from shallow whole-genome sequencing of 468 Eucalyptus polybractea individuals with 12 traits related to the overall terpene yield, eight direct measures of terpene concentration and four biomass-related traits. Our results show that in addition to terpene biosynthesis, development of secretory cavities, where terpenes are both synthesized and stored, and transport of terpenes were important components of terpene yield. For sesquiterpene concentrations, the availability of precursors in the cytosol was important. Candidate terpene synthase genes for the production of 1,8-cineole and α-pinene, and β-pinene (which comprised > 80% of the total terpenes) were functionally characterized as a 1,8-cineole synthase and a β/α-pinene synthase. Our results provide novel insights into the genomic architecture of terpene yield and we provide candidate genes for breeding or engineering of crops for biofuels or the production of industrially valuable terpenes.

Subjects

Eucalyptus/genetics; Genome, Plant; Genome-Wide Association Study; Plant Oils/metabolism; Terpenes/metabolism; Alkyl and Aryl Transferases/genetics; Biosynthetic Pathways; Genes, Plant; Genotype; Inheritance Patterns/genetics; Multivariate Analysis; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Reproducibility of Results; Terpenes/chemistry

20.

Multitrait genome-wide association analysis of Populus trichocarpa identifies key polymorphisms controlling morphological and physiological traits.

Chhetri, Hari B; Macaya-Sanz, David; Kainer, David; Biswal, Ajaya K; Evans, Luke M; Chen, Jin-Gui; Collins, Cassandra; Hunt, Kimberly; Mohanty, Sushree S; Rosenstiel, Todd; Ryno, David; Winkeler, Kim; Yang, Xiaohan; Jacobson, Daniel; Mohnen, Debra; Muchero, Wellington; Strauss, Steven H; Tschaplinski, Timothy J; Tuskan, Gerald A; DiFazio, Stephen P.

New Phytol ; 223(1): 293-309, 2019 07.

Article in English | MEDLINE | ID: mdl-30843213

ABSTRACT

Genome-wide association studies (GWAS) have great promise for identifying the loci that contribute to adaptive variation, but the complex genetic architecture of many quantitative traits presents a substantial challenge. We measured 14 morphological and physiological traits and identified single nucleotide polymorphism (SNP)-phenotype associations in a Populus trichocarpa population distributed from California, USA to British Columbia, Canada. We used whole-genome resequencing data of 882 trees with more than 6.78 million SNPs, coupled with multitrait association to detect polymorphisms with potentially pleiotropic effects. Candidate genes were validated with functional data. Broad-sense heritability (H²) ranged from 0.30 to 0.56 for morphological traits and 0.08 to 0.36 for physiological traits. In total, 4 and 20 gene models were detected using the single-trait and multitrait association methods, respectively. Several of these associations were corroborated by additional lines of evidence, including co-expression networks, metabolite analyses, and direct confirmation of gene function through RNAi. Multitrait association identified many more significant associations than single-trait association, potentially revealing pleiotropic effects of individual genes. This approach can be particularly useful for challenging physiological traits such as water-use efficiency or complex traits such as leaf morphology, for which we were able to identify credible candidate genes by combining multitrait association with gene co-expression and co-methylation data.

Subjects

Genome-Wide Association Study; Polymorphism, Single Nucleotide/genetics; Populus/genetics; Populus/physiology; Quantitative Trait, Heritable; Down-Regulation; Gene Regulatory Networks; Genes, Plant; Genotype; Geography; Inheritance Patterns/genetics; Multivariate Analysis; Plant Stomata/physiology; Populus/anatomy & histology; Principal Component Analysis
