1.
Genome Biol ; 25(1): 8, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38172911

ABSTRACT

Dramatic improvements in measuring genetic variation across agriculturally relevant populations (genomics) must be matched by improvements in identifying and measuring relevant trait variation in such populations across many environments (phenomics). Identifying the most critical opportunities and challenges in genome to phenome (G2P) research is the focus of this paper. Previously (Genome Biol, 23(1):1-11, 2022), we laid out how the Agricultural Genome to Phenome Initiative (AG2PI) will coordinate activities with USA federal government agencies, expand public-private partnerships, and engage with external stakeholders to achieve a shared vision of the future of AG2PI. Acting on this latter step, AG2PI organized the "Thinking Big: Visualizing the Future of AG2PI" two-day workshop held September 9-10, 2022, in Ames, Iowa, co-hosted with the United States Department of Agriculture's National Institute of Food and Agriculture (USDA NIFA). During the meeting, attendees were asked to use their experience and curiosity to review the current status of agricultural genome to phenome (AG2P) work and envision the future of the AG2P field. The topic summaries composing this paper are distilled from two 1.5-h small group discussions. Challenges and solutions identified across multiple topics at the workshop were explored. We end our discussion with a vision for the future of agricultural progress, identifying two areas of innovation needed: (1) innovate in genetic improvement methods development and evaluation and (2) innovate in agricultural research processes to solve societal problems. To address these needs, we then provide six specific goals that we recommend be implemented immediately in support of advancing AG2P research.


Subject(s)
Agriculture , Phenomics , United States , Genomics
2.
BMC Res Notes ; 17(1): 33, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38263080

ABSTRACT

OBJECTIVES: Phenotyping plants in a field environment can involve a variety of methods including the use of automated instruments and labor-intensive manual measurement and scoring. Researchers also collect language-based phenotypic descriptions and use controlled vocabularies and structures such as ontologies to enable computation on descriptive phenotype data, including methods to determine phenotypic similarities. In this study, spoken descriptions of plants were collected and observers were instructed to use their own vocabulary to describe plant features that were present and visible. Further, these plants were measured and scored manually as part of a larger study to investigate whether spoken plant descriptions can be used to recover known biological phenomena. DATA DESCRIPTION: Data comprise phenotypic observations of 686 accessions of the maize Wisconsin Diversity panel, and 25 positive control accessions that carry visible, dramatic phenotypes. The data include the list of accessions planted, field layout, data collection procedures, student participants' (whose personal data are protected for ethical reasons) and volunteers' observation transcripts, volunteers' audio data files, terrestrial and aerial images of the plants, Amazon Web Services method selection experimental data, and manually collected phenotypes (e.g., plant height, ear and tassel features, etc.; measurements and scores). Data were collected during the summer of 2021 at Iowa State University's Agricultural Engineering and Agronomy Research Farms.


Subject(s)
Agriculture , Humans , Wisconsin , Data Collection , Farms , Phenotype
3.
BMC Res Notes ; 17(1): 9, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38167110

ABSTRACT

OBJECTIVES: We annotated the latest published sequences of the 26 Zea mays Nested Association Mapping (NAM) founder lines using GOMAP, the Gene Ontology Meta Annotator for Plants. The maize NAM panel enables researchers to understand and identify the genetic basis of complex traits. Annotations of predicted functions for genes can help researchers investigate gene-phenotype associations, prioritize candidate genes for phenotypes of interest, and formulate testable hypotheses about gene function/phenotype associations. The creation and release of high-confidence, high-coverage gene function annotation sets for the NAM founder lines is critical to accelerate the generation of knowledge in maize genetics research. GOMAP is a high-throughput computational pipeline that annotates gene functions genome-wide in plant genomes using Gene Ontology functional class terms. Here we report and share GOMAP-generated functional annotations for the NAM founder lines. DATA DESCRIPTION: Datasets include the protein sequences used as input, GOMAP-generated annotation files, scripts used to update obsolete terms, and GAF-formatted tab-delimited text files of gene function annotations along with README files that describe formatting, content, and how files relate to each other.


Subject(s)
Genome, Plant , Zea mays , Zea mays/genetics , Genome, Plant/genetics , Phenotype
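
Note on the GAF files mentioned in the abstract above: the following is a minimal sketch of how a GAF-formatted, tab-delimited annotation file could be read into a gene-to-GO-term mapping. It assumes the standard GAF 2.x layout (comment lines begin with "!", column 2 holds the gene identifier, column 5 holds the GO ID). The function name `parse_gaf`, the helper `gaf_line`, and the example records are hypothetical and are not taken from the published datasets.

```python
from collections import defaultdict


def parse_gaf(lines):
    """Map each gene identifier to the set of GO IDs annotated to it."""
    terms_by_gene = defaultdict(set)
    for line in lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        gene_id = cols[1]  # column 2: DB Object ID
        go_id = cols[4]    # column 5: GO ID
        terms_by_gene[gene_id].add(go_id)
    return dict(terms_by_gene)


def gaf_line(gene_id, go_id):
    """Build one made-up 17-column GAF record for the demonstration below."""
    fields = ["DemoDB", gene_id, gene_id, "", go_id, "DEMO_REF:1", "IEA", "",
              "P", "", "", "gene", "taxon:4577", "20240101", "Demo", "", ""]
    return "\t".join(fields)


# Hypothetical records, not real annotations.
example = [
    "!gaf-version: 2.1",
    gaf_line("geneA", "GO:0000001"),
    gaf_line("geneA", "GO:0000002"),
    gaf_line("geneB", "GO:0000001"),
]

annotations = parse_gaf(example)
for gene in sorted(annotations):
    print(gene, sorted(annotations[gene]))
```

The README files distributed with the datasets remain the authoritative description of the actual file contents.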
4.
Gigascience ; 11, 2022 04 15.
Article in English | MEDLINE | ID: mdl-35426911

ABSTRACT

BACKGROUND: Genome-wide gene function annotations are useful for hypothesis generation and for prioritizing candidate genes potentially responsible for phenotypes of interest. We functionally annotated the genes of 18 crop plant genomes across 14 species using the GOMAP pipeline. RESULTS: By comparison to existing GO annotation datasets, GOMAP-generated datasets cover more genes, contain more GO terms, and are similar in quality (based on precision and recall metrics using existing gold standards as the basis for comparison). From there, we sought to determine whether the datasets across multiple species could be used together to carry out comparative functional genomics analyses in plants. To test the idea and as a proof of concept, we created dendrograms of functional relatedness based on terms assigned for all 18 genomes. These dendrograms were compared to well-established species-level evolutionary phylogenies to determine whether the derived trees were in agreement with known evolutionary relationships, which they largely are. Where discrepancies were observed, we determined branch support based on jackknifing, then removed individual annotation sets by genome to identify the annotation sets causing unexpected relationships. CONCLUSIONS: GOMAP-derived functional annotations used together across multiple species generally retain sufficient biological signal to recover known phylogenetic relationships based on genome-wide functional similarities, indicating that comparative functional genomics across species based on GO data holds promise for generating novel hypotheses about comparative gene function and traits.


Subject(s)
Genome, Plant , Genomics , Databases, Genetic , Gene Ontology , Molecular Sequence Annotation , Phylogeny , Plants/genetics
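
Note on "dendrograms of functional relatedness" in the abstract above: the abstract does not say which distance measure was used, so the sketch below is only one plausible illustration. It assumes each genome is summarized as the set of GO terms assigned to it and uses the Jaccard distance between those sets as the pairwise input that a clustering step could turn into a dendrogram. The genome names and term sets are made up.

```python
def jaccard_distance(terms_a, terms_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of GO terms."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        # Two empty annotation sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(term_sets):
    """Build a symmetric matrix of pairwise distances, ordered by genome name."""
    names = sorted(term_sets)
    matrix = [[jaccard_distance(term_sets[x], term_sets[y]) for y in names]
              for x in names]
    return names, matrix


# Hypothetical genomes and GO term sets, for illustration only.
genomes = {
    "genome_x": {"GO:0000001", "GO:0000002", "GO:0000003"},
    "genome_y": {"GO:0000002", "GO:0000003", "GO:0000004"},
    "genome_z": {"GO:0000005"},
}

names, matrix = distance_matrix(genomes)
for name, row in zip(names, matrix):
    print(name, row)
```

A jackknife in this setting would repeat the calculation on resampled subsets of the data and record how often each grouping reappears; that step is omitted here.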
7.
Plant Methods ; 17(1): 54, 2021 May 25.
Article in English | MEDLINE | ID: mdl-34034755

ABSTRACT

Annotating gene structures and functions to genome assemblies is necessary to make assembly resources useful for biological inference. Gene Ontology (GO) term assignment is the most used functional annotation system, and new methods for GO assignment have improved the quality of GO-based function predictions. The Gene Ontology Meta Annotator for Plants (GOMAP) is an optimized, high-throughput, and reproducible pipeline for genome-scale GO annotation of plants. We containerized GOMAP to increase portability and reproducibility and also optimized its performance for HPC environments. Here we report on the pipeline's availability and performance for annotating large, repetitive plant genomes and describe how GOMAP was used to annotate multiple maize genomes as a test case. Assessment shows that GOMAP expands and improves the number of genes annotated and annotations assigned per gene as well as the quality (based on [Formula: see text]) of GO assignments in maize. GOMAP has been deployed to annotate other species including wheat, rice, barley, cotton, and soy. Instructions and access to the GOMAP Singularity container are freely available online at https://bioinformapping.com/gomap/ . A list of annotated genomes and links to data is maintained at https://dill-picl.org/projects/gomap/ .

8.
Plant Phenomics ; 2020: 1963251, 2020.
Article in English | MEDLINE | ID: mdl-33313544

ABSTRACT

Many newly observed phenotypes are first described, then experimentally manipulated. These language-based descriptions appear in both the literature and in community datastores. To standardize phenotypic descriptions and enable simple data aggregation and analysis, controlled vocabularies and specific data architectures have been developed. Such simplified descriptions have several advantages over natural language: they can be rigorously defined for a particular context or problem, they can be assigned and interpreted programmatically, and they can be organized in a way that allows for semantic reasoning (inference of implicit facts). Because researchers generally report phenotypes in the literature using natural language, curators have been translating phenotypic descriptions into controlled vocabularies for decades to make the information computable. Unfortunately, this methodology is highly dependent on human curation, which does not scale to the scope of all publications available across all of plant biology. Simultaneously, researchers in other domains have been working to enable computation on natural language. This has resulted in new, automated methods for computing on language that are now available, with early analyses showing great promise. Natural language processing (NLP) coupled with machine learning (ML) allows for the use of unstructured language for direct analysis of phenotypic descriptions. Indeed, we have found that these automated methods can be used to create data structures that perform as well as or better than those generated by human curators on tasks such as predicting gene function and biochemical pathway membership. Here, we describe current and ongoing efforts to provide tools for the plant phenomics community to explore novel predictions that can be generated using these techniques. We also describe how these methods could be used along with mobile speech-to-text tools to collect and analyze in-field spoken phenotypic descriptions for association genetics and breeding applications.

10.
BMC Genomics ; 21(1): 193, 2020 Mar 02.
Article in English | MEDLINE | ID: mdl-32122303

ABSTRACT

BACKGROUND: Genome assemblies are foundational for understanding the biology of a species. They provide a physical framework for mapping additional sequences, thereby enabling characterization of, for example, genomic diversity and differences in gene expression across individuals and tissue types. Quality metrics for genome assemblies gauge both the completeness and contiguity of an assembly and help provide confidence in downstream biological insights. To compare quality across multiple assemblies, a set of common metrics is typically calculated and then compared to one or more gold standard reference genomes. While several tools exist for calculating individual metrics, applications providing comprehensive evaluations of multiple assembly features are, perhaps surprisingly, lacking. Here, we describe a new toolkit that integrates multiple metrics to characterize both assembly and gene annotation quality in a way that enables comparison across multiple assemblies and assembly types. RESULTS: Our application, named GenomeQC, is an easy-to-use and interactive web framework that integrates various quantitative measures to characterize genome assemblies and annotations. GenomeQC provides researchers with a comprehensive summary of these statistics and allows for benchmarking against gold standard reference assemblies. CONCLUSIONS: The GenomeQC web application is implemented in R/Shiny version 1.5.9 and Python 3.6 and is freely available at https://genomeqc.maizegdb.org/ under the GPL license. All source code and a containerized version of the GenomeQC pipeline are available in the GitHub repository https://github.com/HuffordLab/GenomeQC.


Subject(s)
Genomics/methods , Chromosome Mapping , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Sequence Analysis, DNA , Software
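
Note on assembly contiguity metrics in the abstract above: the abstract does not list the individual metrics, so the sketch below illustrates N50, a widely used contiguity statistic, as an assumed example rather than as GenomeQC's own code. N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths: total 290, half 145.
# Running sum: 80, then 150, so the second-longest contig sets N50.
lengths = [80, 70, 50, 40, 30, 20]
print(n50(lengths))  # prints 70
```

Comparing this value between an assembly and a gold standard reference gives a rough sense of relative contiguity, although it says nothing about completeness or correctness.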
11.
Front Genet ; 11: 592769, 2020.
Article in English | MEDLINE | ID: mdl-33763106

ABSTRACT

Genomic prediction provides an efficient alternative to conventional phenotypic selection for developing improved cultivars with desirable characteristics. New and improved methods for genomic prediction are continually being developed that attempt to deal with the integration of data types beyond genomic information. Modern automated weather systems offer the opportunity to capture continuous data on a range of environmental parameters at specific field locations. In principle, this information could characterize training and target environments and enhance predictive ability by incorporating weather characteristics as part of the genotype-by-environment (G×E) interaction component in prediction models. We assessed the usefulness of including weather data variables in genomic prediction models using a naïve environmental kinship model across 30 environments comprising the Genomes to Fields (G2F) initiative in 2014 and 2015. Specifically, four different prediction scenarios were evaluated: (i) tested genotypes in observed environments; (ii) untested genotypes in observed environments; (iii) tested genotypes in unobserved environments; and (iv) untested genotypes in unobserved environments. A set of 1,481 unique hybrids were evaluated for grain yield. Evaluations were conducted using five different models including main effect of environments; general combining ability (GCA) effects of the maternal and paternal parents modeled using the genomic relationship matrix; specific combining ability (SCA) effects between maternal and paternal parents; interactions between genetic (GCA and SCA) effects and environmental effects; and finally interactions between the genetic effects and environmental covariates. Incorporation of the genotype-by-environment interaction term improved predictive ability across all scenarios. However, predictive ability was not improved through inclusion of naïve environmental covariates in G×E models. More research should be conducted to link the observed weather conditions with important physiological aspects in plant development to improve predictive ability through the inclusion of weather data.

12.
Sci Rep ; 9(1): 19902, 2019 12 27.
Article in English | MEDLINE | ID: mdl-31882637

ABSTRACT

An important advantage of delivering CRISPR reagents into cells as a ribonucleoprotein (RNP) complex is the ability to edit genes without reagents being integrated into the genome. Transient presence of RNP molecules in cells can reduce undesirable off-target effects. One method for RNP delivery into plant cells is the use of a biolistic gun. To facilitate selection of transformed cells during RNP delivery, a plasmid carrying a selectable marker gene can be co-delivered with the RNP to enrich for transformed/edited cells. In this work, we compare targeted mutagenesis in rice using three different delivery platforms: biolistic RNP/DNA co-delivery; biolistic DNA delivery; and Agrobacterium-mediated delivery. All three platforms were successful in generating desired mutations at the target sites. However, we observed a high frequency (over 14%) of random plasmid or chromosomal DNA fragment insertion at the target sites in transgenic events generated from both biolistic delivery platforms. In contrast, integration of random DNA fragments was not observed in transgenic events generated from the Agrobacterium-mediated method. These data reveal important insights that must be considered when selecting the method for genome-editing reagent delivery in plants, and emphasize the importance of employing appropriate molecular screening methods to detect unintended alterations following genome engineering.


Subject(s)
CRISPR-Cas Systems/genetics , Oryza/genetics , Plasmids/genetics , RNA, Plant/genetics , Agrobacterium/genetics , DNA Fragmentation , Ribonucleoproteins/genetics , Ribonucleoproteins/metabolism
13.
Plant Methods ; 15: 117, 2019.
Article in English | MEDLINE | ID: mdl-31660060

ABSTRACT

BACKGROUND: Assessing the impact of the environment on plant performance requires growing plants under controlled environmental conditions. Plant phenotypes are a product of genotype × environment (G × E), and the Enviratron at Iowa State University is a facility for testing under controlled conditions the effects of the environment on plant growth and development. Crop plants (including maize) can be grown to maturity in the Enviratron, and the performance of plants under different environmental conditions can be monitored 24 h per day, 7 days per week throughout the growth cycle. RESULTS: The Enviratron is an array of custom-designed plant growth chambers that simulate different environmental conditions coupled with precise sensor-based phenotypic measurements carried out by a robotic rover. The rover has workflow instructions to periodically visit plants growing in the different chambers where it measures various growth and physiological parameters. The rover consists of an unmanned ground vehicle, an industrial robotic arm and an array of sensors including RGB, visible and near infrared (VNIR) hyperspectral, thermal, and time-of-flight (ToF) cameras, laser profilometer and pulse-amplitude modulated (PAM) fluorometer. The sensors are autonomously positioned for detecting leaves in the plant canopy, collecting various physiological measurements based on computer vision algorithms and planning motion via "eye-in-hand" movement control of the robotic arm. In particular, the automated leaf probing function that allows the precise placement of sensor probes on leaf surfaces presents a unique advantage of the Enviratron system over other types of plant phenotyping systems. CONCLUSIONS: The Enviratron offers a new level of control over plant growth parameters and optimizes positioning and timing of sensor-based phenotypic measurements. Plant phenotypes in the Enviratron are measured in situ, in that the rover takes sensors to the plants rather than moving plants to the sensors.

14.
Front Plant Sci ; 10: 1050, 2019.
Article in English | MEDLINE | ID: mdl-31555312

ABSTRACT

Background: An organism can be described by its observable features (phenotypes) and the genes and genomic information (genotypes) that cause these phenotypes. For many decades, researchers have tried to find relationships between genotypes and phenotypes, and great strides have been made. However, improved methods and tools for discovering and visualizing these phenotypic relationships are still needed. The maize genetics and genomics database (MaizeGDB, www.maizegdb.org) provides an array of useful resources for diverse data types including thousands of images related to mutant phenotypes in Zea mays ssp. mays (maize). To integrate mutant phenotype images with genomics information, we implemented and enhanced the web-based software package BioDIG (Biological Database of Images and Genomes). Findings: We developed a genotype-phenotype database for maize called MaizeDIG. MaizeDIG has several enhancements over the original BioDIG package. MaizeDIG, which supports multiple reference genome assemblies, is seamlessly integrated with genome browsers to accommodate custom tracks showing tagged mutant phenotype images in their genomic context and allows for custom tagging of images to highlight the phenotype. This is accomplished through an updated interface allowing users to create image-to-gene links and is accessible via the image search tool. Conclusions: We have created a user-friendly and extensible web-based resource called MaizeDIG. MaizeDIG is preloaded with 2,396 images that are available on genome browsers for 10 different maize reference genomes. Approximately 90 images of classically defined maize genes have been manually annotated. MaizeDIG is available at http://maizedig.maizegdb.org/. The code is free and open source and can be found at https://github.com/Maize-Genetics-and-Genomics-Database/maizedig.

15.
Front Plant Sci ; 10: 1629, 2019.
Article in English | MEDLINE | ID: mdl-31998331

ABSTRACT

Natural language descriptions of plant phenotypes are a rich source of information for genetics and genomics research. We computationally translated descriptions of plant phenotypes into structured representations that can be analyzed to identify biologically meaningful associations. These representations include the entity-quality (EQ) formalism, which uses terms from biological ontologies to represent phenotypes in a standardized, semantically rich format, as well as numerical vector representations generated using natural language processing (NLP) methods (such as the bag-of-words approach and document embedding). We compared resulting phenotype similarity measures to those derived from manually curated data to determine the performance of each method. Computationally derived EQ and vector representations were comparably successful in recapitulating biological truth to representations created through manual EQ statement curation. Moreover, NLP methods for generating vector representations of phenotypes are scalable to large quantities of text because they require no human input. These results indicate that it is now possible to computationally and automatically produce and populate large-scale information resources that enable researchers to query phenotypic descriptions directly.
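
Note on the bag-of-words approach named in the abstract above: the sketch below shows, in simplified form, how two free-text phenotype descriptions could be turned into word-count vectors and compared with cosine similarity. It is an illustration under stated assumptions, not the authors' pipeline: the tokenizer (lowercase alphabetic words only), the function names, and the example descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphabetic tokens in a free-text description."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up phenotype descriptions, for illustration only.
desc_1 = bag_of_words("short plant with narrow leaves")
desc_2 = bag_of_words("tall plant with wide leaves")
desc_3 = bag_of_words("yellow kernels")

print(cosine_similarity(desc_1, desc_2))  # three of five words shared
print(cosine_similarity(desc_1, desc_3))  # no words shared
```

Raw word counts treat every word as equally informative; weighting schemes and document embeddings, which the abstract also mentions, are common ways to address that limitation.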

16.
Plant Biotechnol J ; 17(2): 362-372, 2019 02.
Article in English | MEDLINE | ID: mdl-29972722

ABSTRACT

CRISPR/Cas9 and Cas12a (Cpf1) nucleases are two of the most powerful genome editing tools in plants. In this work, we compared their activities by targeting the maize glossy2 gene coding region that has overlapping sequences recognized by both nucleases. We introduced constructs carrying SpCas9-guide RNA (gRNA) and LbCas12a-CRISPR RNA (crRNA) into maize inbred B104 embryos using Agrobacterium-mediated transformation. On-target mutation analysis showed that 90%-100% of the Cas9-edited T0 plants carried indel mutations and 63%-77% of them were homozygous or biallelic mutants. In contrast, 0%-60% of Cas12a-edited T0 plants had on-target mutations. We then conducted CIRCLE-seq analysis to identify genome-wide potential off-target sites for Cas9. A total of 18 and 67 potential off-targets were identified for the two gRNAs, respectively, with an average of five mismatches compared to the target sites. Sequencing analysis of a selected subset of the off-target sites revealed no detectable level of mutations in the T1 plants, which constitutively express Cas9 nuclease and gRNAs. In conclusion, our results suggest that the CRISPR/Cas9 system used in this study is highly efficient and specific for genome editing in maize, while CRISPR/Cas12a needs further optimization for improved editing efficiency.


Subject(s)
CRISPR-Cas Systems , Endonucleases/metabolism , Gene Editing/methods , Genome, Plant/genetics , Zea mays/enzymology , Agrobacterium , Endonucleases/genetics , Gene Targeting/methods , Mutagenesis , Mutation , RNA, Guide, Kinetoplastida/genetics , Sequence Alignment , Zea mays/genetics
17.
PLoS Comput Biol ; 14(7): e1006337, 2018 07.
Article in English | MEDLINE | ID: mdl-30059508

ABSTRACT

The accuracy of machine learning tasks critically depends on high quality ground truth data. Therefore, in many cases, producing good ground truth data typically involves trained professionals; however, this can be costly in time, effort, and money. Here we explore the use of crowdsourcing to generate a large amount of training data of good quality. We explore an image analysis task involving the segmentation of corn tassels from images taken in a field setting. We investigate the accuracy, speed and other quality metrics when this task is performed by students for academic credit, Amazon MTurk workers, and Master Amazon MTurk workers. We conclude that the Amazon MTurk and Master MTurk workers perform significantly better than the for-credit students, but with no significant difference between the two MTurk worker types. Furthermore, the quality of the segmentation produced by Amazon MTurk workers rivals that of an expert worker. We provide best practices to assess the quality of ground truth data, and to compare data quality produced by different sources. We conclude that properly managed crowdsourcing can be used to establish large volumes of viable ground truth data at a low cost and high quality, especially in the context of high throughput plant phenotyping. We also provide several metrics for assessing the quality of the generated datasets.


Subject(s)
Crops, Agricultural/physiology , Crowdsourcing/methods , Image Processing, Computer-Assisted/methods , Machine Learning , Algorithms , Data Accuracy , Food Supply , Humans , Internet , Phenotype , Pilot Projects
18.
BMC Res Notes ; 11(1): 452, 2018 Jul 09.
Article in English | MEDLINE | ID: mdl-29986751

ABSTRACT

OBJECTIVES: Crop improvement relies on analysis of phenotypic, genotypic, and environmental data. Given large, well-integrated, multi-year datasets, diverse queries can be made: Which lines perform best in hot, dry environments? Which alleles of specific genes are required for optimal performance in each environment? Such datasets also can be leveraged to predict cultivar performance, even in uncharacterized environments. The maize Genomes to Fields (G2F) Initiative is a multi-institutional organization of scientists working to generate and analyze such datasets from existing, publicly available inbred lines and hybrids. G2F's genotype by environment project has released 2014 and 2015 datasets to the public, with 2016 and 2017 collected and soon to be made available. DATA DESCRIPTION: Datasets include DNA sequences; traditional phenotype descriptions, as well as detailed ear, cob, and kernel phenotypes quantified by image analysis; weather station measurements; and soil characterizations by site. Data are released as comma separated value spreadsheets accompanied by extensive README text descriptions. For genotypic and phenotypic data, both raw data and a version with outliers removed are reported. For weather data, two versions are reported: a full dataset calibrated against nearby National Weather Service sites and a second calibrated set with outliers and apparent artifacts removed.


Subject(s)
Datasets as Topic , Genotype , Phenotype , Zea mays/genetics , Environment , Genome, Plant , Inbreeding , Plant Breeding , Seasons , Sequence Analysis, DNA
19.
Plant Cell ; 30(6): 1220-1242, 2018 06.
Article in English | MEDLINE | ID: mdl-29802214

ABSTRACT

The unfolded protein response (UPR) is a highly conserved response that protects plants from adverse environmental conditions. The UPR is elicited by endoplasmic reticulum (ER) stress, in which unfolded and misfolded proteins accumulate within the ER. Here, we induced the UPR in maize (Zea mays) seedlings to characterize the molecular events that occur over time during persistent ER stress. We found that a multiphasic program of gene expression was interwoven among other cellular events, including the induction of autophagy. One of the earliest phases involved the degradation by regulated IRE1-dependent RNA degradation (RIDD) of RNA transcripts derived from a family of peroxidase genes. RIDD resulted from the activation of the promiscuous ribonuclease activity of ZmIRE1 that attacks the mRNAs of secreted proteins. This was followed by an upsurge in expression of the canonical UPR genes indirectly driven by ZmIRE1 due to its splicing of Zmbzip60 mRNA to make an active transcription factor that directly upregulates many of the UPR genes. At the peak of UPR gene expression, a global wave of RNA processing led to the production of many aberrant UPR gene transcripts, likely tempering the ER stress response. During later stages of ER stress, ZmIRE1's activity declined, as did the expression of survival modulating genes, Bax inhibitor1 and Bcl-2-associated athanogene7, amid a rising tide of cell death. Thus, in response to persistent ER stress, maize seedlings embark on a course of gene expression and cellular events progressing from adaptive responses to cell death.


Subject(s)
Cell Death/physiology , Endoplasmic Reticulum Stress/physiology , Unfolded Protein Response/physiology , Zea mays/cytology , Zea mays/metabolism , Cell Death/genetics , Endoplasmic Reticulum Stress/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Unfolded Protein Response/genetics , Zea mays/genetics
20.
GM Crops Food ; 9(2): 53-58, 2018.
Article in English | MEDLINE | ID: mdl-29561212

ABSTRACT

Biotech news coverage in English-language Russian media fits the profile of the Russian information warfare strategy described in recent military reports. This raises the question of whether Russia views the dissemination of anti-GMO information as just one of many divisive issues it can exploit as part of its information war, or if GMOs serve more expansive disruptive purposes. Distinctive patterns in Russian news provide evidence of a coordinated information campaign that could turn public opinion against genetic engineering. The recent branding of Russian agriculture as the ecologically clean alternative to genetically engineered foods is suggestive of an economic motive behind the information campaign against western biotechnologies.


Subject(s)
Organisms, Genetically Modified/metabolism , Agriculture , Russia , Social Media