ABSTRACT
During germination plants rely entirely on their seed storage compounds to provide energy and precursors for the synthesis of macromolecular structures until the seedling has emerged from the soil and photosynthesis can be established. Lupin seeds use proteins as their major storage compounds, accounting for up to 40% of the seed dry weight. Lupins are therefore a valuable complement to soy as a source of plant protein for human and animal nutrition. The aim of this study was to elucidate how storage protein metabolism is coordinated with other metabolic processes to meet the requirements of the growing seedling. In a quantitative approach, we analysed seedling growth, as well as alterations in biomass composition, the proteome, and metabolite profiles during germination and seedling establishment in Lupinus albus. The reallocation of nitrogen resources from seed storage proteins to functional seed proteins was mapped based on a manually curated functional protein annotation database. Although classified as a protein crop, Lupinus albus does not use amino acids as a primary substrate for energy metabolism during germination. However, fatty acid and amino acid metabolism may be integrated at the level of malate synthase to combine stored carbon from lipids and proteins into gluconeogenesis.
Subjects
Amino Acids; Germination; Lupinus; Plant Proteins; Proteome; Seedlings; Lupinus/metabolism; Lupinus/growth & development; Amino Acids/metabolism; Proteome/metabolism; Seedlings/metabolism; Seedlings/growth & development; Plant Proteins/metabolism; Seeds/metabolism; Seeds/growth & development
ABSTRACT
Citrus fruits are among the most valuable and widely consumed fruits globally. The genome sequences of representative citrus species (e.g., Citrus clementina, C. sinensis, C. grandis) have been released, but the research base for mandarin molecular breeding is still poor. We assembled the genomes of Citrus unshiu and Poncirus trifoliata, two important species for the citrus industry in Japan, using hybrid de novo assembly of Illumina and PacBio sequence data, and developed the Mikan Genome Database (MiGD). The assembled genome sizes of C. unshiu and P. trifoliata are 346 and 292 Mb, respectively, similar to those of citrus species in public databases; the draft genome sequences are predicted to contain 41,489 and 34,333 protein-coding genes, of which 9,642 and 8,377, respectively, are specific when compared to C. clementina. MiGD is an integrated database of genome annotation, genetic diversity, and Cleaved Amplified Polymorphic Sequence (CAPS) marker information, with these contents being mutually linked by genes. MiGD facilitates access to genome sequences of interest from previously reported linkage maps through CAPS markers and provides polymorphism information through the multiple genome browser TASUKE. The genomic resources in MiGD (https://mikan.dna.affrc.go.jp) could provide valuable information for mandarin molecular breeding in Japan.
ABSTRACT
BACKGROUND: Chemicals of Emerging Concern (CECs) include a very wide group of chemicals that are suspected to be responsible for adverse effects on health, but for which very limited information is available. Chromatographic techniques coupled with high-resolution mass spectrometry (HRMS) can be used for non-targeted screening and detection of CECs, by using comprehensive annotation databases. Establishing a database focused on the annotation of CECs in human samples will provide new insight into the distribution and extent of exposures to a wide range of CECs in humans. OBJECTIVES: This study describes an approach for the aggregation and curation of an annotation database (CECscreen) for the identification of CECs in human biological samples. METHODS: The approach consists of three main parts. First, CEC compound lists from various sources were aggregated, and duplicates and inorganic compounds were removed. Subsequently, the list was curated by standardization of structures to create "MS-ready" and "QSAR-ready" SMILES, as well as calculation of exact masses (monoisotopic and adducts) and molecular formulas. The second step included the simulation of Phase I metabolites. The third and final step included the calculation of QSAR predictions related to physicochemical properties, environmental fate, toxicity and Absorption, Distribution, Metabolism, Excretion (ADME) processes and the retrieval of information from the US EPA CompTox Chemicals Dashboard. RESULTS: All CECscreen database and property files are publicly available (DOI: https://doi.org/10.5281/zenodo.3956586). In total, 145,284 entries were aggregated from various CEC data sources. After elimination of duplicates and curation, the pipeline produced 70,397 unique "MS-ready" structures and 66,071 unique "QSAR-ready" structures, corresponding to 69,526 CAS numbers. Simulation of Phase I metabolites resulted in 306,279 unique metabolites. QSAR predictions could be performed for 64,684 of the QSAR-ready structures, whereas information was retrieved from the CompTox Chemicals Dashboard for 59,739 CAS numbers out of 69,526 inquiries. CECscreen is incorporated in the in silico fragmentation approach MetFrag. DISCUSSION: The CECscreen database can be used to prioritize annotation of CECs measured in non-targeted HRMS, facilitating the large-scale detection of CECs in human samples for exposome research. Large-scale detection of CECs can be further improved by integrating the present database with resources that contain CECs (metabolites) and meta-data measurements, by expanding towards in silico and experimental (e.g., MassBank) generation of MS/MS spectra, and by developing bioinformatics approaches capable of using correlation patterns in the measured chemical features.
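The exact-mass calculation mentioned in the METHODS (monoisotopic masses and adducts) can be illustrated with a minimal Python sketch. This is not the CECscreen pipeline's code: the function names, the small element table, and the choice of three adducts are assumptions made for illustration, and the parser only handles simple formulas without parentheses or isotope labels.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of a few common
# elements. The table is deliberately small; extend it as needed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
    "F": 18.99840322,
    "Cl": 34.96885268,
    "Br": 78.9183371,
}
ELECTRON = 0.00054858  # electron mass in Da
SODIUM = 22.98976928   # monoisotopic mass of Na in Da

# Illustrative adduct table (an assumption, not taken from the source):
# adduct name -> (mass shift applied to the neutral molecule, charge).
ADDUCTS = {
    "[M+H]+": (MONOISOTOPIC["H"] - ELECTRON, 1),
    "[M+Na]+": (SODIUM - ELECTRON, 1),
    "[M-H]-": (-(MONOISOTOPIC["H"] - ELECTRON), -1),
}


def parse_formula(formula):
    """Parse a simple formula such as 'C8H10N4O2' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in MONOISOTOPIC:
            raise ValueError(f"no mass known for element {element!r}")
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass (Da) of the neutral molecule."""
    counts = parse_formula(formula)
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items())


def adduct_mz(formula, adduct):
    """m/z of the given adduct ion of the molecule."""
    shift, charge = ADDUCTS[adduct]
    return (monoisotopic_mass(formula) + shift) / abs(charge)


if __name__ == "__main__":
    caffeine = "C8H10N4O2"
    print(f"M       = {monoisotopic_mass(caffeine):.4f}")
    for name in ADDUCTS:
        print(f"{name:8s}= {adduct_mz(caffeine, name):.4f}")
```

For caffeine (C8H10N4O2) the sketch gives a neutral monoisotopic mass of approximately 194.0804 Da. In non-targeted HRMS annotation, precomputed values of this kind are what a measured m/z is matched against within a mass tolerance.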
Subjects
Exposome; Computer Simulation; Data Management; Databases, Factual; Humans; Tandem Mass Spectrometry
ABSTRACT
The Gene Ontology (GO) is widely recognised as the gold standard bioinformatics resource for summarising functional knowledge of gene products in a consistent, computable, and information-rich language. GO describes cellular and organismal processes across all species, yet until now there has been a considerable gene annotation deficit within the neurological and immunological domains, both of which are relevant to Parkinson's disease. Here we introduce the Parkinson's disease GO Annotation Project, funded by Parkinson's UK and supported by the GO Consortium, which is addressing this deficit by providing GO annotation to Parkinson's-relevant human gene products, principally through expert literature curation. We discuss the steps taken to prioritise proteins, publications and cellular processes for annotation, examples of how GO annotations capture Parkinson's-relevant information, and the advantages that a topic-focused annotation approach offers to users. Building on the existing GO resource, this project collates a vast amount of Parkinson's-relevant literature into a set of high-quality annotations for use by the research community.