ABSTRACT
Plant samples are potential sources of physiologically active secondary metabolites, and their classification is an extremely important task in traditional medicine and other fields of research. In the production of herbal drugs, different plant parts of the same or related species can serve as adulterants for primary plant material. Highly informative and relatively accessible tools, such as liquid chromatography and low-resolution mass spectrometry, help to solve these tasks by means of fingerprint analysis. In this study, two approaches are suggested to reveal specific plant part features for 20 species from one family (Apiaceae) while preserving the maximum information content. In both cases, minimal raw data pretreatment was applied, including rescaling of the time and m/z axes and cutting off some uninformative regions. For the support vector machine (SVM) method, tensor unfolding was required, while neural networks (NNs) were able to work directly with square heatmaps as input data. Moreover, five data augmentation variants are proposed to overcome the typical problem of a lack of data. As a result, SVM and the two NN architectures employed achieved comparable F1-scores close to 0.75. Eight marker compounds belonging to chlorophylls, lipids, and coumarin apio-glucosides were tentatively identified as characteristic of their corresponding sample groups: roots, stems, leaves, and fruits. The proposed approaches are simple, preserve information, and can be applied to a broad range of tasks in metabolomics.
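The tensor unfolding step required for the SVM can be illustrated with a minimal sketch. It assumes each sample is already a heatmap on a shared, rescaled time × m/z grid; the function name `unfold`, the row-by-row flattening order, and the toy intensities are illustrative assumptions, not details taken from the study.

```python
from typing import List

# One sample: rows are retention-time bins, columns are m/z bins (assumed layout).
Heatmap = List[List[float]]


def unfold(samples: List[Heatmap]) -> List[List[float]]:
    """Flatten each (time x m/z) heatmap row by row into one feature vector.

    The result is a samples x (time * m/z) matrix, the shape a classical
    classifier such as an SVM expects.
    """
    n_time = len(samples[0])
    n_mz = len(samples[0][0])
    unfolded = []
    for heatmap in samples:
        # Every sample must share the same rescaled time and m/z grid.
        if len(heatmap) != n_time or any(len(row) != n_mz for row in heatmap):
            raise ValueError("all heatmaps must have the same shape")
        unfolded.append([value for row in heatmap for value in row])
    return unfolded


# Two toy samples on a 2 x 3 (time x m/z) grid; the intensities are made up.
toy = [
    [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
    [[5.0, 4.0, 3.0], [2.0, 1.0, 0.0]],
]
matrix = unfold(toy)
```

Unfolding discards the explicit neighbourhood structure between time and m/z bins, which is one reason a network that reads the heatmap directly may be preferred when that structure matters.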
ABSTRACT
Species of the genus Burmannia possess distinctive and highly elaborated flowers with prominent floral tubes that often bear large longitudinal wings. The complicated floral structure of Burmannia hampers understanding of the floral evolutionary morphology and biology of the genus. In addition, information on structural features believed to be taxonomically important is lacking for some species. Here we investigate flowers and inflorescences of Burmannia based on a comprehensive sampling that included eight species with various lifestyles (autotrophic, partially mycoheterotrophic and mycoheterotrophic). We describe the diversity of inflorescence architecture in the genus: the basic (most likely ancestral) inflorescence type is a thyrsoid comprising two cincinni, which is transformed into a botryoid in some species via reduction of the lateral cymes to single flowers. Burmannia oblonga differs from all the other studied species in having an adaxial (vs. transversal) floral prophyll. For the first time, we describe in detail early floral development in Burmannia. We report the presence of inner tepal lobes in B. oblonga, a species with reportedly absent inner tepals; the growth of the inner tepal lobes is arrested after the middle stage of floral development in this species, and they are therefore undetectable in a mature flower. Floral vasculature in Burmannia varies with the size of the inner tepal lobes; in B. oblonga, which has the most reduced inner tepals, their vascular supply is completely lost. The gynoecium consists of synascidiate, symplicate, and asymplicate zones. The symplicate zone is secondarily trilocular (except for its distal portion in some of the species) without visible traces of postgenital fusion, which prevented earlier researchers from correctly identifying the zones within a definitive ovary. The placentas occupy the entire symplicate zone and a short distal portion of the synascidiate zone.
Finally, we reveal an unexpected diversity of stamen-style interactions in Burmannia. In all species studied, the stamens are tightly arranged around the common style and occlude the flower entrance. However, in some species the stamens are free from the common style, whereas in others the stamen connectives are postgenitally fused with the common style, which results in the formation of a gynostegium.
ABSTRACT
The combination of Liquid Chromatography and Mass Spectrometry (LC-MS) is commonly used to determine and characterize biologically active compounds because of its high resolution and sensitivity. In this work we explore the interpretation of LC-MS data using multivariate statistical analysis algorithms to extract useful chemical information and identify clusters of similar samples. Samples of leaves from 19 plants belonging to the Apiaceae family were analyzed under unified LC conditions by high- and low-resolution mass spectrometry in a wide-range scan mode. LC-MS data preprocessing was followed by statistical analysis using tensor decomposition in the form of Parallel Factor Analysis (PARAFAC); matrix factorization following tensor unfolding with principal component analysis (PCA), independent component analysis (ICA), or non-negative matrix factorization (NMF); or unsupervised feature selection (UFS). The optimal number of components for each of these methods was found, and the results were compared using four metrics: silhouette score, Davies-Bouldin index, computational time, and number of noisy components. PCA, ICA and UFS gave the best results across the majority of the criteria for both low- and high-resolution data. An algorithm for biomarker signal selection is suggested, and 23 potential chemotaxonomic markers were tentatively identified using MS2 data. Dendrograms constructed by the methods were compared to the molecular phylogenetic tree by calculating the pixel-wise mean square error (MSE). The suggested approach can thus support chemotaxonomic studies and yield valuable chemical information for biomarker discovery.
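Two of the comparison metrics named above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes Euclidean distance for the silhouette score and assumes the dendrograms have been rasterized to equal-size grayscale grids before the pixel-wise MSE is taken. The function names and toy data are hypothetical.

```python
import math
from typing import Sequence


def silhouette_score(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    total = 0.0
    for i, p in enumerate(points):
        # Mean distance from point i to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            continue  # singleton cluster: its silhouette is defined as 0
        a = mean_dist[own]  # cohesion: mean intra-cluster distance
        b = min(d for c, d in mean_dist.items() if c != own)  # nearest other cluster
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)


def pixel_mse(img_a: Sequence[Sequence[float]], img_b: Sequence[Sequence[float]]) -> float:
    """Pixel-wise mean square error between two images of identical shape."""
    if len(img_a) != len(img_b) or any(len(r) != len(s) for r, s in zip(img_a, img_b)):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(r) for r in img_a)
    return sum((x - y) ** 2 for r, s in zip(img_a, img_b) for x, y in zip(r, s)) / n_pixels


# Toy data: two well-separated clusters in a 2-D component space.
scores = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
groups = [0, 0, 1, 1]
sil = silhouette_score(scores, groups)

# Toy 2 x 2 binary "images" that differ in one pixel.
mse = pixel_mse([[0, 1], [1, 0]], [[0, 1], [0, 0]])
```

A silhouette score near 1 indicates compact, well-separated clusters, while a lower pixel-wise MSE indicates that two rendered dendrograms overlap more closely; the MSE depends on how the trees are drawn, so both images need the same layout and resolution for the comparison to be meaningful.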