ABSTRACT
Food ontologies are acquiring a central role in human nutrition, providing a standardized terminology for a proper description of intervention and observational trials. In addition to bioactive molecules, several fermented foods, particularly dairy products, provide the host with live microorganisms, thus carrying potential "genetic/functional" nutrients. To date, a proper ontology to structure and formalize the concepts used to describe fermented foods is lacking. Here we describe a semantic representation of the concepts surrounding the consumption of fermented foods, from both a technological and a health point of view, focusing on kefir and Parmigiano Reggiano as representatives of fresh and ripened dairy products. We included concepts that connect specific microbial taxa to the dairy fermentation process, demonstrating the potential of ontologies to formalize the various gene pathways involved in raw ingredient transformation, connect them to the resulting metabolites, and finally to their consequences for the fermented product, including technological, health and sensory aspects. Our work is a step toward a harmonized semantic model for integrating different aspects of modern nutritional science. Such a model, besides formalizing multifaceted knowledge, will be pivotal for the rich annotation of data in public repositories, a prerequisite for generalized meta-analysis.
ABSTRACT
We present six datasets containing telemetry data of the Mars Express Spacecraft (MEX), a spacecraft orbiting Mars operated by the European Space Agency. The data, consisting of context data and thermal power consumption measurements, capture the status of the spacecraft over three Martian years, sampled at six different time resolutions that range from 1 min to 60 min. From a data analysis point of view, these data are challenging even for the most sophisticated state-of-the-art artificial intelligence methods. In particular, given the heterogeneity, complexity, and magnitude of the data, they can be employed in a variety of scenarios and analyzed through the prism of different machine learning tasks, such as multi-target regression, learning from data streams, anomaly detection, and clustering. Analyzing MEX's telemetry data is critical for supporting important decisions regarding the spacecraft's status and operation, extracting novel knowledge, and monitoring the spacecraft's health, but the data can also be used to benchmark artificial intelligence methods designed for a variety of tasks.
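To illustrate what sampling the same telemetry at different time resolutions can look like, the following minimal sketch averages 1-minute readings into coarser windows. It is not the procedure used to build the MEX datasets: the `downsample` function, the mean aggregation, and the synthetic readings are assumptions made purely for illustration.

```python
from statistics import mean


def downsample(samples, window_min):
    """Average (minute, value) samples into windows of `window_min` minutes.

    Mean aggregation is an assumption; the abstract does not say how the
    coarser resolutions were derived.
    """
    buckets = {}
    for minute, value in samples:
        # Align each sample to the start of the window it falls into.
        start = (minute // window_min) * window_min
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical 1-minute power readings (arbitrary units) covering two hours.
one_minute = [(m, 100.0 + (m % 60)) for m in range(120)]

# The same signal at a 60-minute resolution: one averaged value per hour.
hourly = downsample(one_minute, 60)
print(hourly)
```

Coarser windows reduce the data volume but smooth out short-lived events, which is one reason to keep several resolutions of the same signal for tasks such as anomaly detection.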
ABSTRACT
Multilabel classification (MLC) is a machine learning task where the goal is to learn to label an example with multiple labels simultaneously. It is receiving increasing interest from the machine learning community, as evidenced by the growing number of papers and methods that appear in the literature. Hence, ensuring proper, correct, robust, and trustworthy benchmarking is of utmost importance for the further development of the field. We believe that this can be achieved by adhering to recently emerged data management standards, such as the FAIR (Findable, Accessible, Interoperable, and Reusable) and TRUST (Transparency, Responsibility, User focus, Sustainability, and Technology) principles. We introduce an ontology-based online catalogue of MLC datasets originating from various application domains following these principles. The catalogue extensively describes many MLC datasets with comprehensible meta-features, MLC-specific semantic descriptions, and different data provenance information. The MLC data catalogue is available at: http://semantichub.ijs.si/MLCdatasets.
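As an illustration of what a dataset meta-feature can be, the sketch below computes label cardinality (the mean number of labels per example) and label density (cardinality divided by the number of distinct labels), two descriptors commonly used for MLC datasets. Whether the catalogue reports exactly these quantities is an assumption, and the toy dataset is invented.

```python
def label_cardinality(label_sets):
    """Mean number of labels assigned to an example."""
    return sum(len(labels) for labels in label_sets) / len(label_sets)


def label_density(label_sets):
    """Label cardinality normalized by the number of distinct labels."""
    all_labels = set().union(*label_sets)
    return label_cardinality(label_sets) / len(all_labels)


# Hypothetical toy dataset: each example carries a set of labels (possibly empty).
toy = [
    {"sports"},
    {"politics", "economy"},
    {"sports", "economy", "health"},
    set(),
]

print(label_cardinality(toy))  # 1.5
print(label_density(toy))      # 0.375
```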
Subjects
Machine Learning, Semantics, Publications
ABSTRACT
Therapies halting the progression of fibrosis are ineffective and limited. Activated myofibroblasts are emerging as important targets in the progression of fibrotic diseases. Previously, we performed a high-throughput screen on lung fibroblasts and subsequently demonstrated that the inhibition of myofibroblast activation is able to prevent lung fibrosis in bleomycin-treated mice. High-throughput screens are an ideal method of repurposing drugs, yet they contain an intrinsic limitation, which is the size of the library itself. Here, we exploited the data from our "wet" screen and used "dry" machine learning analysis to virtually screen millions of compounds, identifying novel anti-fibrotic hits which target myofibroblast differentiation, many of which were structurally related to dopamine. We synthesized and validated several compounds ex vivo ("wet") and confirmed that both dopamine and its derivative TS1 are powerful inhibitors of myofibroblast activation. We further used RNAi-mediated knock-down and demonstrated that both molecules act through the dopamine receptor 3 and exert their anti-fibrotic effect by inhibiting the canonical transforming growth factor β pathway. Furthermore, molecular modelling confirmed the capability of TS1 to bind both human and mouse dopamine receptor 3. The anti-fibrotic effect on human cells was confirmed using primary fibroblasts from idiopathic pulmonary fibrosis patients. Finally, TS1 prevented and reversed disease progression in a murine model of lung fibrosis. Both our interdisciplinary approach and our novel compound TS1 are promising tools for understanding and combating lung fibrosis.
Subjects
Bleomycin/adverse effects, Drug Discovery/methods, Antitumor Drug Screening Assays/methods, High-Throughput Screening Assays/methods, Idiopathic Pulmonary Fibrosis/chemically induced, Idiopathic Pulmonary Fibrosis/therapy, Lung Diseases/chemically induced, Lung Diseases/therapy, Machine Learning/standards, Myofibroblasts/metabolism, Animals, Cell Differentiation, Humans, Idiopathic Pulmonary Fibrosis/pathology, Lung Diseases/pathology, Mice, Transfection
ABSTRACT
New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs (homologs separated by a speciation and a duplication event, respectively) provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ~400,000 specific annotations with the estimated Precision of 90%, ~19,000 of which are highly specific (e.g. "penicillin binding," "tRNA aminoacylation for protein translation," or "pathogenesis") and are freely available at http://gorbi.irb.hr/.
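The agreement between estimated and observed Precision can be checked with a short calculation. The sketch below is not the authors' analysis: it uses only the counts stated above (25 confirmed out of 38 predictions, estimated Precision 60%) and, as an assumption, treats each prediction as an independent trial in a simple binomial model.

```python
from math import comb

confirmed = 25    # predictions confirmed experimentally (from the abstract)
tested = 38       # predictions tested (from the abstract)
estimated = 0.60  # Precision reported for these predictions (from the abstract)

# Observed Precision: fraction of tested predictions that were confirmed.
observed = confirmed / tested


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); assumes independent predictions."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Probability of confirming at least 25 of 38 if the true Precision were 60%.
tail = binomial_upper_tail(confirmed, tested, estimated)

print(f"observed Precision: {observed:.3f}")
print(f"P(X >= {confirmed} | n={tested}, p={estimated}): {tail:.3f}")
```

The observed Precision of about 66% sits slightly above the 60% estimate. Under this simplified model, a tail probability that is not small means such an outcome is unsurprising, which is consistent with the reported agreement.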