ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0261548.]
ABSTRACT
Training models with semi- or self-supervised learning methods is one way to reduce annotation effort, since these methods rely on unlabeled or sparsely labeled datasets. Such approaches are particularly promising for domains where annotation is time-consuming and requires specialized expertise, and where high-quality labeled machine learning datasets are scarce, as in computational pathology. Although some of these methods have been used in the histopathological domain, there is so far no comprehensive study comparing different approaches. This work therefore compares feature extractor models trained with the state-of-the-art semi- or self-supervised learning methods PAWS, SimCLR, and SimSiam within a unified framework. We show that such models, across different architectures and network configurations, have a positive performance impact on histopathological classification tasks, even in low-data regimes. Moreover, our observations suggest that features learned from a particular dataset, i.e., tissue type, are only in-domain transferable to a certain extent. Finally, we share our experience using each method in computational pathology and provide recommendations for their use.
ABSTRACT
Clinical metagenomics is a powerful diagnostic tool, as it offers an open view into all DNA in a patient's sample. This allows the detection of pathogens that would slip through the cracks of classical specific assays. However, because metagenomic sequencing is nonspecific, the sequencing itself generates a large amount of nonspecific data, and the diagnosis only takes place at the data analysis stage, where the relevant sequences are filtered from the data. Typically, this is done by comparison to reference databases. While this approach has been optimized over the past years and works well for detecting pathogens that are represented in the databases used, a common challenge in analysing a metagenomic patient sample arises when no pathogen sequences are found: how can one determine whether the data truly contain no evidence of a pathogen, or whether the pathogen's genome is simply absent from the database, so that the sequences in the dataset could not be classified? Here, we present a novel approach to this problem of detecting novel pathogens in metagenomic datasets by classifying the (segments of) proteins encoded by the sequences in the datasets. We train a neural network on coding sequences labeled by taxonomic domain, and use this neural network to predict the taxonomic classification of sequences that cannot be classified by comparison to a reference database, thus facilitating the detection of potential novel pathogens.
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Neural Networks, Computer , Algorithms , Animals , Bacteria/classification , Bacteria/genetics , DNA/classification , DNA/genetics , DNA, Bacterial/classification , DNA, Bacterial/genetics , DNA, Viral/classification , DNA, Viral/genetics , Humans , Metagenome , Viruses/classification , Viruses/genetics
ABSTRACT
Herein, an operationally simple multicomponent reaction of unprotected carbohydrates with amino acids and isonitriles is presented. Extension of this Ugi-type reaction to an unprotected disaccharide gave access to a novel glycopeptide structure.
Subject(s)
Amino Acids/chemistry , Carbohydrates/chemistry , Glycopeptides/chemical synthesis , Glycopeptides/chemistry , Molecular Structure
ABSTRACT
An organocatalyzed transformation to elongate unprotected carbohydrates is described. This operationally simple methodology is based on a Knoevenagel-oxa-Michael cascade catalyzed by proline and DBU. Products were obtained with exceptionally high degrees of stereoselectivity.
Subject(s)
Carbohydrates/chemistry , Monosaccharides/chemistry , Catalysis , Esters/chemistry , Glycosides , Proline/chemistry , Stereoisomerism
ABSTRACT
Aldol additions of unprotected carbohydrates to 1,3-dicarbonyl compounds are described. This transformation is based on dual activation by tertiary amines and 2-hydroxypyridine.