1.
Bioinformatics ; 36(22-23): 5304-5312, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33367584

ABSTRACT

MOTIVATION: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. RESULTS: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, the computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users. AVAILABILITY AND IMPLEMENTATION: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
Knowl Inf Syst ; 59(1): 137-166, 2019.
Article in English | MEDLINE | ID: mdl-32647403

ABSTRACT

Hubness is an aspect of the curse of dimensionality related to the distance concentration effect. Hubs occur in high-dimensional data spaces as objects that are particularly often among the nearest neighbors of other objects. Conversely, other data objects become antihubs, which are rarely or never nearest neighbors to other objects. Many machine learning algorithms rely on nearest neighbor search and some form of measuring distances, which are both impaired by high hubness. Degraded performance due to hubness has been reported for various tasks such as classification, clustering, regression, visualization, recommendation, retrieval and outlier detection. Several hubness reduction methods based on different paradigms have previously been developed. Local and global scaling as well as shared neighbors approaches aim at repairing asymmetric neighborhood relations. Global and localized centering try to eliminate spatial centrality, while the related global and local dissimilarity measures are based on density gradient flattening. Additional methods and alternative dissimilarity measures that were argued to mitigate detrimental effects of distance concentration also influence the related hubness phenomenon. In this paper, we present a large-scale empirical evaluation of all available unsupervised hubness reduction methods and dissimilarity measures. We investigate several aspects of hubness reduction as well as its influence on data semantics which we measure via nearest neighbor classification. Scaling and density gradient flattening methods improve evaluation measures such as hubness and classification accuracy consistently for data sets from a wide range of domains, while centering approaches achieve the same only under specific settings.
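The hubness notion described in this abstract can be made concrete with a small sketch. It is not taken from the paper: it assumes the commonly used measure in which the k-occurrence of an object is the number of times it appears among the k nearest neighbors of other objects, and hubness is the skewness of that count distribution. The function names (`k_occurrence`, `skewness`, `make_gaussian`), the Gaussian toy data, and the choice of Euclidean distance are illustrative assumptions only.

```python
import math
import random


def k_occurrence(points, k):
    """Count how often each point is among the k nearest neighbors of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Euclidean distances from point i to every other point (assumed metric)
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardized moment; returns 0.0 for a constant sequence."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def make_gaussian(n, dim, seed):
    """Hypothetical toy data: n points drawn from a standard normal in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    k = 5
    for dim in (2, 100):
        counts = k_occurrence(make_gaussian(300, dim, seed=0), k)
        antihubs = sum(1 for c in counts if c == 0)  # never a nearest neighbor
        print(f"dim={dim}: skewness={skewness(counts):.2f}, "
              f"max k-occurrence={max(counts)}, antihubs={antihubs}")
```

Comparing the two printed lines gives a rough feel for how the k-occurrence distribution changes with dimensionality; the hubness reduction methods evaluated in the paper are not implemented here.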

3.
BMC Bioinformatics ; 16 Suppl 14: S1, 2015.
Article in English | MEDLINE | ID: mdl-26451672

ABSTRACT

The accessibility of almost complete genome sequences of uncultivable microbial species from metagenomes necessitates computational methods that predict microbial phenotypes solely from genomic data. Here we investigate how comparative genomics can be utilized for the prediction of microbial phenotypes. The PICA framework facilitates application and comparison of different machine learning techniques for phenotypic trait prediction. We have improved and extended PICA's support vector machine plug-in and suggest its applicability to large-scale genome databases and incomplete genome sequences. We have demonstrated that the predictive power for phenotypic traits is stable and not perturbed by the rapid growth of genome databases. A new software tool facilitates the in-depth analysis of phenotype models, which associate expected and unexpected protein functions with particular traits. Most of the traits can be reliably predicted in genomes that are only 60-70% complete. We have established a new phenotypic model that predicts intracellular microorganisms. We could thereby demonstrate that independently evolved phenotypic traits characterized by genome reduction can also be reliably predicted based on comparative genomics. Our results suggest that the extended PICA framework can be used to automatically annotate phenotypes in near-complete microbial genome sequences, as generated in large numbers in current metagenomics studies.


Subject(s)
Chromosome Mapping/methods; Genome, Microbial; Metagenome/genetics; Metagenomics/methods; Phenotype; Sequence Analysis, DNA/methods; Software; Algorithms; Machine Learning; Support Vector Machine
4.
Front Microbiol ; 12: 645972, 2021.
Article in English | MEDLINE | ID: mdl-34168623

ABSTRACT

A very common way to classify bacteria is through microscopic images. Microscopic cell counting is a widely used technique to measure microbial growth. To date, fully automated methodologies are available for accurate and fast measurements; yet for bacteria that divide longitudinally, such as Candidatus Thiosymbion oneisti, cell counting remains mainly manual. Identifying this type of cell division is important because it helps to distinguish cells undergoing division from those that are not dividing once the sample is fixed. Our solution automates the classification of longitudinal division by using a machine learning method called a residual network. Using transfer learning, we train a binary classification model in fewer epochs than a model trained without it. This potentially eliminates most of the manual labor of classifying the type of bacterial cell division. The approach is useful for automatically labeling a certain type of bacterial division after detecting and segmenting (extracting) individual bacteria images from microscopic images of colonies.

5.
Front Mol Neurosci ; 9: 44, 2016.
Article in English | MEDLINE | ID: mdl-27378845

ABSTRACT

Atomic resolution structures of cys-loop receptors, including one of a γ-aminobutyric acid type A receptor (GABAA receptor) subtype, allow amazing insights into the structural features and conformational changes that these pentameric ligand-gated ion channels (pLGICs) display. Here we present a comprehensive analysis of more than 30 cys-loop receptor structures of homologous proteins that revealed several allosteric binding sites not previously described in GABAA receptors. These novel binding sites were examined in GABAA receptor homology models and assessed as putative candidate sites for allosteric ligands. Four so far undescribed putative ligand binding sites were proposed for follow-up studies based on their presence in the GABAA receptor homology models. A comprehensive analysis of conserved structural features in GABAA and glycine receptors (GlyRs), the glutamate-gated ion channel, the bacterial homologs from Erwinia chrysanthemi (ELIC) and Gloeobacter violaceus (GLIC), and the serotonin type 3 (5-HT3) receptor was performed. The conserved features were integrated into a master alignment that led to improved homology models. The large fragment of the intracellular domain that is present in the structure of the 5-HT3 receptor was utilized to generate GABAA receptor models with a corresponding intracellular domain fragment. Results of mutational and photoaffinity ligand studies in GABAA receptors were analyzed in the light of the model structures. This led to an assignment of candidate ligands to two proposed novel pockets; candidate binding sites for furosemide and neurosteroids in the trans-membrane domain were identified. The homology models can serve as hypothesis generators, and some previously controversial structural interpretations of biochemical data can be resolved in the light of the presented multi-template approach to comparative modeling.
Crystal and cryo-EM structures of the closest homologs that were solved in different conformational states provided important insights into structural rearrangements of binding sites during conformational transitions. The impact of structural variation and conformational motion on the shape of the investigated binding sites was analyzed. Rules for best template and alignment choice were obtained and can generally be applied to modeling of cys-loop receptors. Overall, we provide an updated structure-based view of ligand binding sites present in GABAA receptors.
