Results 1 - 9 of 9
1.
Bioinformatics ; 40(3)2024 03 04.
Article in English | MEDLINE | ID: mdl-38377398

ABSTRACT

MOTIVATION: Missing values are commonly observed in metabolomics data from mass spectrometry. Imputing them is crucial because it assures data completeness, increases the statistical power of analyses, prevents inaccurate results, and improves the quality of exploratory analysis, statistical modeling, and machine learning. Numerous Missing Value Imputation Algorithms (MVIAs) employ heuristics or statistical models to replace missing information with estimates. In the context of metabolomics data, we identified 52 MVIAs implemented across 70 R functions. Nevertheless, the usage of those 52 established methods poses challenges due to package dependency issues, lack of documentation, and their instability. RESULTS: Our R package, 'imputomics', provides a convenient wrapper around 41 (plus random imputation as a baseline model) out of 52 MVIAs in the form of a command-line tool and a web application. In addition, we propose a novel functionality for selecting MVIAs recommended for metabolomics data with the best performance or execution time. AVAILABILITY AND IMPLEMENTATION: 'imputomics' is freely available as an R package (github.com/BioGenies/imputomics) and a Shiny web application (biogenies.info/imputomics-ws). The documentation is available at biogenies.info/imputomics.
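
To make the idea of replacing missing values with estimates concrete, here is a minimal sketch of two simple, generic strategies (median and half-minimum imputation) applied to one metabolite column. It does not use or reproduce the 'imputomics' API; the function name `impute_column`, the strategy labels, and the sample intensities are assumptions chosen for illustration only.

```python
import math
import statistics


def impute_column(values, strategy="median"):
    """Replace NaN entries in one metabolite column with a single estimate.

    Hypothetical helper for illustration; not part of any package named above.
    """
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    if strategy == "median":
        # Fill with the median of the observed intensities.
        fill = statistics.median(observed)
    elif strategy == "half_min":
        # Fill with half of the smallest observed intensity.
        fill = min(observed) / 2.0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if math.isnan(v) else v for v in values]


nan = float("nan")
intensities = [10.0, nan, 14.0, 12.0, nan]  # made-up example values
median_filled = impute_column(intensities, "median")
half_min_filled = impute_column(intensities, "half_min")
```

The half-minimum variant reflects the common assumption that values are missing because they fall below the detection limit, whereas the median variant assumes they are missing at random; which assumption holds depends on the dataset.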


Subject(s)
Metabolomics; Software; Metabolomics/methods; Algorithms; Computers; Mass Spectrometry/methods
2.
Nucleic Acids Res ; 51(D1): D352-D357, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36243982

ABSTRACT

Information about the impact of interactions between amyloid proteins on their fibrillization propensity is scattered among many experimental articles and presented in unstructured form. We manually curated information located in almost 200 publications (selected out of 562 initially considered), obtaining details of 883 experimentally studied interactions between 46 amyloid proteins or peptides. We also proposed a novel standardized terminology for the description of amyloid-amyloid interactions, which is included in our database and covers all currently known types of such cross-talk, including inhibition of fibrillization, cross-seeding and other phenomena. By providing well-defined data, the new approach allows for more specific studies on amyloids and their interactions. AmyloGraph, an online database presenting information on amyloid-amyloid interactions, is available at http://AmyloGraph.com/. Its functionalities are also accessible as an R package (https://github.com/KotulskaLab/AmyloGraph). AmyloGraph is the only publicly available repository for experimentally determined amyloid-amyloid interactions.


Subject(s)
Amyloid; Amyloidogenic Proteins; Amyloidogenic Proteins/metabolism; Peptides; Databases, Protein
3.
Appl Environ Microbiol ; 88(5): e0227921, 2022 03 08.
Article in English | MEDLINE | ID: mdl-35020452

ABSTRACT

Pathogenic bacteria, such as enteropathogenic Escherichia coli (EPEC) and enterotoxigenic E. coli (ETEC), cause diarrhea in mammals. In particular, E. coli colonizes and infects the gastrointestinal tract via type 1 fimbriae (T1F). Here, the major zymogen granule membrane glycoprotein 2 (GP2) acts as a host cell receptor. GP2 is also secreted by the pancreas and various mucous glands, interacting with luminal type 1 fimbriae-positive E. coli. It is unknown whether GP2 isoforms demonstrate specific E. coli pathotype binding. In this study, we investigated interactions of human, porcine, and bovine EPEC and ETEC, as well as commensal E. coli isolates, with human, porcine, and bovine GP2. We first defined pathotype- and host-associated FimH variants. Second, we showed that GP2 isoforms bound to FimH variants to various degrees. However, the GP2-FimH interactions did not seem to be influenced by the host specificity of E. coli. In contrast, soluble GP2 affected ETEC infection and phagocytosis rates of macrophages. Preincubation of the ETEC pathotype with GP2 reduced the infection of cell lines. Furthermore, preincubation of E. coli with GP2 improved the phagocytosis rate of macrophages. Our findings suggest that GP2 plays a role in the defense against E. coli infection and in the corresponding host immune response. IMPORTANCE: Infection by pathogenic bacteria, such as certain Escherichia coli pathotypes, results in diarrhea in mammals. Pathogens, including zoonotic agents, can infect different hosts or show host specificity. Some Escherichia coli strains are frequently transmitted between humans and animals, whereas others tend to colonize only one host. This host specificity is still not fully understood. We show that glycoprotein 2 is a selective receptor for particular Escherichia coli strains or variants of the adhesin FimH but not a selector for a species-specific Escherichia coli group. We demonstrate that GP2 is involved in the regulation of colonization and infection and thus represents a molecule of interest for the prevention or treatment of disease.


Subject(s)
Enteropathogenic Escherichia coli; Enterotoxigenic Escherichia coli; Escherichia coli Infections; Animals; Cattle; Diarrhea/microbiology; Escherichia coli Infections/microbiology; Escherichia coli Infections/veterinary; Fimbriae, Bacterial/metabolism; Mammals; Membrane Glycoproteins/metabolism; Secretory Vesicles/metabolism; Swine
4.
Int J Mol Sci ; 21(12)2020 Jun 17.
Article in English | MEDLINE | ID: mdl-32560350

ABSTRACT

Antimicrobial peptides (AMPs) are molecules widespread in all branches of the tree of life that participate in host defense and/or microbial competition. Due to their positive charge, hydrophobicity and amphipathicity, they preferentially disrupt negatively charged bacterial membranes. AMPs are considered an important alternative to traditional antibiotics, especially at a time when multidrug-resistant bacteria are on the rise. Therefore, to reduce the costs of experimental research, robust computational tools for AMP prediction and identification of the best AMP candidates are essential. AmpGram is our novel tool for AMP prediction; it outperforms top-ranking AMP classifiers, including AMPScanner, CAMPR3R and iAMPpred. It is the first AMP prediction tool created for longer AMPs and for high-throughput proteomic screening. The reliability of AmpGram predictions was confirmed on the examples of lactoferrin and thrombin. The former is a well-known antimicrobial protein and the latter a cryptic one. After protease treatment, both proteins produce functional AMPs that have been experimentally validated at the molecular level. The lactoferrin and thrombin AMPs were located in the antimicrobial regions clearly detected by AmpGram. Moreover, AmpGram provides a list of short 10-amino-acid fragments in the antimicrobial regions, along with their probability predictions; these can be used for further studies and the rational design of new AMPs. AmpGram is available as a web server and as an easy-to-use R package for proteomic analysis in the CRAN repository.
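
As a rough illustration of what a list of 10-amino-acid fragments of a protein looks like, the sketch below enumerates fixed-width windows along a sequence. It is not AmpGram's code: the function name `sliding_fragments`, the choice of overlapping windows with a step of one residue, and the toy sequence are assumptions, and no probability prediction is attempted.

```python
def sliding_fragments(sequence, width=10):
    """Return (start_index, fragment) pairs for every window of `width` residues.

    Hypothetical helper; overlapping windows with step 1 are an assumption.
    A sequence shorter than `width` yields an empty list.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [
        (start, sequence[start:start + width])
        for start in range(len(sequence) - width + 1)
    ]


toy_sequence = "ACDEFGHIKLMN"  # made-up 12-residue sequence
fragments = sliding_fragments(toy_sequence, width=10)
for start, fragment in fragments:
    print(start, fragment)
```

In a real predictor, each fragment would then be scored by a trained model; here the fragments are only listed with their start positions.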


Subject(s)
Antimicrobial Cationic Peptides/chemistry; Drug Design; Drug Discovery/methods; Proteomics; Software; Area Under Curve; Databases, Factual; Microbial Sensitivity Tests; Proteomics/methods; Sensitivity and Specificity; Web Browser
5.
Int J Mol Sci ; 19(12)2018 Nov 22.
Article in English | MEDLINE | ID: mdl-30469512

ABSTRACT

Signal peptides are N-terminal presequences responsible for targeting proteins to the endomembrane system and subsequent subcellular or extracellular compartments, and consequently condition their proper function. The significance of signal peptides stimulates the development of new computational methods for their detection. These methods employ learning systems trained on datasets comprising signal peptides from different types of proteins and taxonomic groups. As a result, the accuracy of predictions is high in the case of signal peptides that are well represented in databases, but might be low in other, atypical cases. Such atypical signal peptides are present in proteins found in apicomplexan parasites, causative agents of malaria and toxoplasmosis. Apicomplexan proteins have a unique amino acid composition due to their AT-biased genomes. Therefore, we designed a new, more flexible and universal probabilistic model for recognition of atypical eukaryotic signal peptides. Our approach, called signalHsmm, includes knowledge about the structure of signal peptides and the physicochemical properties of amino acids. It is able to recognize signal peptides from the malaria parasites and related species more accurately than popular programs. Moreover, it is still universal enough to predict other signal peptides on par with the best-performing predictors.


Subject(s)
Plasmodium/chemistry; Protein Sorting Signals; Protozoan Proteins/chemistry; Sequence Analysis, Protein/methods; Amino Acids/chemistry; Markov Chains; Sequence Analysis, Protein/standards
6.
Comput Struct Biotechnol J ; 23: 1951-1958, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38736697

ABSTRACT

NanoString nCounter is a medium-throughput technology used in mRNA and miRNA differential expression studies. It offers several advantages, including the absence of an amplification step and the ability to analyze low-grade samples. Despite its considerable strengths, the popularity of the nCounter platform in experimental research stabilized in 2022 and 2023, and this trend may continue in the upcoming years. Such stagnation could potentially be attributed to the absence of a standardized analytical pipeline or of guidance on optimal processing methods for nCounter data analysis. To standardize the description of the nCounter data analysis workflow, we divided it into five distinct steps: data pre-processing, quality control, background correction, normalization and differential expression analysis. Next, we evaluated eleven R packages dedicated to nCounter data processing to point out functionalities belonging to these steps and provide comments on their applications in studies of mRNA and miRNA samples.
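
To give a feel for the normalization step named above, the following sketch rescales samples so that the geometric mean of a set of reference (for example, housekeeping) genes is the same in every sample. This is one generic scaling approach, assumed here for illustration; it is not taken from the abstract or from any of the eleven evaluated R packages, and the function names and counts are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scaling_factors(reference_counts):
    """Map each sample to a factor that equalizes reference-gene geometric means.

    `reference_counts` maps a sample name to its positive reference-gene counts.
    The target is the arithmetic mean of the per-sample geometric means
    (an assumption; other targets are possible).
    """
    per_sample = {s: geometric_mean(c) for s, c in reference_counts.items()}
    target = sum(per_sample.values()) / len(per_sample)
    return {s: target / g for s, g in per_sample.items()}


def normalize(counts, factor):
    """Apply one sample's scaling factor to all of its counts."""
    return [c * factor for c in counts]


reference = {"sample_a": [100.0, 400.0], "sample_b": [50.0, 200.0]}  # made-up counts
factors = scaling_factors(reference)
normalized_a = normalize(reference["sample_a"], factors["sample_a"])
normalized_b = normalize(reference["sample_b"], factors["sample_b"])
```

After scaling, both samples have the same reference-gene geometric mean, so remaining differences in other genes are less likely to reflect input amount alone.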

7.
Ann Transl Med ; 9(7): 528, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33987226

ABSTRACT

BACKGROUND: DNA double-strand breaks can be counted as discrete foci by imaging techniques. In personalized medicine and pharmacology, the analysis of counting data is relevant for numerous applications, e.g., for cancer and aging research and the evaluation of drug efficacy. By default, such data are assumed to follow the Poisson distribution. This assumption, however, may lead to biased results and faulty conclusions in datasets with excess zero values (zero-inflation), a variance larger than the mean (overdispersion), or both. In such cases, the assumption of a Poisson distribution would skew the estimation of mean and variance, and other models like the negative binomial (NB), zero-inflated Poisson or zero-inflated NB distributions should be employed. The model chosen influences the parameter estimation (mean value and confidence interval). Yet the choice of a suitable distribution model is not trivial. METHODS: To support, simplify and objectify this process, we have developed the countfitteR software as an R package. We used a Bayesian approach for distribution model selection and the shiny web application framework for interactive data analysis. RESULTS: We show the application of our software based on examples of DNA double-strand break count data from phenotypic imaging by multiplex fluorescence microscopy. In analyzing numerous datasets of molecular pharmacological markers (phosphorylated histone H2AX and p53 binding protein), countfitteR demonstrated an equal or superior statistical performance compared to the usually employed two-step procedure, with an overall power of up to 98%. In addition, it still gave information in cases where the two-step procedure produced no result at all. In our data sample we found that the NB distribution was the most frequent, with the Poisson distribution taking second place. CONCLUSIONS: countfitteR can perform an automated distribution model selection and thus support the data analysis and lead to objective, statistically verifiable estimates. Originally designed for the analysis of foci in biomedical image data, countfitteR can be used in a variety of areas where non-Poisson distributed counting data is prevalent.
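
The two departures from the Poisson assumption described above can be checked with simple descriptive statistics before any model is fitted. The sketch below computes the variance-to-mean ratio (values well above 1 suggest overdispersion) and compares the observed share of zeros with the share a Poisson distribution with the same mean would predict. It is not countfitteR's Bayesian model selection; the function name `count_diagnostics` and the foci counts are assumptions made for illustration.

```python
import math
import statistics


def count_diagnostics(counts):
    """Descriptive checks for overdispersion and excess zeros in count data.

    Hypothetical helper; it does not select or fit a distribution model.
    """
    mean = statistics.fmean(counts)
    variance = statistics.variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": mean,
        "variance": variance,
        # Under a Poisson model the variance equals the mean, so this is near 1.
        "dispersion_index": variance / mean if mean > 0 else float("nan"),
        "observed_zero_fraction": counts.count(0) / len(counts),
        # P(X = 0) for a Poisson distribution with the same mean.
        "poisson_zero_fraction": math.exp(-mean),
    }


foci_per_cell = [0, 0, 0, 0, 1, 2, 0, 7, 0, 10]  # made-up counts
diagnostics = count_diagnostics(foci_per_cell)
for name, value in diagnostics.items():
    print(f"{name}: {value:.3f}")
```

For these made-up counts the variance is several times the mean and zeros are far more frequent than a Poisson model predicts, which is the situation in which negative binomial or zero-inflated models would be considered.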

8.
Sci Rep ; 11(1): 8934, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33903613

ABSTRACT

Several disorders are related to amyloid aggregation of proteins, for example Alzheimer's or Parkinson's diseases. Amyloid proteins form fibrils of aggregated beta structures. This is preceded by the formation of oligomers, the most cytotoxic species. Determining amyloidogenicity is tedious and costly. The most reliable identification of amyloids is obtained with high-resolution microscopies, such as electron microscopy or atomic force microscopy (AFM). More frequently, less expensive and faster methods are used, especially infrared (IR) spectroscopy or Thioflavin T staining. Different experimental methods are not always concordant, especially when amyloid peptides do not readily form fibrils but oligomers. This may lead to peptide misclassification and mislabeling. Several bioinformatics methods have been proposed for in-silico identification of amyloids, many of them based on machine learning. The effectiveness of these methods heavily depends on accurate annotation of the reference training data obtained from in-vitro experiments. We study how robust bioinformatics methods are to weak supervision, that is, to imperfect training data. AmyloGram and three other amyloid predictors were applied. The results showed that the bioinformatics tools can eliminate a certain degree of misannotation in the reference data, even when the misannotated entries belonged to their training set. The computational results are supported by new experiments with IR and AFM methods.


Subject(s)
Amyloid; Computational Biology; Computer Simulation; Peptides; Protein Aggregates/genetics; Amyloid/chemistry; Amyloid/genetics; Humans; Microscopy, Atomic Force; Peptides/chemistry; Peptides/genetics; Spectrophotometry, Infrared
9.
Environ Microbiol Rep ; 10(3): 378-382, 2018 06.
Article in English | MEDLINE | ID: mdl-29624889

ABSTRACT

Extensive metagenomics analyses have already revealed the vast biodiversity of the microbial world and how little is known about it. Our rudimentary knowledge of microbes stems from difficulties concerning their isolation and culture in laboratory conditions, which is necessary for describing their phenotype, among other things, for biotechnological purposes. Methanogens, archaea that produce methane, a potent greenhouse gas, are an important component of these understudied ecosystems. Therefore, we created PhyMet2, the first database that combines descriptions of methanogens and their culturing conditions with genetic information. The database contains a set of utilities that facilitate interactive data browsing, data comparison, phylogeny exploration and searching for sequence homologues. The most distinctive feature of the database is the web server MethanoGram, which can be used to significantly reduce the time and cost of searching for the optimal culturing conditions of methanogens by predicting them based on 16S rRNA sequences. The database will aid many researchers in exploring the world of methanogens and their applications in biotechnological processes. PhyMet2 with the MethanoGram predictor is available at http://metanogen.biotech.uni.wroc.pl.


Subject(s)
Databases, Factual; Euryarchaeota; Methane/metabolism; Euryarchaeota/classification; Euryarchaeota/physiology; Metagenomics; Phylogeny; RNA, Ribosomal, 16S/genetics