Results 1 - 20 of 1,043
2.
EFSA J ; 22(8): e8912, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39135845

ABSTRACT

Microorganisms, genetically modified or not, may be used in the food chain either as active agents, biomasses or as production organisms of substances of interest. The placement of such microorganisms or their derived substances/products in the European market may be subject to a premarket authorisation process. The authorisation process requires a risk assessment in order to establish the safety and/or the efficacy of the microorganism(s) when used in the food chain as such, as biomasses or as production strains. This includes a full molecular characterisation of the microorganism(s) under assessment. For certain regulated products, the use of whole genome sequence (WGS) data of the microorganism is established as a requirement for the risk assessment. In this regard, data obtained from WGS analysis can provide information on the unambiguous taxonomic identification of the strains, on the presence of genes of concern (e.g. those encoding virulence factors, resistance to antimicrobials of clinical relevance for humans and animals, production of harmful metabolites or of clinically relevant antimicrobials) and on the characterisation of genetic modification(s) (where relevant). This document provides recommendations to applicants on how to describe and report the results of WGS analyses in the context of an application for market authorisation of a regulated product. Indications are given on how to perform genome sequencing and the quality criteria/thresholds that should be reached, as well as the data and relevant information that need to be reported, if required. This updated document replaces the EFSA 2021 Statement and reflects the current knowledge in technologies and methodologies to be used to generate and analyse WGS data for the risk assessment of microorganisms.

3.
J Chem Inf Model ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39136669

ABSTRACT

Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.

4.
Anal Biochem ; 694: 115637, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39121938

ABSTRACT

Accurate identification of protein-peptide binding residues is essential for understanding protein-peptide interactions and advancing drug discovery. To address this problem, extensive research efforts have been made to design more discriminative feature representations. However, extracting these explicit features usually depends on third-party tools, resulting in low computational efficiency and low predictive performance. In this study, we design an end-to-end deep learning-based method, E2EPep, for protein-peptide binding residue prediction using protein sequence only. E2EPep first employs and fine-tunes two state-of-the-art pre-trained protein language models that can extract two different high-latent feature representations from protein sequences relevant for protein structures and functions. A novel feature fusion module is then designed in E2EPep to fuse and optimize the above two feature representations of binding residues. In addition, we have also designed E2EPep+, which integrates the E2EPep and PepBCL models, to improve the prediction performance. Experimental results on two independent testing data sets demonstrate that E2EPep and E2EPep+ achieve average AUC values of 0.846 and 0.842, respectively, while achieving an average Matthews correlation coefficient value that is significantly higher than that of most existing sequence-based methods and comparable to that of the state-of-the-art structure-based predictors. Detailed data analysis shows that the primary strength of E2EPep lies in the effectiveness of its feature representation, which uses a cross-attention mechanism to fuse the embeddings generated by two fine-tuned protein language models. The standalone package of E2EPep and E2EPep+ can be obtained at https://github.com/ckx259/E2EPep.git for academic use only.
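
The abstract above reports a Matthews correlation coefficient (MCC) for per-residue binding predictions. As a minimal sketch of how MCC is computed from binary residue labels, assuming 1 marks a binding residue and 0 a non-binding one (the function name and the label vectors are illustrative and are not taken from the E2EPep package or its data sets):

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Return the MCC for two equal-length lists of binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up labels for an eight-residue toy protein (not real data).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
mcc = matthews_corrcoef(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

Binding residues are usually a small minority of all residues, which is one reason MCC is commonly reported alongside AUC: it uses all four cells of the confusion matrix.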


Subject(s)
Peptides; Protein Binding; Proteins; Proteins/chemistry; Proteins/metabolism; Peptides/chemistry; Peptides/metabolism; Deep Learning; Binding Sites; Databases, Protein; Computational Biology/methods
6.
Microorganisms ; 12(7)2024 Jun 21.
Article in English | MEDLINE | ID: mdl-39065022

ABSTRACT

Although cases of Legionnaires' disease are notifiable, data on the phenotypic and genotypic characterisation of clinical isolates are limited. This retrospective study aims to report the results of the characterisation of Legionella clinical isolates in Spain from 2012 to 2022. Monoclonal antibodies from the Dresden panel were used for phenotypic identification of Legionella pneumophila. Genotypic characterisation and sequence type assignment were performed using the Sequence-Based Typing (SBT) scheme. Of the 1184 samples, 569 were identified as Legionella by culture. Of these, 561 were identified as L. pneumophila, of which 521 were serogroup 1. The most common subgroups were Philadelphia (n = 107) and Knoxville (n = 106). The SBT analysis revealed 130 different sequence types (STs), with the most common genotypes being ST1 (n = 87), ST23 (n = 57), ST20 (n = 30), and ST42 (n = 29). Knoxville has the highest variability, with 32 different STs. ST23 is mainly found in Allentown/France (n = 46) and ST42 in Benidorm (n = 18), whereas ST1 is widely distributed. The results demonstrate that clinical isolates show high genetic diversity, although only a few STs are responsible for most cases. However, outbreaks can also occur with rare genotypes. More data on Legionnaires' disease and associated epidemiological studies are needed to establish the risk of an isolate causing an outbreak in the future.

7.
Biotechnol Lett ; 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39017763

ABSTRACT

Pentachlorophenol (PCP) was once used as a pesticide, germicide, and preservative due to its stable properties and resistance to degradation. This study aimed to design a biosensor for the quantitative and prompt detection of PCP. A cell-free fluorescence biosensor was developed using NalC, an allosteric transcription factor responsive to PCP, together with in vitro transcription. By adding a DNA template and PCP and employing an electrophoretic mobility shift assay while monitoring the dynamic fluorescence changes in RNA, this study offers evidence of NalC's potential applicability in sensor systems developed for the specific detection of PCP. The biosensor showed the capability for the quantitative detection of PCP, with a limit of detection (LOD) of 0.21 µM. Following the addition of nucleic acid sequence-based amplification, the fluorescence intensity of RNA revealed an excellent linear relationship with the concentration of PCP, showing a correlation coefficient (R²) of 0.9595. The final LOD was determined to be 0.002 µM. This study has successfully translated the determination of PCP into a fluorescent RNA output, thereby presenting a novel approach for detecting PCP within environmental settings.
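
The linear relationship between fluorescence and PCP concentration described above is the kind of result usually obtained from a least-squares calibration line. The sketch below shows one way to fit such a line and compute R²; the function name and the calibration points are hypothetical and are not the measurements from the study:

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical calibration series (NOT data from the paper).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]      # PCP concentration, µM
signal = [1.0, 3.1, 4.9, 7.2, 8.8]    # fluorescence, arbitrary units
slope, intercept, r2 = linear_fit_r2(conc, signal)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.4f}")
```

The abstract does not state how the LOD was derived from the calibration, so that step is left out here.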

8.
Methods Mol Biol ; 2780: 327-343, 2024.
Article in English | MEDLINE | ID: mdl-38987476

ABSTRACT

The chapter emphasizes the importance of understanding protein-protein interactions in cellular mechanisms and highlights the role of computational modeling in predicting these interactions. It discusses sequence-based approaches such as evolutionary trace (ET), correlated mutation analysis (CMA), and subtractive correlated mutation (SCM) for identifying crucial amino acid residues, considering interface conservation or evolutionary changes. The chapter also explores methods like differential ET, hidden-site class model, and spatial cluster detection (SCD) for interface specificity and spatial clustering. Furthermore, it examines approaches combining structural and sequential methodologies and evaluates modeled predictions through initiatives like critical assessment of prediction of interactions (CAPRI). Additionally, the chapter provides an overview of various software programs used for molecular docking, detailing their search, sampling, refinement and scoring stages, along with innovative techniques and tools like normal mode analysis (NMA) and adaptive Poisson-Boltzmann solver (APBS) for electrostatic calculations. These computational and experimental approaches are crucial for unraveling protein-protein interactions and aid in developing potential therapeutics for various diseases.


Subject(s)
Computational Biology; Molecular Docking Simulation; Protein Binding; Proteins; Software; Computational Biology/methods; Proteins/metabolism; Proteins/chemistry; Protein Interaction Mapping/methods; Humans; Mutation; Algorithms; Protein Conformation
9.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39003530

ABSTRACT

Protein function prediction is critical for understanding the cellular physiological and biochemical processes, and it opens up new possibilities for advancements in fields such as disease research and drug discovery. During the past decades, with the exponential growth of protein sequence data, many computational methods for predicting protein function have been proposed. Therefore, a systematic review and comparison of these methods are necessary. In this study, we divide these methods into four different categories, including sequence-based methods, 3D structure-based methods, PPI network-based methods and hybrid information-based methods. Furthermore, their advantages and disadvantages are discussed, and then their performance is comprehensively evaluated and compared. Finally, we discuss the challenges and opportunities present in this field.


Subject(s)
Computational Biology; Proteins; Proteins/chemistry; Proteins/metabolism; Computational Biology/methods; Humans; Sequence Analysis, Protein/methods; Algorithms
10.
HLA ; 104(1): e15629, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39073238

ABSTRACT

HLA-C*02:246 has one nucleotide change from HLA-C*02:02:02:01 at nucleotide 523, changing arginine to cysteine at residue 151.
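
The position arithmetic in this abstract can be checked directly. The sketch below assumes the usual HLA class I convention: coding nucleotides are numbered from the first base of the start codon, while residue numbers refer to the mature protein after a 24-codon signal peptide. Under that assumption, nucleotide 523 is the first base of mature codon 151. The abstract does not give the actual reference codon, so the arginine codons listed are only candidates, and the function names are illustrative:

```python
SIGNAL_PEPTIDE_CODONS = 24  # assumption: HLA class I leader length in codons

ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]  # arginine
CYS_CODONS = {"TGT", "TGC"}                               # cysteine


def mature_codon(nt_pos, leader=SIGNAL_PEPTIDE_CODONS):
    """Map a 1-based coding nucleotide position to
    (mature-protein codon number, 1-based position within that codon).
    Intended for positions downstream of the leader."""
    codon_from_start = (nt_pos - 1) // 3 + 1
    pos_in_codon = (nt_pos - 1) % 3 + 1
    return codon_from_start - leader, pos_in_codon


def single_base_arg_to_cys(pos_in_codon):
    """List (arg_codon, cys_codon) pairs that differ only at pos_in_codon."""
    i = pos_in_codon - 1
    hits = []
    for codon in ARG_CODONS:
        for base in "ACGT":
            if base == codon[i]:
                continue
            mutant = codon[:i] + base + codon[i + 1:]
            if mutant in CYS_CODONS:
                hits.append((codon, mutant))
    return hits


codon, pos = mature_codon(523)
changes = single_base_arg_to_cys(pos)
print(codon, pos, changes)
```

Under these assumptions, the only first-position changes that turn arginine into cysteine are C to T substitutions in CGT or CGC.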


Subject(s)
Alleles; Base Sequence; Exons; HLA-C Antigens; Histocompatibility Testing; Humans; HLA-C Antigens/genetics; Sequence Analysis, DNA/methods; Sequence Alignment; Amino Acid Substitution; Codon
11.
HLA ; 103(6): e15562, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38887867

ABSTRACT

Two nucleotide substitutions in codon 152 of HLA-C*08:01:01:01 result in a novel allele HLA-C*08:66.


Subject(s)
Exons; HLA-C Antigens; Histocompatibility Testing; Humans; Alleles; Base Sequence; Codon; Histocompatibility Testing/methods; HLA-C Antigens/genetics; Sequence Alignment; Sequence Analysis, DNA/methods; Taiwan
12.
HLA ; 103(6): e15546, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38887907

ABSTRACT

A nucleotide deletion in the residue 371 of HLA-A*11:01:01:01 results in a novel allele HLA-A*11:466N.


Subject(s)
Exons; HLA-A11 Antigen; Histocompatibility Testing; Humans; Alleles; Base Sequence; Codon; HLA-A11 Antigen/genetics; Sequence Alignment; Sequence Analysis, DNA; Sequence Deletion; Taiwan
13.
HLA ; 103(6): e15551, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38837672

ABSTRACT

One nucleotide substitution in codon 130 of HLA-DQB1*03:03:02:01 results in a novel allele HLA-DQB1*03:96.


Subject(s)
Alleles; Codon; Exons; HLA-DQ beta-Chains; Histocompatibility Testing; Humans; HLA-DQ beta-Chains/genetics; Taiwan; Base Sequence; Asian People/genetics; Sequence Analysis, DNA/methods; Polymorphism, Single Nucleotide
14.
HLA ; 103(6): e15578, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38923289
16.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701416

ABSTRACT

Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model, DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.


Subject(s)
Algorithms; Computational Biology; Neural Networks, Computer; Protein Structure, Secondary; Proteins; Proteins/chemistry; Proteins/metabolism; Proteins/genetics; Computational Biology/methods; Databases, Protein; Gene Ontology; Sequence Analysis, Protein/methods; Software
17.
Interdiscip Sci ; 16(2): 503-518, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38733473

ABSTRACT

Cancer remains a severe illness, and current research indicates that tumor homing peptides (THPs) play an important part in cancer therapy. The identification of THPs can provide crucial insights for the drug discovery and pharmaceutical industries, as they allow for tailored medication delivery towards cancer cells. These peptides have a high affinity for particular receptors present on tumor surfaces, allowing for the creation of precision medications that reduce off-target consequences and enhance cancer patient treatment results. Wet-lab techniques are considered essential tools for studying THPs; however, they are labor-intensive and time-consuming, making the prediction of THPs a challenging task for researchers. Computational techniques, on the other hand, are considered significant tools for identifying THPs from sequence data. Although many strategies have been presented to predict new THPs, there is still a need to develop a robust method with higher rates of success. In this paper, we developed a novel framework, THP-DF, for accurately identifying THPs on a large scale. Firstly, the peptide sequences are encoded through various sequential features. Secondly, each feature is passed to BiLSTM and attention layers to extract simplified deep features. Finally, an ensemble framework is formed by integrating sequential and deep features, which are fed to a support vector machine, with 10-fold cross-validation used to validate its efficiency. The experimental results showed that THP-DF worked better on both the [Formula: see text] and [Formula: see text] datasets, achieving an accuracy of > 95%, which is higher than that of existing predictors on both datasets. This indicates that the proposed predictor could be a beneficial tool to precisely and rapidly identify THPs and will contribute to cutting-edge cancer treatment strategies and pharmaceuticals.
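
Two steps of the workflow described above, sequence encoding and 10-fold cross-validation, can be sketched in plain Python. The abstract does not say which sequential features THP-DF uses, so amino acid composition is assumed here as one common choice. The peptide string and function names are illustrative, and the BiLSTM, attention and support vector machine stages are omitted:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide):
    """Encode a peptide as 20 residue frequencies (one assumed feature type)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k folds.
    Every sample lands in exactly one test fold."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Toy peptide, purely for illustration (not from the THP-DF datasets).
features = amino_acid_composition("AACK")

# 25 hypothetical samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
print(len(features), len(splits))
```

In a full pipeline a classifier would be trained on each `train` index set and scored on the matching `test` set, with accuracy averaged over the ten folds. Real datasets are normally shuffled, and often stratified by class, before splitting.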


Subject(s)
Computational Biology; Neoplasms; Peptides; Support Vector Machine; Peptides/chemistry; Humans; Computational Biology/methods; Algorithms
18.
19.
Comput Biol Med ; 176: 108543, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38744015

ABSTRACT

Proteins play a vital role in various biological processes and achieve their functions through protein-protein interactions (PPIs). Thus, accurate identification of PPI sites is essential. Traditional biological methods for identifying PPIs are costly, labor-intensive, and time-consuming. The development of computational prediction methods for PPI sites offers promising alternatives. Most known deep learning (DL) methods employ layer-wise multi-scale CNNs to extract features from protein sequences. However, these methods usually neglect the spatial positions and hierarchical information embedded within protein sequences, which are crucial for PPI site prediction. In this paper, we propose MR2CPPIS, a novel sequence-based DL model that utilizes the multi-scale Res2Net with a coordinate attention mechanism to exploit multi-scale features and enhance PPI site prediction capability. We leverage the multi-scale Res2Net to expand the receptive field for each network layer, thus capturing multi-scale information of protein sequences at a granular level. To further explore the local contextual features of each target residue, we employ a coordinate attention block to characterize the precise spatial position information, enabling the network to effectively extract long-range dependencies. We evaluate MR2CPPIS on three public benchmark datasets (Dset_72, Dset_186, and PDBset_164), achieving state-of-the-art performance. The source code is available at https://github.com/YyinGong/MR2CPPIS.


Subject(s)
Deep Learning; Proteins/metabolism; Proteins/chemistry; Protein Interaction Mapping/methods; Computational Biology/methods; Humans; Databases, Protein
20.
Biotechnol Adv ; 73: 108376, 2024.
Article in English | MEDLINE | ID: mdl-38740355

ABSTRACT

Enzymes play a pivotal role in various industries by enabling efficient, eco-friendly, and sustainable chemical processes. However, the low turnover rates and poor substrate selectivity of enzymes limit their large-scale applications. Rational computational enzyme design, facilitated by computational algorithms, offers a more targeted and less labor-intensive approach. Although there has been notable advancement in employing rational computational protein engineering strategies to overcome these issues, this progress has not been comprehensively reviewed so far. This article reviews recent developments in rational computational enzyme design, categorizing them into three types: structure-based, sequence-based, and data-driven machine learning computational design. Case studies are presented to demonstrate successful enhancements in catalytic activity, stability, and substrate selectivity. Lastly, the article provides a thorough analysis of these approaches, highlights existing challenges and potential solutions, and offers insights into future development directions.


Subject(s)
Enzymes; Protein Engineering; Protein Engineering/methods; Enzymes/chemistry; Enzymes/metabolism; Computational Biology/methods; Machine Learning; Substrate Specificity; Algorithms; Models, Molecular