Results 1 - 11 of 11
1.
Nucleic Acids Res ; 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587188

ABSTRACT

DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies the input proteins into the membrane protein types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based protein language models, the server utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic protein sequences, constructed following stringent homology partitioning, demonstrate state-of-the-art performance. Notably, DeepLoc 2.1 outperforms existing models, with the larger ProtT5 model exhibiting a marginal advantage over the ESM-1B model. The web server is available at https://services.healthtech.dtu.dk/services/DeepLoc-2.1.

2.
NAR Genom Bioinform ; 5(4): lqad088, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37850036

ABSTRACT

When splitting biological sequence data for the development and testing of predictive models, it is necessary to avoid too-closely related pairs of sequences ending up in different partitions. If this is ignored, performance of prediction methods will tend to be overestimated. Several algorithms have been proposed for homology reduction, where sequences are removed until no too-closely related pairs remain. We present GraphPart, an algorithm for homology partitioning that divides the data such that closely related sequences always end up in the same partition, while keeping as many sequences as possible in the dataset. Evaluation of GraphPart on Protein, DNA and RNA datasets shows that it is capable of retaining a larger number of sequences per dataset, while providing homology separation on a par with reduction approaches.
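The constraint described in this abstract (closely related sequences must land in the same partition) can be illustrated with a minimal sketch. This is not the GraphPart algorithm itself: it is plain single-linkage clustering with union-find followed by greedy size balancing. The `identity` function is a toy positional match rate standing in for a real alignment tool, and the function names, threshold, and sequences are all hypothetical.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Toy similarity (assumption): matching positions over the longer length.
    A real pipeline would compute identity from an alignment instead."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def partition_by_homology(seqs: dict, threshold: float, n_parts: int) -> dict:
    """Map each sequence id to a partition index so that no pair with
    identity >= threshold is split across partitions."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Link every pair that is too closely related (single linkage).
    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    # Collect the resulting clusters (connected components).
    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)

    # Greedy balancing: largest cluster first, into the smallest partition so far.
    sizes = [0] * n_parts
    assignment = {}
    for members in sorted(clusters.values(), key=len, reverse=True):
        target = sizes.index(min(sizes))
        for name in members:
            assignment[name] = target
        sizes[target] += len(members)
    return assignment


# Made-up example sequences, for illustration only.
toy = {
    "a": "MKTAYIAK", "b": "MKTAYIAR",
    "c": "GGGGSSSS", "d": "GGGGSSST",
    "e": "PLPLPLPL",
}
print(partition_by_homology(toy, threshold=0.8, n_parts=2))
```

Because whole clusters are kept together, no sequence is discarded here, but partition sizes can become unbalanced when one cluster is very large; handling that trade-off is where a dedicated method would differ from this sketch.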

3.
Nat Commun ; 14(1): 5624, 2023 09 12.
Article in English | MEDLINE | ID: mdl-37699890

ABSTRACT

The heterogeneity of the SARS-CoV-2 immune responses has become considerably more complex over time and diverse immune imprinting is observed in vaccinated individuals. Despite vaccination, following the emergence of the Omicron variant, some individuals appear more susceptible to primary infections and reinfections than others, underscoring the need to elucidate how immune responses are influenced by previous infections and vaccination. IgG, IgA, neutralizing antibodies and T-cell immune responses in 1,325 individuals (955 of whom were infection-naive) were investigated before and after three doses of the BNT162b2 vaccine, examining their relation to breakthrough infections and immune imprinting in the context of Omicron. Our study shows that both humoral and cellular responses following vaccination were generally higher after SARS-CoV-2 infection compared to infection-naive individuals. Notably, viral exposure before vaccination was crucial to achieving a robust IgA response. Individuals with lower IgG, IgA, and neutralizing antibody responses postvaccination had a significantly higher risk of reinfection and future Omicron infections. This was not observed for T-cell responses. A primary infection before Omicron and subsequent reinfection with Omicron dampened the humoral and cellular responses compared to a primary Omicron infection, consistent with immune imprinting. These results underscore the significant impact of hybrid immunity for immune responses in general, particularly for IgA responses even after revaccination, and the importance of robust humoral responses in preventing future infections.


Subject(s)
Breakthrough Infections , COVID-19 , Humans , Reinfection , BNT162 Vaccine , SARS-CoV-2 , COVID-19/prevention & control , Vaccination , Antibodies, Neutralizing , Immunity , Immunoglobulin A , Immunoglobulin G
4.
Nucleic Acids Res ; 50(W1): W228-W234, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35489069

ABSTRACT

The prediction of protein subcellular localization is of great relevance for proteomics research. Here, we propose an update to the popular tool DeepLoc with multi-localization prediction and improvements in both performance and interpretability. For training and validation, we curate eukaryotic and human multi-location protein datasets with stringent homology partitioning and enriched with sorting signal information compiled from the literature. We achieve state-of-the-art performance in DeepLoc 2.0 by using a pre-trained protein language model. It has the further advantage that it uses sequence input rather than relying on slower protein profiles. We provide two means of better interpretability: an attention output along the sequence and highly accurate prediction of nine different types of protein sorting signals. We find that the attention output correlates well with the position of sorting signals. The webserver is available at services.healthtech.dtu.dk/service.php?DeepLoc-2.0.


Subject(s)
Protein Sorting Signals , Proteins , Humans , Proteins/metabolism , Eukaryota/metabolism , Protein Transport , Language , Databases, Protein , Computational Biology , Subcellular Fractions/metabolism
5.
Front Immunol ; 13: 832501, 2022.
Article in English | MEDLINE | ID: mdl-35281023

ABSTRACT

Background: Previous studies have indicated inferior responses to severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) vaccination in solid organ transplant (SOT) recipients. We examined the development of anti-receptor-binding domain (RBD) immunoglobulin G (IgG) after two doses of BNT162b2 in SOT recipients 6 months after vaccination and compared it to that of immunocompetent controls. Methods: We measured anti-RBD IgG after two doses of BNT162b2 in 200 SOT recipients and 200 matched healthy controls up to 6 months after first vaccination. Anti-RBD IgG concentration and neutralizing capacity of antibodies were measured at first and second doses of BNT162b2 and 2 and 6 months after the first dose. T-cell responses were measured 6 months after the first dose. Results: In SOT recipients, geometric mean concentration (GMC) of anti-RBD IgG increased from first to second dose (1.14 AU/ml, 95% CI 1.08-1.24 to 11.97 AU/ml, 95% CI 7.73-18.77) and from second dose to 2 months (249.29 AU/ml, 95% CI 153.70-385.19). Six months after the first vaccine, anti-RBD IgG declined (55.85 AU/ml, 95% CI 36.95-83.33). At all time points, anti-RBD IgG was lower in SOT recipients than that in controls. Fewer SOT recipients than controls had a cellular response (13.1% vs. 59.4%, p < 0.001). Risk factors associated with humoral non-response included age [relative risk (RR) 1.23 per 10-year increase, 95% CI 1.11-1.35, p < 0.001], being within 1 year from transplantation (RR 1.55, 95% CI 1.30-1.85, p < 0.001), treatment with mycophenolate (RR 1.54, 95% CI 1.09-2.18, p = 0.015), treatment with corticosteroids (RR 1.45, 95% CI 1.10-1.90, p = 0.009), kidney transplantation (RR 1.70, 95% CI 1.25-2.30, p = 0.001), lung transplantation (RR 1.63, 95% CI 1.16-2.29, p = 0.005), and de novo non-skin cancer comorbidity (RR 1.52, 95% CI 1.26-1.82, p < 0.001).
Conclusion: Immune responses to BNT162b2 are inferior in SOT recipients compared to healthy controls, and studies aiming to determine the clinical impact of inferior vaccine responses are warranted.


Subject(s)
BNT162 Vaccine/immunology , COVID-19/immunology , Organ Transplantation , SARS-CoV-2/physiology , Transplant Recipients , Adult , Antibodies, Neutralizing/blood , Antibodies, Viral/blood , Cohort Studies , Healthy Volunteers , Humans , Male , Prospective Studies , Vaccination
6.
Nat Biotechnol ; 40(7): 1023-1025, 2022 07.
Article in English | MEDLINE | ID: mdl-34980915

ABSTRACT

Signal peptides (SPs) are short amino acid sequences that control protein secretion and translocation in all living organisms. SPs can be predicted from sequence data, but existing algorithms are unable to detect all known types of SPs. We introduce SignalP 6.0, a machine learning model that detects all five SP types and is applicable to metagenomic data.


Subject(s)
Language , Protein Sorting Signals , Algorithms , Amino Acid Sequence , Protein Sorting Signals/genetics , Proteins
7.
Life Sci Alliance ; 2(5)2019 10.
Article in English | MEDLINE | ID: mdl-31570514

ABSTRACT

In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.


Subject(s)
Computational Biology/methods , Peptides/analysis , Peptides/genetics , Amino Acid Sequence , Chloroplasts/genetics , Chloroplasts/metabolism , Deep Learning , Fungi/genetics , Fungi/metabolism , Methionine/metabolism , Protein Sorting Signals , Thylakoids/genetics , Thylakoids/metabolism
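The position-2 observation in the TargetP 2.0 abstract comes down to a simple frequency count, which a short sketch can make concrete. The helper name `fraction_with_residue` and the example sequences below are hypothetical; the sequences are made up so that the two groups mirror the proportions quoted in the abstract (two-thirds versus 20%), and they are not real data.

```python
def fraction_with_residue(seqs, residues, position=2):
    """Fraction of sequences whose residue at the 1-based `position`
    is in `residues`. Sequences shorter than `position` are skipped."""
    eligible = [s for s in seqs if len(s) >= position]
    if not eligible:
        return 0.0
    hits = sum(s[position - 1] in residues for s in eligible)
    return hits / len(eligible)


# Made-up toy sequences (assumption), not taken from any dataset.
transit_like = ["MASSMLSS", "MAAATTTS", "MSTLQSSA"]
others = ["MKLVNRTE", "MGDKIESK", "MAEQLKRT", "MTPRSWQH", "MLNKVDES"]

print(fraction_with_residue(transit_like, {"A"}))
print(fraction_with_residue(others, {"A"}))
```

Passing a set of residues rather than a single letter lets the same function count any residue class at position 2, for example a user-supplied set of residues thought to permit N-terminal methionine removal.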
8.
Nat Biotechnol ; 37(4): 420-423, 2019 04.
Article in English | MEDLINE | ID: mdl-30778233

ABSTRACT

Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.


Subject(s)
Neural Networks, Computer , Protein Sorting Signals/genetics , Protein Sorting Signals/physiology , Algorithms , Amino Acid Sequence , Archaeal Proteins/classification , Archaeal Proteins/genetics , Archaeal Proteins/metabolism , Bacterial Proteins/classification , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Biotechnology , Computational Biology , Eukaryota/genetics , Eukaryota/metabolism , Sequence Analysis, Protein , Software
10.
Bioinformatics ; 33(21): 3387-3395, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036616

ABSTRACT

MOTIVATION: The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. RESULTS: Here, we present a prediction algorithm using deep neural networks to predict protein subcellular localization relying only on sequence information. At its core, the prediction model uses a recurrent neural network that processes the entire protein sequence and an attention mechanism identifying protein regions important for the subcellular localization. The model was trained and tested on a protein dataset extracted from one of the latest UniProt releases, in which experimentally annotated proteins follow more stringent criteria than previously. We demonstrate that our model achieves a good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information. AVAILABILITY AND IMPLEMENTATION: The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc. Example code is available at https://github.com/JJAlmagro/subcellular_localization. The dataset is available at http://www.cbs.dtu.dk/services/DeepLoc/data.php. CONTACT: jjalma@dtu.dk.


Subject(s)
Computational Biology/methods , Machine Learning , Protein Transport , Sequence Analysis, Protein/methods , Software , Eukaryota/metabolism , Eukaryotic Cells/metabolism , Models, Biological , Molecular Sequence Annotation/methods , Neural Networks, Computer
11.
Bioinformatics ; 33(22): 3685-3690, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-28961695

ABSTRACT

MOTIVATION: Deep neural network architectures such as convolutional and long short-term memory networks have become increasingly popular as machine learning tools in recent years. The availability of greater computational resources, more data, new algorithms for training deep models and easy-to-use libraries for implementation and training of neural networks are the drivers of this development. The use of deep learning has been especially successful in image recognition, and the development of tools, applications and code examples is in most cases centered within this field rather than within biology. RESULTS: Here, we aim to further the development of deep learning methods within biology by providing application examples and code templates that are ready to apply and adapt. Given such examples, we illustrate how architectures consisting of convolutional and long short-term memory neural networks can relatively easily be designed and trained to state-of-the-art performance on three biological sequence problems: prediction of subcellular localization, protein secondary structure and the binding of peptides to MHC Class II molecules. AVAILABILITY AND IMPLEMENTATION: All implementations and datasets are available online to the scientific community at https://github.com/vanessajurtz/lasagne4bio. CONTACT: skaaesonderby@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Machine Learning , Protein Structure, Secondary , Protein Transport , Sequence Analysis, Protein/methods , Computational Biology/methods , Neural Networks, Computer , Peptides/metabolism , Protein Binding