Results 1 - 20 of 209,692
1.
Sci Data ; 11(1): 458, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38710720

ABSTRACT

The advent of single-particle cryo-electron microscopy (cryo-EM) has brought forth a new era of structural biology, enabling the routine determination of large biological molecules and their complexes at atomic resolution. The high-resolution structures of biological macromolecules and their complexes significantly expedite biomedical research and drug discovery. However, automatically and accurately building atomic models from high-resolution cryo-EM density maps is still time-consuming and challenging when template-based models are unavailable. Artificial intelligence (AI) methods such as deep learning trained on a limited amount of labeled cryo-EM density maps generate inaccurate atomic models. To address this issue, we created a dataset called Cryo2StructData consisting of 7,600 preprocessed cryo-EM density maps whose voxels are labelled according to their corresponding known atomic structures for training and testing AI methods to build atomic models from cryo-EM density maps. Cryo2StructData is larger than existing, publicly available datasets for training AI methods to build atomic protein structures from cryo-EM density maps. We trained and tested deep learning models on Cryo2StructData to validate its quality, showing that it is ready to be used to train and test AI methods for building atomic models.


Subject(s)
Artificial Intelligence , Cryoelectron Microscopy , Proteins , Cryoelectron Microscopy/methods , Proteins/chemistry , Proteins/ultrastructure , Models, Molecular , Protein Conformation
2.
Methods Mol Biol ; 2804: 3-50, 2024.
Article in English | MEDLINE | ID: mdl-38753138

ABSTRACT

Self-powered microfluidics presents a revolutionary approach to address the challenges of healthcare in decentralized and point-of-care settings where limited access to resources and infrastructure prevails or rapid clinical decision-making is critical. These microfluidic systems exploit physical and chemical phenomena, such as capillary forces and surface tension, to manipulate tiny volumes of fluids without the need for external power sources, making them cost-effective and highly portable. Recent technological advancements have demonstrated the ability to preprogram complex multistep liquid operations within the microfluidic circuit of these standalone systems, which enabled the integration of sensitive detection and readout principles. This chapter first addresses how the accessibility to in vitro diagnostics can be improved by shifting toward decentralized approaches like remote microsampling and point-of-care testing. Next, the crucial role of self-powered microfluidic technologies to enable this patient-centric healthcare transition is emphasized using various state-of-the-art examples, with a primary focus on applications related to biofluid collection and the detection of either proteins or nucleic acids. This chapter concludes with a summary of the main findings and our vision of the future perspectives in the field of self-powered microfluidic technologies and their use for in vitro diagnostics applications.


Subject(s)
Microfluidic Analytical Techniques , Nucleic Acids , Point-of-Care Systems , Proteins , Humans , Lab-On-A-Chip Devices , Microfluidic Analytical Techniques/instrumentation , Microfluidic Analytical Techniques/methods , Microfluidics/methods , Microfluidics/instrumentation , Nucleic Acids/analysis , Point-of-Care Testing , Proteins/analysis
3.
Genome Biol Evol ; 16(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38735759

ABSTRACT

A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.


Subject(s)
INDEL Mutation , Protein Structure, Secondary , Humans , Animals , Mice , Rats , Evolution, Molecular , Proteins/genetics , Proteins/chemistry , Dogs , Selection, Genetic , Genome
4.
Mol Cell ; 84(9): 1802-1810.e4, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38701741

ABSTRACT

Polyphosphate (polyP) is a chain of inorganic phosphate that is present in all domains of life and affects diverse cellular phenomena, ranging from blood clotting to cancer. A study by Azevedo et al. described a protein modification whereby polyP is attached to lysine residues within polyacidic serine and lysine (PASK) motifs via what the authors claimed to be covalent phosphoramidate bonding. This was based largely on the remarkable ability of the modification to survive extreme denaturing conditions. Our study demonstrates that lysine polyphosphorylation is non-covalent, based on its sensitivity to ionic strength and lysine protonation and absence of phosphoramidate bond formation, as analyzed via 31P NMR. Ionic interaction with lysine residues alone is sufficient for polyP modification, and we present a new list of non-PASK lysine repeat proteins that undergo polyP modification. This work clarifies the biochemistry of polyP-lysine modification, with important implications for both studying and modulating this phenomenon. This Matters Arising paper is in response to Azevedo et al. (2015), published in Molecular Cell. See also the Matters Arising Response by Azevedo et al. (2024), published in this issue.


Subject(s)
Amides , Lysine , Phosphoric Acids , Polyphosphates , Lysine/metabolism , Lysine/chemistry , Polyphosphates/chemistry , Polyphosphates/metabolism , Phosphorylation , Humans , Protein Processing, Post-Translational , Proteins/chemistry , Proteins/metabolism , Proteins/genetics
5.
JCI Insight ; 9(10)2024 May 22.
Article in English | MEDLINE | ID: mdl-38775157

ABSTRACT

Redundant tumor microenvironment (TME) immunosuppressive mechanisms and epigenetic maintenance of terminal T cell exhaustion greatly hinder functional antitumor immune responses in chronic lymphocytic leukemia (CLL). Bromodomain and extraterminal (BET) proteins regulate key pathways contributing to CLL pathogenesis and TME interactions, including T cell function and differentiation. Herein, we report that blocking BET protein function alleviates immunosuppressive networks in the CLL TME and repairs inherent CLL T cell defects. The pan-BET inhibitor OPN-51107 reduced exhaustion-associated cell signatures resulting in improved T cell proliferation and effector function in the Eµ-TCL1 splenic TME. Following BET inhibition (BET-i), TME T cells coexpressed significantly fewer inhibitory receptors (IRs) (e.g., PD-1, CD160, CD244, LAG3, VISTA). Complementary results were witnessed in primary CLL cultures, wherein OPN-51107 exerted proinflammatory effects on T cells, regardless of leukemic cell burden. BET-i additionally promotes a progenitor T cell phenotype through reduced expression of transcription factors that maintain terminal differentiation and increased expression of TCF-1, at least in part through altered chromatin accessibility. Moreover, direct T cell effects of BET-i were unmatched by common targeted therapies in CLL. This study demonstrates the immunomodulatory action of BET-i on CLL T cells and supports the inclusion of BET inhibitors in the management of CLL to alleviate terminal T cell dysfunction and potentially enhance tumoricidal T cell activity.


Subject(s)
Leukemia, Lymphocytic, Chronic, B-Cell , T-Lymphocytes , Tumor Microenvironment , Leukemia, Lymphocytic, Chronic, B-Cell/immunology , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Tumor Microenvironment/immunology , Tumor Microenvironment/drug effects , Humans , Animals , Mice , T-Lymphocytes/immunology , T-Lymphocytes/drug effects , T-Lymphocytes/metabolism , Transcription Factors/metabolism , Transcription Factors/genetics , Hepatocyte Nuclear Factor 1-alpha/metabolism , Hepatocyte Nuclear Factor 1-alpha/genetics , Cell Proliferation/drug effects , Bromodomain Containing Proteins , Proteins
7.
J Sep Sci ; 47(9-10): e2400061, 2024 May.
Article in English | MEDLINE | ID: mdl-38726749

ABSTRACT

Determination of proteins from dried matrix spots using MS is an expanding research area. Mainly, the collected dried matrix sample is whole blood from a finger or heel prick, resulting in dried blood spots. However, as other matrices such as plasma, serum, urine, and tear fluid can also be collected in this way, the term dried matrix spot is used as an overarching term. In this review, the focus is on advancements in the field made from 2017 up to 2023. In the first part, reviews concerning the subject are discussed. After this, advancements made for clinical purposes are highlighted. Both targeted protein analyses, with and without the use of affinity extractions, and untargeted, global proteomic approaches are discussed. In the last part, methodological advancements are reviewed, as well as the possibility of integrating sample preparation steps during sample handling. The focus of this so-called smart sampling is on the incorporation of cell separation, proteolysis, and antibody-based affinity capture.


Subject(s)
Dried Blood Spot Testing , Mass Spectrometry , Proteins , Humans , Chromatography, Liquid , Proteins/analysis , Proteomics/methods , Specimen Handling , Liquid Chromatography-Mass Spectrometry
8.
Mikrochim Acta ; 191(6): 307, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38713296

ABSTRACT

An assay that integrates histidine-rich peptides (HisRPs) with high-affinity aptamers was developed enabling the specific and sensitive determination of the target lysozyme. The enzyme-like activity of HisRP is inhibited by its interaction with a target recognized by an aptamer. In the presence of the target, lysozyme molecules progressively assemble on the surface of HisRP in a concentration-dependent manner, resulting in the gradual suppression of enzyme-like activity. This inhibition of HisRP's enzyme-like activity can be visually observed through color changes in the reaction product or quantified using UV-visible absorption spectroscopy. Under optimal conditions, the proposed colorimetric assay for lysozyme had a detection limit as low as 1 nM and exhibited excellent selectivity against other nonspecific interferents. Furthermore, subsequent research validated the practical applicability of the developed colorimetric approach to saliva samples, indicating that the assay holds significant potential for the detection of lysozymes in samples derived from humans.


Subject(s)
Colorimetry , Muramidase , Saliva , Muramidase/analysis , Muramidase/chemistry , Muramidase/metabolism , Colorimetry/methods , Humans , Saliva/chemistry , Saliva/enzymology , Limit of Detection , Peptides/chemistry , Aptamers, Nucleotide/chemistry , Proteins/analysis , Biosensing Techniques/methods , Histidine/analysis , Histidine/chemistry
9.
Curr Protoc ; 4(5): e1047, 2024 May.
Article in English | MEDLINE | ID: mdl-38720559

ABSTRACT

Recent advancements in protein structure determination and especially in protein structure prediction techniques have led to the availability of vast amounts of macromolecular structures. However, the accessibility and integration of these structures into scientific workflows are hindered by the lack of standardization among publicly available data resources. To address this issue, we introduced the 3D-Beacons Network, a unified platform that aims to establish a standardized framework for accessing and displaying protein structure data. In this article, we highlight the importance of standardized approaches for accessing protein structure data and showcase the capabilities of 3D-Beacons. We describe four protocols for finding and accessing macromolecular structures from various specialist data resources via 3D-Beacons. First, we describe three scenarios for programmatically accessing and retrieving data using the 3D-Beacons API. Next, we show how to perform sequence-based searches to find structures from model providers. Then, we demonstrate how to search for structures and fetch them directly into a workflow using JalView. Finally, we outline the process of facilitating access to data from providers interested in contributing their structures to the 3D-Beacons Network. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Programmatic access to the 3D-Beacons API Basic Protocol 2: Sequence-based search using the 3D-Beacons API Basic Protocol 3: Accessing macromolecules from 3D-Beacons with JalView Basic Protocol 4: Enhancing data accessibility through 3D-Beacons.


Subject(s)
Protein Conformation , Proteins , Proteins/chemistry , Databases, Protein , Software
10.
AAPS J ; 26(3): 60, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730115

ABSTRACT

Subcutaneous (SC) administration of therapeutic proteins is perceived to pose higher risk of immunogenicity when compared with intravenous (IV) route of administration (RoA). However, systematic evaluations of clinical data to support this claim are lacking. This meta-analysis was conducted to compare the immunogenicity of the same therapeutic protein by IV and SC RoA. Anti-drug antibody (ADA) data and controlling variables for 7 therapeutic proteins administered by both IV and SC routes across 48 treatment groups were analyzed. RoA was the primary independent variable of interest while therapeutic protein, patient population, adjusted dose, and number of ADA samples were controlling variables. Analysis of variance was used to compare the ADA incidence between IV and SC RoA, while accounting for controlling variables and potential interactions. Subsequently, 10 additional therapeutic proteins with ADA data published for both IV and SC administration were added to the above 7 therapeutic proteins and were evaluated for ADA incidence. RoA had no statistically significant effect on ADA incidence for the initial dataset of 7 therapeutic proteins (p = 0.55). The only variable with a significant effect on ADA incidence was the therapeutic protein. None of the other controlling variables, including their interactions with RoA, was significant. When all data from the 17 therapeutic proteins were pooled, there was no statistically significant effect of RoA on ADA incidence (p = 0.81). In conclusion, there is no significant difference in ADA incidence between the IV and SC RoA, based on analysis of clinical ADA data from 17 therapeutic proteins.


Subject(s)
Administration, Intravenous , Humans , Injections, Subcutaneous , Antibodies/administration & dosage , Antibodies/immunology , Proteins/administration & dosage , Proteins/immunology
11.
Sci Rep ; 14(1): 10475, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38714683

ABSTRACT

To ensure that an external force can break the interaction between a protein and a ligand, a steered molecular dynamics simulation requires a harmonic restraint potential applied to the protein backbone. A usual practice is that all or a certain number of the protein's heavy atoms or Cα atoms are fixed, being restrained by a small force. The present study reveals that fixing either all heavy atoms or all Cα atoms is not a good approach, while fixing too few atoms sometimes cannot prevent the protein from rotating under the influence of the bulk water layer, and the pulled molecule may smack into the wall of the active site. We found that restraining the Cα atoms under certain conditions is more relevant. Thus, we propose an alternative solution in which only the Cα atoms of the protein at a distance larger than 1.2 nm from the ligand are restrained. A more flexible, but not too flexible, protein is expected to lead to a more natural release of the ligand.
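The selection rule described in this abstract (restrain only the Cα atoms farther than 1.2 nm from the ligand) can be illustrated with a minimal sketch. The abstract does not say how the protein-ligand distance is measured; the code below assumes the minimum distance from a Cα atom to any ligand atom. The function name `select_restrained_ca`, the `(atom_name, (x, y, z))` input layout, and the toy coordinates are all hypothetical and are not taken from the paper.

```python
import math

# Cutoff quoted in the abstract, in nanometres.
CUTOFF_NM = 1.2


def select_restrained_ca(protein_atoms, ligand_coords, cutoff_nm=CUTOFF_NM):
    """Return indices of C-alpha atoms farther than cutoff_nm from the ligand.

    protein_atoms: list of (atom_name, (x, y, z)) tuples, coordinates in nm.
    ligand_coords: list of (x, y, z) tuples for the ligand atoms, in nm.
    Assumption: distance means the minimum distance to any ligand atom.
    """
    if not ligand_coords:
        raise ValueError("ligand_coords must contain at least one atom")
    selected = []
    for index, (name, coord) in enumerate(protein_atoms):
        if name != "CA":
            continue  # only C-alpha atoms are candidates for restraint
        nearest = min(math.dist(coord, lig) for lig in ligand_coords)
        if nearest > cutoff_nm:
            selected.append(index)
    return selected


# Toy system: one backbone N, one nearby C-alpha, one distant C-alpha.
toy_protein = [
    ("N", (0.0, 0.0, 0.0)),
    ("CA", (0.5, 0.0, 0.0)),
    ("CA", (2.0, 0.0, 0.0)),
]
toy_ligand = [(0.0, 0.0, 0.0)]
restrained = select_restrained_ca(toy_protein, toy_ligand)
```

In a real workflow the returned indices would be passed to the simulation package's position-restraint mechanism; that step is package-specific and is not shown here.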


Subject(s)
Molecular Dynamics Simulation , Protein Binding , Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Protein Conformation
12.
Protein Sci ; 33(6): e5001, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38723111

ABSTRACT

De novo protein design expands the protein universe by creating new sequences to accomplish tailor-made enzymes in the future. A promising topology to implement diverse enzyme functions is the ubiquitous TIM-barrel fold. Since the initial de novo design of an idealized four-fold symmetric TIM barrel, the family of de novo TIM barrels is expanding rapidly. Despite this and in contrast to natural TIM barrels, these novel proteins lack cavities and structural elements essential for the incorporation of binding sites or enzymatic functions. In this work, we diversified a de novo TIM barrel by extending multiple βα-loops using constrained hallucination. Experimentally tested designs were found to be soluble upon expression in Escherichia coli and well-behaved. Biochemical characterization and crystal structures revealed successful extensions with defined α-helical structures. These diversified de novo TIM barrels provide a framework to explore a broad spectrum of functions based on the potential of natural TIM barrels.


Subject(s)
Models, Molecular , Escherichia coli/genetics , Escherichia coli/metabolism , Crystallography, X-Ray , Protein Folding , Protein Engineering/methods , Proteins/chemistry , Proteins/metabolism
13.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38706315

ABSTRACT

To date, more than 251 million proteins have been deposited in UniProtKB. However, only 0.25% have been annotated with one of the more than 15,000 possible Pfam family domains. The current annotation protocol integrates knowledge from manually curated family domains, obtained using sequence alignments and hidden Markov models. This approach has been successful for automatically growing the Pfam annotations, although at a low rate in comparison to protein discovery. Just a few years ago, deep learning models were proposed for automatic Pfam annotation. However, these models demand a considerable amount of training data, which can be a challenge with poorly populated families. To address this issue, we propose and evaluate here a novel protocol based on transfer learning. This requires the use of protein large language models (LLMs), trained with self-supervision on big unannotated datasets in order to obtain sequence embeddings. Then, the embeddings can be used with supervised learning on a small and annotated dataset for a specialized task. In this protocol we have evaluated several cutting-edge protein LLMs together with machine learning architectures to improve the current prediction of protein domain annotations. Results are significantly better than state-of-the-art for protein families classification, reducing the prediction error by an impressive 60% compared to standard methods. We explain how LLM embeddings can be used for protein annotation in a concrete and easy way, and provide the pipeline in a GitHub repository. Full source code and data are available at https://github.com/sinc-lab/llm4pfam.


Subject(s)
Databases, Protein , Proteins , Proteins/chemistry , Molecular Sequence Annotation/methods , Computational Biology/methods , Machine Learning
14.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38706316

ABSTRACT

Protein-ligand interactions (PLIs) are essential for cellular activities and drug discovery. But due to the complexity and high cost of experimental methods, there is a great demand for computational approaches to recognize PLI patterns, such as protein-ligand docking. In recent years, more and more models based on machine learning have been developed to directly predict the root mean square deviation (RMSD) of a ligand docking pose with reference to its native binding pose. However, new scoring methods are pressingly needed in methodology for more accurate RMSD prediction. We present a new deep learning-based scoring method for RMSD prediction of protein-ligand docking poses based on a Graphormer method and Shell-like graph architecture, named GSScore. To recognize near-native conformations from a set of poses, GSScore takes atoms as nodes and then establishes the docking interface of protein-ligand into multiple bipartite graphs within different shell ranges. Benefiting from the Graphormer and Shell-like graph architecture, GSScore can effectively capture the subtle differences between energetically favorable near-native conformations and unfavorable non-native poses without extra information. GSScore was extensively evaluated on diverse test sets including a subset of PDBBind version 2019, CASF2016 as well as DUD-E, and obtained significant improvements over existing methods in terms of RMSE, R (Pearson correlation coefficient), Spearman correlation coefficient and Docking power.
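For readers unfamiliar with the quantity that GSScore is trained to predict, the RMSD of a docking pose relative to the native pose can be computed as sketched below. This shows only the target metric, not the GSScore model. It assumes both poses list the same ligand atoms in the same order and share the protein's coordinate frame, so no superposition is applied; the function name `ligand_rmsd` and the toy coordinates are hypothetical.

```python
import math


def ligand_rmsd(pose, reference):
    """Root mean square deviation between two ligand poses.

    pose, reference: lists of (x, y, z) tuples with matching atom order.
    Assumption: both poses are already in the same coordinate frame,
    so the coordinates are compared directly without alignment.
    """
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared_sum = sum(
        math.dist(atom, ref_atom) ** 2 for atom, ref_atom in zip(pose, reference)
    )
    return math.sqrt(squared_sum / len(pose))


# Toy example: a three-atom ligand shifted by 1 unit along x.
native_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shift_rmsd = ligand_rmsd(docked_pose, native_pose)
```

A rigid translation moves every atom by the same distance, so the RMSD in this toy case equals the size of the shift.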


Subject(s)
Molecular Docking Simulation , Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Protein Binding , Software , Algorithms , Computational Biology/methods , Protein Conformation , Databases, Protein , Deep Learning
15.
BMC Bioinformatics ; 25(1): 174, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698340

ABSTRACT

BACKGROUND: In the last two decades, the use of high-throughput sequencing technologies has accelerated the pace of discovery of proteins. However, due to the time and resource limitations of rigorous experimental functional characterization, the functions of a vast majority of them remain unknown. As a result, computational methods offering accurate, fast and large-scale assignment of functions to new and previously unannotated proteins are sought after. Leveraging the underlying associations between the multiplicity of features that describe proteins could reveal functional insights into the diverse roles of proteins and improve performance on the automatic function prediction task. RESULTS: We present GO-LTR, a multi-view multi-label prediction model that relies on a high-order tensor approximation of model weights combined with non-linear activation functions. The model is capable of learning high-order relationships between multiple input views representing the proteins and predicting high-dimensional multi-label output consisting of protein functional categories. We demonstrate the competitiveness of our method on various performance measures. Experiments show that GO-LTR learns polynomial combinations between different protein features, resulting in improved performance. Additional investigations establish GO-LTR's practical potential in assigning functions to proteins under diverse challenging scenarios: very low sequence similarity to previously observed sequences, rarely observed and highly specific terms in the gene ontology. IMPLEMENTATION: The code and data used for training GO-LTR are available at https://github.com/aalto-ics-kepaco/GO-LTR-prediction.


Subject(s)
Computational Biology , Proteins , Proteins/chemistry , Proteins/metabolism , Computational Biology/methods , Databases, Protein , Algorithms
16.
Acta Crystallogr D Struct Biol ; 80(Pt 5): 314-327, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38700059

ABSTRACT

Radiation damage remains one of the major impediments to accurate structure solution in macromolecular crystallography. The artefacts of radiation damage can manifest as structural changes that result in incorrect biological interpretations being drawn from a model, they can reduce the resolution to which data can be collected and they can even prevent structure solution entirely. In this article, we discuss how to identify and mitigate against the effects of radiation damage at each stage in the macromolecular crystal structure-solution pipeline.


Subject(s)
Macromolecular Substances , Crystallography, X-Ray/methods , Macromolecular Substances/chemistry , Models, Molecular , Proteins/chemistry
17.
PLoS One ; 19(5): e0299287, 2024.
Article in English | MEDLINE | ID: mdl-38701058

ABSTRACT

Matrix-assisted laser desorption/ionization time-of-flight-time-of-flight (MALDI-TOF-TOF) tandem mass spectrometry (MS/MS) is a rapid technique for identifying intact proteins from unfractionated mixtures by top-down proteomic analysis. MS/MS allows isolation of specific intact protein ions prior to fragmentation, allowing fragment ion attribution to a specific precursor ion. However, the fragmentation efficiency of mature, intact protein ions by MS/MS post-source decay (PSD) varies widely, and the biochemical and structural factors of the protein that contribute to it are poorly understood. With the advent of protein structure prediction algorithms such as AlphaFold2, we have wider access to protein structures for which no crystal structure exists. In this work, we use a statistical approach to explore the properties of bacterial proteins that can affect their gas phase dissociation via PSD. We extract various protein properties from AlphaFold2 predictions and analyze their effect on fragmentation efficiency. Our results show that the fragmentation efficiencies from cleavage of the polypeptide backbone on the C-terminal side of glutamic acid (E) and asparagine (N) residues were nearly equal. In addition, we found that the rearrangement and cleavage on the C-terminal side of aspartic acid (D) residues that result from the aspartic acid effect (AAE) were higher than for E- and N-residues. From residue interaction network analysis, we identified several local centrality measures and discussed their implications regarding the AAE. We also confirmed the selective cleavage of the backbone at D-proline bonds in proteins and further extend it to N-proline bonds. Finally, we note an enhancement of the AAE mechanism when the residue on the C-terminal side of D-, E- and N-residues is glycine. To the best of our knowledge, this is the first report of this phenomenon. Our study demonstrates the value of using statistical analyses of protein sequences and their predicted structures to better understand the fragmentation of the intact protein ions in the gas phase.
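The sequence-level part of this analysis depends on locating backbone bonds on the C-terminal side of D, E and N residues and noting the residue that follows (for example proline or glycine). A minimal sketch of that bookkeeping is given below. It illustrates the site enumeration only and is not the authors' statistical method; the function name `cleavage_sites` and the toy sequence are hypothetical.

```python
def cleavage_sites(sequence, residues="DEN"):
    """List backbone bonds on the C-terminal side of the given residues.

    Returns (position, residue, next_residue) tuples, where position is the
    1-based index of the D/E/N residue. The last residue of the sequence
    has no C-terminal neighbour and is therefore skipped.
    """
    sites = []
    for i in range(len(sequence) - 1):
        if sequence[i] in residues:
            sites.append((i + 1, sequence[i], sequence[i + 1]))
    return sites


# Toy sequence (hypothetical): contains D-P, E-G and N-K bonds.
toy_sequence = "MDPEGNK"
sites = cleavage_sites(toy_sequence)

# Bonds followed by proline or glycine, the two cases the abstract highlights.
followed_by_pro = [s for s in sites if s[2] == "P"]
followed_by_gly = [s for s in sites if s[2] == "G"]
```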


Subject(s)
Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Tandem Mass Spectrometry , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Tandem Mass Spectrometry/methods , Bacterial Proteins/chemistry , Proteomics/methods , Algorithms , Proteins/chemistry , Proteins/analysis
18.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701416

ABSTRACT

Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm expertly combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.


Subject(s)
Algorithms , Computational Biology , Neural Networks, Computer , Protein Structure, Secondary , Proteins , Proteins/chemistry , Proteins/metabolism , Proteins/genetics , Computational Biology/methods , Databases, Protein , Gene Ontology , Sequence Analysis, Protein/methods , Software
19.
Methods Mol Biol ; 2800: 103-113, 2024.
Article in English | MEDLINE | ID: mdl-38709481

ABSTRACT

The spatial resolution of conventional light microscopy is restricted by the diffraction limit to hundreds of nanometers. Super-resolution microscopy enables single digit nanometer resolution by circumventing the diffraction limit of conventional light microscopy. DNA point accumulation for imaging in nanoscale topography (DNA-PAINT) belongs to the family of single-molecule localization super-resolution approaches. Unique features of DNA-PAINT are that it allows for sub-nanometer resolution, spectrally unlimited multiplexing, proximity detection, and quantitative counting of target molecules. Here, we describe prerequisites for efficient DNA-PAINT microscopy.


Subject(s)
DNA , Single Molecule Imaging , DNA/chemistry , Single Molecule Imaging/methods , Microscopy, Fluorescence/methods , Proteins/chemistry , Nanotechnology/methods
20.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38695119

ABSTRACT

Sequence similarity is of paramount importance in biology, as similar sequences tend to have similar function and share common ancestry. Scoring matrices, such as PAM or BLOSUM, play a crucial role in all bioinformatics algorithms for identifying similarities, but have the drawback that they are fixed, independent of context. We propose a new scoring method for amino acid similarity that remedies this weakness, being contextually dependent. It relies on recent advances in deep learning architectures that employ self-supervised learning in order to leverage the power of enormous amounts of unlabelled data to generate contextual embeddings, which are vector representations for words. These ideas have been applied to protein sequences, producing embedding vectors for protein residues. We propose the E-score between two residues as the cosine similarity between their embedding vector representations. Thorough testing on a wide variety of reference multiple sequence alignments indicates that the alignments produced using the new E-score method, especially ProtT5-score, are significantly better than those obtained using BLOSUM matrices. The new method proposes to change the way alignments are computed, with far-reaching implications in all areas of textual data that use sequence similarity. The program to compute alignments based on various E-scores is available as a web server at e-score.csd.uwo.ca. The source code is freely available for download from github.com/lucian-ilie/E-score.
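The E-score defined in this abstract is the cosine similarity between two residue embedding vectors. The sketch below shows that calculation on toy vectors. It does not reproduce the authors' program, and it does not generate embeddings: real vectors would come from a protein language model such as ProtT5 and have far more dimensions. The function name `e_score` and the example vectors are hypothetical.

```python
import math


def e_score(u, v):
    """Cosine similarity between two residue embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embedding vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors standing in for contextual residue embeddings.
residue_a = [1.0, 2.0, 0.0]
residue_b = [2.0, 4.0, 0.0]  # same direction as residue_a
residue_c = [0.0, 0.0, 3.0]  # orthogonal to residue_a

same_direction = e_score(residue_a, residue_b)
orthogonal = e_score(residue_a, residue_c)
```

Because the score depends on embeddings computed in context, the same pair of amino acids can score differently at different alignment positions, which is the contrast with fixed matrices such as BLOSUM that the abstract draws.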


Subject(s)
Algorithms , Computational Biology , Sequence Alignment , Sequence Alignment/methods , Computational Biology/methods , Software , Sequence Analysis, Protein/methods , Amino Acid Sequence , Proteins/chemistry , Proteins/genetics , Deep Learning , Databases, Protein