Results 1 - 20 of 10,292
1.
Nat Commun ; 15(1): 6984, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39143123

ABSTRACT

Transcription factors specifically bind to their consensus sequence motifs and regulate transcription efficiency. Transcription factors are also able to non-specifically contact the phosphate backbone of DNA through electrostatic interaction. The homeodomain of Meis1 TALE human transcription factor (Meis1-HD) recognizes its target DNA sequences via two DNA contact regions, the L1-α1 region and the α3 helix (specific binding mode). This study demonstrates that the non-specific binding mode of Meis1-HD is the energetically favored process during DNA binding, achieved by the interaction of the L1-α1 region with the phosphate backbone. An NMR dynamics study suggests that non-specific binding might set up an intermediate structure which can then rapidly and easily find the consensus region on a long section of genomic DNA in a facilitated binding process. Structural analysis using NMR and molecular dynamics shows that key structural distortions in the Meis1-HD-DNA complex are induced by various single nucleotide mutations in the consensus sequence, resulting in decreased DNA binding affinity. Collectively, our results elucidate the detailed molecular mechanism of how Meis1-HD recognizes single nucleotide mutations within its consensus sequence: (i) through the conformational features of the α3 helix; and (ii) by the dynamic features (rigid or flexible) of the L1 loop and the α3 helix. These findings enhance our understanding of how single nucleotide mutations in transcription factor consensus sequences lead to dysfunctional transcription and, ultimately, human disease.


Subject(s)
DNA , Molecular Dynamics Simulation , Myeloid Ecotropic Viral Integration Site 1 Protein , Protein Binding , Myeloid Ecotropic Viral Integration Site 1 Protein/metabolism , Myeloid Ecotropic Viral Integration Site 1 Protein/genetics , Humans , DNA/metabolism , DNA/chemistry , DNA/genetics , Binding Sites , Homeodomain Proteins/metabolism , Homeodomain Proteins/genetics , Homeodomain Proteins/chemistry , Mutation , Consensus Sequence , Base Sequence
2.
Nature ; 632(8027): 1073-1081, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39020177

ABSTRACT

Measurements of gene expression or signal transduction activity are conventionally performed using methods that require either the destruction or live imaging of a biological sample within the timeframe of interest. Here we demonstrate an alternative paradigm in which such biological activities are stably recorded to the genome. Enhancer-driven genomic recording of transcriptional activity in multiplex (ENGRAM) is based on the signal-dependent production of prime editing guide RNAs that mediate the insertion of signal-specific barcodes (symbols) into a genomically encoded recording unit. We show how this strategy can be used for multiplex recording of the cell-type-specific activities of dozens to hundreds of cis-regulatory elements with high fidelity, sensitivity and reproducibility. Leveraging signal transduction pathway-responsive cis-regulatory elements, we also demonstrate time- and concentration-dependent genomic recording of WNT, NF-κB and Tet-On activities. By coupling ENGRAM to sequential genome editing via DNA Typewriter, we stably record information about the temporal dynamics of two orthogonal signalling pathways to genomic DNA. Finally, we apply ENGRAM to integratively record the transient activity of nearly 100 transcription factor consensus motifs across daily windows spanning the differentiation of mouse embryonic stem cells into gastruloids, an in vitro model of early mammalian development. Although these are proof-of-concept experiments and much work remains to fully realize the possibilities, the symbolic recording of biological signals or states within cells, to the genome and over time, has broad potential to complement contemporary paradigms for how we make measurements in biological systems.


Subject(s)
DNA , Gene Editing , Signal Transduction , Transcription, Genetic , Animals , Mice , Cell Differentiation/genetics , DNA/genetics , DNA/metabolism , Enhancer Elements, Genetic/genetics , Gene Editing/methods , Genomics , Mouse Embryonic Stem Cells/cytology , NF-kappa B/metabolism , Reproducibility of Results , RNA, Guide, CRISPR-Cas Systems/genetics , RNA, Guide, CRISPR-Cas Systems/metabolism , Signal Transduction/genetics , Time Factors , Transcription Factors/metabolism , Transcription, Genetic/genetics , Wnt Signaling Pathway/genetics , Nucleotide Motifs , Consensus Sequence/genetics , Developmental Biology , Proof of Concept Study
3.
Nucleic Acids Res ; 52(15): 9247-9266, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-38943346

ABSTRACT

Classification of introns, which is crucial to understanding their evolution and splicing, has historically been binary and has resulted in the naming of major and minor introns that are spliced by their namesake spliceosome. However, a broad range of intron consensus sequences exist, leading us to here reclassify introns as minor, minor-like, hybrid, major-like, major and non-canonical introns in 263 species across six eukaryotic supergroups. Through intron orthology analysis, we discovered that minor-like introns are a transitory node for intron conversion across evolution. Despite close resemblance of their consensus sequences to minor introns, these introns possess an AG dinucleotide at the -1 and -2 position of the 5' splice site, a salient feature of major introns. Through combined analysis of CoLa-seq, CLIP-seq for major and minor spliceosome components, and RNAseq from samples in which the minor spliceosome is inhibited we found that minor-like introns are also an intermediate class from a splicing mechanism perspective. Importantly, this analysis has provided insight into the sequence elements that have evolved to make minor-like introns amenable to recognition by both minor and major spliceosome components. We hope that this revised intron classification provides a new framework to study intron evolution and splicing.


Subject(s)
Evolution, Molecular , Introns , RNA Splicing , Spliceosomes , Introns/genetics , Spliceosomes/genetics , Humans , RNA Splice Sites , Animals , Consensus Sequence , Eukaryota/genetics , Eukaryota/classification , Base Sequence
4.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38920083

ABSTRACT

This study proposes a novel approach to studying severe acute respiratory syndrome coronavirus 2 virus mutations through sequencing data comparison. Traditional consensus-based methods, which focus on the most common nucleotide at each position, might overlook or obscure the presence of low-frequency variants. Our method, in contrast, retains all sequenced nucleotides at each position, forming a genomic matrix. Utilizing simulated short reads from genomes with specified mutations, we contrasted our genomic matrix approach with the consensus sequence method. Our matrix methodology, across multiple simulated datasets, accurately reflected the known mutations with an average accuracy improvement of 20% over the consensus method. In real-world tests using data from GISAID and NCBI-SRA, our approach demonstrated an increase in reliability by reducing the error margin by approximately 15%. The genomic matrix approach offers a more accurate representation of the viral genomic diversity, thereby providing superior insights into virus evolution and epidemiology.
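
The contrast between a consensus call and a per-position matrix can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the reads are toy data, already aligned and of equal length, and the function names `genomic_matrix` and `consensus` are hypothetical.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"

def genomic_matrix(reads):
    """Count every observed nucleotide at each position (one row per position)."""
    length = len(reads[0])
    matrix = []
    for pos in range(length):
        counts = Counter(read[pos] for read in reads)
        matrix.append([counts.get(n, 0) for n in NUCLEOTIDES])
    return matrix

def consensus(matrix):
    """Keep only the most common nucleotide at each position."""
    return "".join(NUCLEOTIDES[row.index(max(row))] for row in matrix)

# Toy, pre-aligned reads of equal length (hypothetical data).
reads = ["ACGT", "ACGT", "ACGT", "ATGT"]
matrix = genomic_matrix(reads)

print(consensus(matrix))  # ACGT
print(matrix[1])          # [0, 3, 0, 1]
```

The consensus string hides the single T observed at the second position, while the matrix row keeps it as a count of 1 alongside the three C calls.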


Subject(s)
COVID-19 , Genome, Viral , Phylogeny , SARS-CoV-2 , SARS-CoV-2/genetics , Humans , COVID-19/virology , COVID-19/epidemiology , Mutation , Consensus Sequence , Genetic Variation
5.
Viruses ; 16(5), 2024 May 10.
Article in English | MEDLINE | ID: mdl-38793640

ABSTRACT

The HIV-1 Rev protein expressed in the early stage of virus replication is involved in the nuclear export of some forms of virus RNA. Naturally occurring polymorphisms in the Rev protein could influence its activity. The association between the genetic features of different virus variants and HIV infection pathogenesis has been discussed for many years. In this study, Rev diversity among HIV-1 group M clades was analyzed to note the signatures that could influence Rev activity and, subsequently, clinical characteristics. From the Los Alamos HIV Sequence Database, 4962 Rev sequences were downloaded and 26 clades in HIV-1 group M were analyzed for amino acid changes, conservation in consensus sequences, and the presence of clade-specific amino acid substitutions (CSSs) and the Wu-Kabat protein variability coefficient (WK). Subtypes G, CRF 02_AG, B, and A1 showed the largest amino acid changes and diversity. The mean conservation of the Rev protein was 80.8%. In consensus sequences, signatures that could influence Rev activity were detected. In 15 out of 26 consensus sequences, an insertion associated with the reduced export activity of the Rev protein, 95QSQGTET96, was identified. A total of 32 CSSs were found in 16 clades, wherein A6 had the 41Q substitution in the functionally significant region of Rev. The high values of WK coefficient in sites 51 and 82, located on the Rev interaction surface, indicate the susceptibility of these positions to evolutionary replacements. Thus, the noted signatures require further investigation.
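
For readers unfamiliar with the Wu-Kabat coefficient, the sketch below shows how it is commonly computed for one alignment column: WK = N * k / n, where N is the number of sequences, k the number of distinct amino acids at the position, and n the count of the most frequent one. The abstract does not state the formula or how gaps were treated, so both are assumptions here, and the four-sequence alignment is invented toy data.

```python
from collections import Counter

def wu_kabat(column):
    """Wu-Kabat variability for one alignment column: N * k / n."""
    counts = Counter(column)
    n_total = len(column)                 # N: number of sequences
    k = len(counts)                       # k: distinct residues observed
    n_top = counts.most_common(1)[0][1]   # n: count of the most frequent residue
    return n_total * k / n_top

# Hypothetical toy alignment (equal-length sequences, no gaps).
alignment = ["MAGRS", "MAGKS", "MSGRS", "MAGQS"]
wk = [wu_kabat(col) for col in zip(*alignment)]

for position, value in enumerate(wk, start=1):
    print(position, round(value, 2))
```

A fully conserved column gives WK = 1, and the value grows as more distinct residues appear and the dominant residue becomes less frequent.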


Subject(s)
Genetic Variation , HIV Infections , HIV-1 , rev Gene Products, Human Immunodeficiency Virus , HIV-1/genetics , HIV-1/classification , rev Gene Products, Human Immunodeficiency Virus/genetics , rev Gene Products, Human Immunodeficiency Virus/metabolism , Humans , HIV Infections/virology , Phylogeny , Amino Acid Substitution , Amino Acid Sequence , Consensus Sequence
6.
Nat Commun ; 15(1): 3924, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724518

ABSTRACT

An effective HIV-1 vaccine must elicit broadly neutralizing antibodies (bnAbs) against highly diverse Envelope glycoproteins (Env). Since Env with the longest hypervariable (HV) loops is more resistant to the cognate bnAbs than Env with shorter HV loops, we redesigned hypervariable loops for updated Env consensus sequences of subtypes B and C and CRF01_AE. Using modeling with AlphaFold2, we reduced the length of V1, V2, and V5 HV loops while maintaining the integrity of the Env structure and glycan shield, and modified the V4 HV loop. Spacers are designed to limit strain-specific targeting. All updated Env are infectious as pseudoviruses. Preliminary structural characterization suggests that the modified HV loops have a limited impact on Env's conformation. Binding assays show improved binding to modified subtype B and CRF01_AE Env but not to subtype C Env. Neutralization assays show increases in sensitivity to bnAbs, although not always consistently across clades. Strikingly, the HV loop modification renders the resistant CRF01_AE Env sensitive to 10-1074 despite the absence of a glycan at N332.


Subject(s)
Antibodies, Neutralizing , HIV Antibodies , HIV-1 , env Gene Products, Human Immunodeficiency Virus , HIV-1/immunology , Humans , env Gene Products, Human Immunodeficiency Virus/immunology , env Gene Products, Human Immunodeficiency Virus/chemistry , env Gene Products, Human Immunodeficiency Virus/metabolism , HIV Antibodies/immunology , Antibodies, Neutralizing/immunology , AIDS Vaccines/immunology , Neutralization Tests , HEK293 Cells , Consensus Sequence , HIV Infections/virology , HIV Infections/immunology , Protein Binding , Epitopes/immunology
7.
Carbohydr Res ; 540: 109138, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38703662

ABSTRACT

The high-mannose-type glycan structures of N-glycoproteins play important roles in the proper folding of proteins, in sorting glycoproteins for secretion, and in the degradation of misfolded proteins in the endoplasmic reticulum (ER). The Glc1Man9GlcNAc2 (G1M9)-type N-glycan is one of the most important signaling molecules in the ER. However, current chemical synthesis strategies are laborious, warranting more practical approaches for G1M9-glycopeptide development. Wang et al. reported a procedure that gives G1M9-Asn-Fmoc through chemical modifications and purifications from 40 chicken eggs, but only 3.3 mg of G1M9-glycopeptide was obtained. Therefore, better methods are needed to obtain more than 10 mg of G1M9-glycopeptide. In this study, we report the preparation of G1M9-glycopeptide (13.2 mg) bearing the Asn-Gly-Thr triad as the consensus sequence from 40 chicken eggs. In this procedure, λ-carrageenan treatment followed by papain treatment was used to separate the Fc region of IgY antibody that harbors high-mannose glycans. Moreover, cotton hydrophilic interaction liquid chromatography was adapted for easy purification. The resulting G1M9-Asn(Fmoc)-Gly-Thr was identified by nuclear magnetic resonance and mass spectrometry. G1M9-Asn(Fmoc)-Gly, G1M9-Asn(Fmoc), and G1M9-OH were also detected by mass spectrometry. The G1M9-tripeptide developed here might be useful for the elucidation of glycoprotein functions as well as the specific roles of the consensus sequence.


Subject(s)
Chickens , Egg Yolk , Oligosaccharides , Animals , Egg Yolk/chemistry , Oligosaccharides/chemistry , Oligosaccharides/chemical synthesis , Asparagine/chemistry , Mannose/chemistry , Threonine/chemistry , Consensus Sequence , Glycine/chemistry , Glycopeptides/chemistry
8.
Protein Sci ; 33(6): e5011, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38747388

ABSTRACT

A protein sequence encodes its energy landscape-all the accessible conformations, energetics, and dynamics. The evolutionary relationship between sequence and landscape can be probed phylogenetically by compiling a multiple sequence alignment of homologous sequences and generating common ancestors via Ancestral Sequence Reconstruction or a consensus protein containing the most common amino acid at each position. Both ancestral and consensus proteins are often more stable than their extant homologs-questioning the differences between them and suggesting that both approaches serve as general methods to engineer thermostability. We used the Ribonuclease H family to compare these approaches and evaluate how the evolutionary relationship of the input sequences affects the properties of the resulting consensus protein. While the consensus protein derived from our full Ribonuclease H sequence alignment is structured and active, it neither shows properties of a well-folded protein nor has enhanced stability. In contrast, the consensus protein derived from a phylogenetically-restricted set of sequences is significantly more stable and cooperatively folded, suggesting that cooperativity may be encoded by different mechanisms in separate clades and lost when too many diverse clades are combined to generate a consensus protein. To explore this, we compared pairwise covariance scores using a Potts formalism as well as higher-order sequence correlations using singular value decomposition (SVD). We find the SVD coordinates of a stable consensus sequence are close to coordinates of the analogous ancestor sequence and its descendants, whereas the unstable consensus sequences are outliers in SVD space.
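
The dependence of a consensus protein on its input sequences can be illustrated with a minimal sketch. It assumes a pre-built alignment, majority vote per column with gaps ignored, and two invented toy "clades"; none of the sequences or names come from the study.

```python
from collections import Counter

def consensus_sequence(alignment):
    """Most common residue per alignment column, ignoring gap characters."""
    result = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if residues:
            result.append(Counter(residues).most_common(1)[0][0])
    return "".join(result)

# Hypothetical toy clades, not real Ribonuclease H sequences.
clade_a = ["MKVLA", "MKVLA", "MKILA"]
clade_b = ["MRTFG", "MRTFG", "MRTFG", "MRSFG"]

restricted = consensus_sequence(clade_a)       # consensus of one clade only
full = consensus_sequence(clade_a + clade_b)   # consensus of the pooled alignment

print(restricted)  # MKVLA
print(full)        # MRTFG
```

In the pooled alignment the larger clade outvotes the smaller one at every variable column, so the full consensus no longer reflects the residue combinations specific to the first clade.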


Subject(s)
Evolution, Molecular , Ribonuclease H/chemistry , Ribonuclease H/genetics , Ribonuclease H/metabolism , Consensus Sequence , Sequence Alignment , Phylogeny , Amino Acid Sequence , Models, Molecular , Protein Folding , Protein Conformation
9.
Rev Alerg Mex ; 71(1): 60, 2024 Feb 01.
Article in Spanish | MEDLINE | ID: mdl-38683078

ABSTRACT

OBJECTIVE: This study aimed to identify by in silico methods tropomyosin consensus B and T epitopes of shrimp species, house dust mites, insects, and nematodes associated with allergic diseases in tropical countries. METHODS: In silico analysis included tropomyosin from mites (Der p 10, Der f 10, Blo t 10), insects (Aed a 10, Per a 7, Bla g 7), shrimp (Lit v 1, Pen m 1, Pen a 1), and nematode (Asc l 3); all sequences were taken from the UniProt database. Linear IgE epitopes were predicted with AlgPred 2.0 and validated with BepiPred 3.0. MHC-II binding T cell epitopes were predicted using the IEDB server, which implements nine predictive methods (consensus method, combinatorial library, NN-align-2.3, NN-align-2.2, SMM-align, Sturniolo, NetMHCIIpan 3.1, and NetMHCIIpan 3.2); these predictions focused on 10 HLA-DR and 2 HLA-DQ alleles associated with allergic diseases. Subsequently, consensus B and T epitopes present in all species were identified. RESULTS: We identified 12 sequences that behaved as IgE-epitopes and B-cell epitopes; three of them (160RKYDEVARKLAMVEA174, 192ELEEELRVVGNNLKSLEVSEEKAN215, and 251KEVDRLEDELV261) were consensus in all species. Eleven peptides (T-epitopes) showed strong binding (percentile rank ≤ 2.0) to HLA-DRB1*0301, *0402, *0411, *0701, *1101, *1401, HLA-DQA1*03:01/DQB1*03:02, and HLA-DQA1*05:01/DQB1*02:01. Only two T-epitopes were consensus in all species: 167RKLAMVEADLERAEERAEtGEsKIVELEEELRV199 and 218EEeYKQQIKTLTaKLKEAEARAEFAERSV246. Subsequently, we identified two sequences that are both B and T epitopes and are consensus between species: 167RKLAMVEA174 and 192ELEEELRV199. CONCLUSIONS: These data describe three sequences that may explain the IgE cross-reactivity between the analyzed species. In addition, the consensus B and T epitopes can be used for further in vitro investigations and may help to design multiple-epitope protein-based immunotherapy for tropomyosin-related allergic diseases.
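
The residue ranges reported for the shared B and T epitopes are consistent with a simple overlap of the consensus B-epitope and T-epitope coordinates. The sketch below only checks that arithmetic using the ranges quoted in the abstract; it is not presented as the authors' procedure, and the helper name `overlap` is hypothetical.

```python
def overlap(a, b):
    """Return the shared inclusive residue range of two intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

# Inclusive residue ranges as quoted in the abstract.
b_epitopes = [(160, 174), (192, 215), (251, 261)]
t_epitopes = [(167, 199), (218, 246)]

shared = []
for b in b_epitopes:
    for t in t_epitopes:
        region = overlap(b, t)
        if region is not None:
            shared.append(region)

print(shared)  # [(167, 174), (192, 199)]
```

The two overlapping ranges correspond to the reported 167RKLAMVEA174 and 192ELEEELRV199 segments.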




Subject(s)
Computer Simulation , Cross Reactions , Epitopes, B-Lymphocyte , Epitopes, T-Lymphocyte , Hypersensitivity , Tropomyosin , Animals , Consensus Sequence , Epitopes, B-Lymphocyte/immunology , Epitopes, T-Lymphocyte/immunology , Insecta/immunology , Penaeidae/immunology , Pyroglyphidae/immunology , Tropomyosin/immunology , Tropomyosin/genetics , Hypersensitivity/immunology , Mites/immunology , Crustacea/immunology , Nematoda/immunology
10.
PLoS One ; 19(4): e0301069, 2024.
Article in English | MEDLINE | ID: mdl-38669259

ABSTRACT

Nearly 300 million individuals live with chronic hepatitis B virus (HBV) infection (CHB), for which no curative therapy is available. As viral diversity is associated with pathogenesis and immunological control of infection, improved methods to characterize this diversity could aid drug development efforts. Conventionally, viral sequencing data are mapped/aligned to a reference genome, and only the aligned sequences are retained for analysis. Thus, reference selection is critical, yet selecting the most representative reference a priori remains difficult. We investigate an alternative pangenome approach which can combine multiple reference sequences into a graph which can be used during alignment. Using simulated short-read sequencing data generated from publicly available HBV genomes and real sequencing data from an individual living with CHB, we demonstrate alignment to a phylogenetically representative 'genome graph' can improve alignment, avoid issues of reference ambiguity, and facilitate the construction of sample-specific consensus sequences more genetically similar to the individual's infection. Graph-based methods can, therefore, improve efforts to characterize the genetics of viral pathogens, including HBV, and have broader implications in host-pathogen research.


Subject(s)
Consensus Sequence , Genome, Viral , Hepatitis B virus , Hepatitis B virus/genetics , Humans , Consensus Sequence/genetics , Phylogeny , Sequence Alignment/methods , Genetic Variation , Hepatitis B, Chronic/virology , DNA, Viral/genetics , Sequence Analysis, DNA/methods
11.
J Agric Food Chem ; 72(12): 6454-6462, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38477968

ABSTRACT

In this study, the phenomenon of the stability-activity trade-off, which is increasingly recognized in enzyme engineering, was explored. Typically, enhanced stability in enzymes correlates with diminished activity. Utilizing Rosa roxburghii copper-zinc superoxide dismutase (RrCuZnSOD) as a model, single-site mutations were introduced based on a semirational design derived from consensus sequences. The initial set of mutants was selected based on activity, followed by combinatorial mutation. This approach yielded two double-site mutants, D25/A115T (18,688 ± 206 U/mg) and A115T/S135P (18,095 ± 1556 U/mg), exhibiting superior enzymatic properties due to additive and synergistic effects. These mutants demonstrated increased half-lives (T1/2) at 80 °C by 1.2- and 1.6-fold, respectively, and their melting temperatures (Tm) rose by 3.4 and 2.5 °C, respectively, without any loss in activity relative to the wild type. Via an integration of structural analysis and molecular dynamics simulations, we elucidated the underlying mechanism facilitating the concurrent enhancement of both thermostability and enzymatic activity.


Subject(s)
Molecular Dynamics Simulation , Protein Engineering , Enzyme Stability , Temperature , Consensus Sequence
12.
Vet Res ; 55(1): 28, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38449049

ABSTRACT

The prevalence of porcine reproductive and respiratory syndrome virus 1 (PRRSV1) isolates has continued to increase in Chinese swine herds in recent years. However, no effective control strategy is available for PRRSV1 infection in China. In this study, we generated the first infectious cDNA clone (rHLJB1) of a Chinese PRRSV1 isolate and subsequently used it as a backbone to construct an ORF2-6 chimeric virus (ORF2-6-CON). This virus contained a synthesized consensus sequence of the PRRSV1 ORF2-6 gene encoding all the envelope proteins. The ORF2-6 consensus sequence shared > 90% nucleotide similarity with four representative strains (Amervac, BJEU06-1, HKEU16 and NMEU09-1) of PRRSV1 in China. ORF2-6-CON had replication efficacy similar to that of the backbone rHLJB1 virus in primary alveolar macrophages (PAMs) and exhibited cell tropism in Marc-145 cells. Piglet inoculation and challenge studies indicated that ORF2-6-CON is not pathogenic to piglets and can induce enhanced cross-protection against a heterologous SD1291 isolate. Notably, ORF2-6-CON inoculation induced higher levels of heterologous neutralizing antibodies (nAbs) against SD1291 than rHLJB1 inoculation, which was concurrent with a higher percentage of T follicular helper (Tfh) cells in tracheobronchial lymph nodes (TBLNs), providing the first clue that porcine Tfh cells are correlated with heterologous PRRSV nAb responses. The number of SD1291-strain-specific IFNγ-secreting cells was similar in ORF2-6-CON-inoculated and rHLJB1-inoculated pigs. Overall, our findings support that the Marc-145-adapted ORF2-6-CON can trigger Tfh cell and heterologous nAb responses to confer improved cross-protection and may serve as a candidate strain for the development of a cross-protective PRRSV1 vaccine.


Subject(s)
Porcine respiratory and reproductive syndrome virus , Animals , Swine , Porcine respiratory and reproductive syndrome virus/genetics , T Follicular Helper Cells , Antibodies, Neutralizing , China , Consensus Sequence
13.
Int J Mol Sci ; 25(3), 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38338947

ABSTRACT

The extended cleavage specificities of two hematopoietic serine proteases originating from the ray-finned fish, the spotted gar (Lepisosteus oculatus), have been characterized using substrate phage display. The preference for particular amino acids at and surrounding the cleavage site was further validated using a panel of recombinant substrates. For one of the enzymes, the gar granzyme G, a strict preference for the aromatic amino acid Tyr was observed at the cleavable P1 position. Analysis with a set of recombinant substrates showed that the gar granzyme G had a high selectivity for Tyr, a lower activity for cleaving after Phe, and no cleavage after Trp. In contrast, the second enzyme, gar DDN1, showed a high preference for Leu in the P1 position of substrates. This latter enzyme also showed a high preference for Pro in the P2 position and Arg in both P4 and P5 positions. The selectivity for the two Arg residues in positions P4 and P5 suggests a highly specific substrate selectivity of this enzyme. The screening of the gar proteome with the consensus sequences obtained by substrate phage display for these two proteases resulted in a very diverse set of potential targets. Due to this diversity, a clear candidate for a specific immune function of these two enzymes cannot yet be identified. Antisera developed against the recombinant gar enzymes were used to study their tissue distribution. Tissue sections from juvenile fish showed the expression of both proteases in cells in Peyer's patch-like structures in the intestinal region, indicating they may be expressed in T or NK cells. However, due to the lack of antibodies to specific surface markers in the gar, it has not been possible to specify the exact cellular origin. A marked difference in abundance was observed for the two proteases where gar DDN1 was expressed at higher levels than gar granzyme G. However, both appear to be expressed in the same or similar cells, having a lymphocyte-like appearance.
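
Screening a proteome with a cleavage consensus can be sketched as a pattern scan. The example assumes the gar DDN1 preferences stated above (Arg at P5 and P4, Pro at P2, Leu at P1) and, as an assumption not given in the abstract, leaves P3 unconstrained; the two protein sequences and the names `MOTIF` and `find_sites` are hypothetical.

```python
import re

# P5-P4-P3-P2-P1 written left to right; "." leaves P3 unconstrained (assumption).
# The lookahead lets overlapping candidate sites be reported as well.
MOTIF = re.compile(r"(?=(RR.PL))")

def find_sites(sequence):
    """Return (1-based start position, matched segment) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]

# Hypothetical toy proteome, not real gar proteins.
proteome = {"protA": "MKTRRAPLGSW", "protB": "MSSLLPRRQ"}

hits = {}
for name, seq in proteome.items():
    sites = find_sites(seq)
    if sites:
        hits[name] = sites

print(hits)  # {'protA': [(4, 'RRAPL')]}
```

A short pattern like this matches many unrelated proteins in a real proteome, which fits the abstract's observation that the screen returned a very diverse set of potential targets.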


Subject(s)
Fishes , Serine Proteases , Animals , Serine Proteases/genetics , Granzymes , Endopeptidases , Consensus Sequence , Substrate Specificity
14.
BMC Genomics ; 25(1): 109, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267856

ABSTRACT

BACKGROUND: Despite the many cheap and fast ways to generate genomic data, good and exact genome assembly is still a problem, with repeats in particular being vastly underrepresented and often misassembled. As short reads in low coverage are already sufficient to represent the repeat landscape of any given genome, many read cluster algorithms were brought forward that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensuses be derived from unassembled genomes? RESULTS: Here, we combine methods from repeat identification and genome assembly to derive these robust consensuses. We test several use cases, such as (1) consensus building from clustered short reads of non-model genomes, (2) from genome-wide amplification setups, and (3) specific repeat-centred questions, such as the linked vs. unlinked arrangement of ribosomal genes. In all our use cases, the derived consensuses are robust and representative. To evaluate overall performance, we compare our high-fidelity repeat consensuses to RepeatExplorer2-derived contigs and check whether they represent real transposable elements as found in long reads. Our results demonstrate that it is possible to generate useful, reliable and trustworthy consensuses from short reads by combining read cluster and genome assembly methods in an automatable way. CONCLUSION: We anticipate that our workflow opens the way towards more efficient and less manual repeat characterization and annotation, benefitting all genome studies, but especially those of non-model organisms.


Subject(s)
Algorithms , DNA Transposable Elements , Consensus Sequence , Cluster Analysis , Genomics
15.
Biochemistry ; 63(3): 348-354, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38206322

ABSTRACT

Proteins' extraordinary performance in recognition and catalysis has led to their use in a range of applications. However, proteins obtained from natural sources are oftentimes not suitable for direct use in industrial or diagnostic setups. Natural proteins, evolved to optimally perform a task in physiological conditions, usually lack the stability required to be used in harsher conditions. Therefore, the alteration of the stability of proteins is commonly pursued in protein engineering studies. Here, we achieved a substantial thermal stabilization of a bacterial Zn(II)-dependent phospholipase C by consensus sequence design. We retrieved and analyzed sequenced homologues from different sources, selecting a subset of examples for expression and characterization. A non-natural consensus sequence showed the highest stability and activity among those tested. Comparison of the stability parameters of this stabilized mutant and other natural variants bearing similar mutations allows us to pinpoint the sites most likely to be responsible for the enhancement. Point mutations in these sites alter the unfolding process of the consensus sequence. We show that the stabilized version of the protein retains full activity even in harsh oil degumming conditions, making it suitable for industrial applications.


Subject(s)
Proteins , Zinc , Amino Acid Sequence , Proteins/metabolism , Mutation , Consensus Sequence
16.
Proc Natl Acad Sci U S A ; 121(3): e2312029121, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38194446

ABSTRACT

Understanding natural protein evolution and designing novel proteins are motivating interest in development of high-throughput methods to explore large sequence spaces. In this work, we demonstrate the application of multisite λ dynamics (MSλD), a rigorous free energy simulation method, and chemical denaturation experiments to quantify evolutionary selection pressure from sequence-stability relationships and to address questions of design. This study examines a mesophilic phylogenetic clade of ribonuclease H (RNase H), furthering its extensive characterization in earlier studies, focusing on E. coli RNase H (ecRNH) and a more stable consensus sequence (AncCcons) differing at 15 positions. The stabilities of 32,768 chimeras between these two sequences were computed using the MSλD framework. The most stable and least stable chimeras were predicted and tested along with several other sequences, revealing a designed chimera with approximately the same stability increase as AncCcons, but requiring only half the mutations. Comparing the computed stabilities with experiment for 12 sequences reveals a Pearson correlation of 0.86 and root mean squared error of 1.18 kcal/mol, an unprecedented level of accuracy well beyond less rigorous computational design methods. We then quantified selection pressure using a simple evolutionary model in which sequences are selected according to the Boltzmann factor of their stability. Selection temperatures from 110 to 168 K are estimated in three ways by comparing experimental and computational results to evolutionary models. These estimates indicate selection pressure is high, which has implications for evolutionary dynamics and for the accuracy required for design, and suggests accurate high-throughput computational methods like MSλD may enable more effective protein design.
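
Two numbers in this abstract can be made concrete with a small sketch. Two parent sequences differing at 15 positions give 2^15 = 32,768 chimeras, and a Boltzmann-type selection model weights a sequence by exp(-ΔG/(R·T_sel)). The sign convention (ΔG as folding free energy, more negative meaning more stable), the use of the molar gas constant, and the function names are assumptions for illustration, not details taken from the paper.

```python
import math
from itertools import product

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def n_chimeras(n_sites, n_choices=2):
    """Count all combinations of parental residues over the differing sites."""
    return sum(1 for _ in product(range(n_choices), repeat=n_sites))

def boltzmann_weight(dg_fold, t_sel):
    """Unnormalized selection weight; dg_fold in kcal/mol, negative = more stable."""
    return math.exp(-dg_fold / (R_KCAL * t_sel))

def relative_weight(ddg, t_sel):
    """Weight of a sequence shifted by ddg relative to a reference sequence."""
    return boltzmann_weight(ddg, t_sel) / boltzmann_weight(0.0, t_sel)

print(n_chimeras(15))  # 32768

# Advantage of being 1 kcal/mol more stable at the two ends of the reported range.
for t_sel in (110.0, 168.0):
    print(t_sel, relative_weight(-1.0, t_sel))
```

Under this model a lower selection temperature gives a stabilizing change a larger relative weight, which is why selection temperatures well below physiological temperature are read as strong selection pressure.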


Subject(s)
Escherichia coli , Ribonuclease H , Escherichia coli/genetics , Phylogeny , Computer Simulation , Consensus Sequence , Ribonuclease H/genetics
17.
J Biotechnol ; 379: 53-64, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38070779

ABSTRACT

The baculovirus-insect cell expression system allows addition of O-fucose to EGF-like domains of glycoproteins, following the action of the protein O-fucosyltransferase 1 named POFUT1. In this study, recombinant Spodoptera frugiperda POFUT1 from baculovirus-infected Sf9 cells was compared to recombinant Mus musculus POFUT1 produced by CHO cells. Unlike recombinant murine POFUT1, which carries two hybrid and/or complex type N-glycans, Spodoptera frugiperda POFUT1 exhibited paucimannose N-glycans, at least on its NRT site, which is highly evolutionarily conserved across Metazoa. The abilities of both recombinant enzymes to add O-fucose in vitro to EGF-like domains of three different recombinant mammalian glycoproteins were then explored. In vitro POFUT1-mediated O-fucosylation experiments, followed by click chemistry and blot analyses, showed that Spodoptera frugiperda POFUT1 was able to add O-fucose to mouse NOTCH1 EGF-like 26 and WIF1 EGF-like 3 domains, similarly to the murine counterpart. As confirmed by mass spectrometry, full-length human WNT Inhibitor Factor 1 expressed by Sf9 cells was also modified with O-fucose. However, Spodoptera frugiperda POFUT1 was unable to modify the single EGF-like domain of mouse PAMR1 with O-fucose, unlike murine POFUT1. Absence of orthologous proteins such as PAMR1 in insects may explain the enzyme's difficulty in adding O-fucose to a domain that it never encounters naturally.


Subject(s)
Fucosyltransferases , Recombinant Proteins , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Spodoptera/enzymology , Spodoptera/genetics , Spodoptera/metabolism , Fucosyltransferases/chemistry , Fucosyltransferases/genetics , Fucosyltransferases/metabolism , Humans , Animals , Mice , CHO Cells , Cricetulus , Sf9 Cells , Glycosylation , Consensus Sequence , Fucose/metabolism , Protein Domains
18.
Nature ; 624(7991): 355-365, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38092919

ABSTRACT

Single-cell analyses parse the brain's billions of neurons into thousands of 'cell-type' clusters residing in different brain structures1. Many cell types mediate their functions through targeted long-distance projections allowing interactions between specific cell types. Here we used epi-retro-seq2 to link single-cell epigenomes and cell types to long-distance projections for 33,034 neurons dissected from 32 different regions projecting to 24 different targets (225 source-to-target combinations) across the whole mouse brain. We highlight uses of these data for interrogating principles relating projection types to transcriptomics and epigenomics, and for addressing hypotheses about cell types and connections related to genetics. We provide an overall synthesis with 926 statistical comparisons of discriminability of neurons projecting to each target for every source. We integrate this dataset into the larger BRAIN Initiative Cell Census Network atlas, composed of millions of neurons, to link projection cell types to consensus clusters. Integration with spatial transcriptomics further assigns projection-enriched clusters to smaller source regions than the original dissections. We exemplify this by presenting in-depth analyses of projection neurons from the hypothalamus, thalamus, hindbrain, amygdala and midbrain to provide insights into properties of those cell types, including differentially expressed genes, their associated cis-regulatory elements and transcription-factor-binding motifs, and neurotransmitter use.


Subject(s)
Brain , Epigenomics , Neural Pathways , Neurons , Animals , Mice , Amygdala , Brain/cytology , Brain/metabolism , Consensus Sequence , Datasets as Topic , Gene Expression Profiling , Hypothalamus/cytology , Mesencephalon/cytology , Neural Pathways/cytology , Neurons/metabolism , Neurotransmitter Agents/metabolism , Regulatory Sequences, Nucleic Acid , Rhombencephalon/cytology , Single-Cell Analysis , Thalamus/cytology , Transcription Factors/metabolism
19.
Nature ; 624(7991): 433-441, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38030726

ABSTRACT

FOXP3 is a transcription factor that is essential for the development of regulatory T cells, a branch of T cells that suppress excessive inflammation and autoimmunity1-5. However, the molecular mechanisms of FOXP3 remain unclear. Here we show that FOXP3 uses the forkhead domain, a DNA-binding domain that is commonly thought to function as a monomer or dimer, to form a higher-order multimer after binding to TnG repeat microsatellites. The cryo-electron microscopy structure of FOXP3 in a complex with T3G repeats reveals a ladder-like architecture, whereby two double-stranded DNA molecules form the two 'side rails' bridged by five pairs of FOXP3 molecules, with each pair forming a 'rung'. Each FOXP3 subunit occupies TGTTTGT within the repeats in a manner that is indistinguishable from that of FOXP3 bound to the forkhead consensus motif (TGTTTAC). Mutations in the intra-rung interface impair TnG repeat recognition, DNA bridging and the cellular functions of FOXP3, all without affecting binding to the forkhead consensus motif. FOXP3 can tolerate variable inter-rung spacings, explaining its broad specificity for TnG-repeat-like sequences in vivo and in vitro. Both FOXP3 orthologues and paralogues show similar TnG repeat recognition and DNA bridging. These findings therefore reveal a mode of DNA recognition that involves transcription factor homomultimerization and DNA bridging, and further implicate microsatellites in transcriptional regulation and diseases.


Subject(s)
DNA , Forkhead Transcription Factors , Microsatellite Repeats , Base Sequence , Consensus Sequence , Cryoelectron Microscopy , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA/ultrastructure , Forkhead Transcription Factors/chemistry , Forkhead Transcription Factors/metabolism , Forkhead Transcription Factors/ultrastructure , Microsatellite Repeats/genetics , Mutation , Nucleotide Motifs , Protein Domains , Protein Multimerization , T-Lymphocytes, Regulatory/metabolism
20.
Sci Rep ; 13(1): 15767, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37737281

ABSTRACT

Gloeocapsopsis dulcis strain AAB1 is an extremely xerotolerant cyanobacterium isolated from the Atacama Desert (i.e., the driest and oldest desert on Earth) that holds astrobiological significance due to its ability to biosynthesize compatible solutes at ultra-low water activities. We sequenced and assembled the G. dulcis genome de novo using a combination of long- and short-read sequencing, which resulted in high-quality consensus sequences of the chromosome and two plasmids. We leveraged the G. dulcis genome to generate a genome-scale metabolic model (iGd895) to simulate growth in silico. iGd895 represents, to our knowledge, the first genome-scale metabolic reconstruction developed for an extremely xerotolerant cyanobacterium. The model's predictive capability was assessed by comparing the in silico growth rate with in vitro growth rates of G. dulcis, in addition to the synthesis of trehalose. iGd895 allowed us to explore simulations of key metabolic processes such as essential pathways for water-stress tolerance, and significant alterations to reaction flux distribution and metabolic network reorganization resulting from water limitation. Our study provides insights into the potential metabolic strategies employed by G. dulcis, emphasizing the crucial roles of compatible solutes, metabolic water, energy conservation, and the precise regulation of reaction rates in their adaptation to water stress.


Subject(s)
Brassicaceae , Cyanobacteria , Desiccation , Cyanobacteria/genetics , Metabolic Networks and Pathways , Consensus Sequence , Dehydration