Results 1 - 20 of 10,262
1.
Nature; 624(7991): 355-365, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38092919

ABSTRACT

Single-cell analyses parse the brain's billions of neurons into thousands of 'cell-type' clusters residing in different brain structures1. Many cell types mediate their functions through targeted long-distance projections allowing interactions between specific cell types. Here we used epi-retro-seq2 to link single-cell epigenomes and cell types to long-distance projections for 33,034 neurons dissected from 32 different regions projecting to 24 different targets (225 source-to-target combinations) across the whole mouse brain. We highlight uses of these data for interrogating principles relating projection types to transcriptomics and epigenomics, and for addressing hypotheses about cell types and connections related to genetics. We provide an overall synthesis with 926 statistical comparisons of discriminability of neurons projecting to each target for every source. We integrate this dataset into the larger BRAIN Initiative Cell Census Network atlas, composed of millions of neurons, to link projection cell types to consensus clusters. Integration with spatial transcriptomics further assigns projection-enriched clusters to smaller source regions than the original dissections. We exemplify this by presenting in-depth analyses of projection neurons from the hypothalamus, thalamus, hindbrain, amygdala and midbrain to provide insights into properties of those cell types, including differentially expressed genes, their associated cis-regulatory elements and transcription-factor-binding motifs, and neurotransmitter use.


Subjects
Brain; Epigenomics; Neural Pathways; Neurons; Animals; Mice; Amygdala; Brain/cytology; Brain/metabolism; Consensus Sequence; Datasets as Topic; Gene Expression Profiling; Hypothalamus/cytology; Mesencephalon/cytology; Neural Pathways/cytology; Neurons/metabolism; Neurotransmitter Agents/metabolism; Regulatory Sequences, Nucleic Acid; Rhombencephalon/cytology; Single-Cell Analysis; Thalamus/cytology; Transcription Factors/metabolism
2.
Nature; 624(7991): 433-441, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38030726

ABSTRACT

FOXP3 is a transcription factor that is essential for the development of regulatory T cells, a branch of T cells that suppress excessive inflammation and autoimmunity1-5. However, the molecular mechanisms of FOXP3 remain unclear. Here we show that FOXP3 uses the forkhead domain (a DNA-binding domain that is commonly thought to function as a monomer or dimer) to form a higher-order multimer after binding to TnG repeat microsatellites. The cryo-electron microscopy structure of FOXP3 in a complex with T3G repeats reveals a ladder-like architecture, whereby two double-stranded DNA molecules form the two 'side rails' bridged by five pairs of FOXP3 molecules, with each pair forming a 'rung'. Each FOXP3 subunit occupies TGTTTGT within the repeats in a manner that is indistinguishable from that of FOXP3 bound to the forkhead consensus motif (TGTTTAC). Mutations in the intra-rung interface impair TnG repeat recognition, DNA bridging and the cellular functions of FOXP3, all without affecting binding to the forkhead consensus motif. FOXP3 can tolerate variable inter-rung spacings, explaining its broad specificity for TnG-repeat-like sequences in vivo and in vitro. Both FOXP3 orthologues and paralogues show similar TnG repeat recognition and DNA bridging. These findings therefore reveal a mode of DNA recognition that involves transcription factor homomultimerization and DNA bridging, and further implicate microsatellites in transcriptional regulation and diseases.
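
As a plain string-level illustration of the sequence statement in this abstract (not a model of binding), the sketch below builds an idealized T3G repeat and locates every overlapping occurrence of TGTTTGT. The function name `find_overlapping` and the choice of five repeat units are assumptions made for the example.

```python
def find_overlapping(sequence: str, motif: str) -> list[int]:
    """Return 0-based start indices of all (possibly overlapping) motif hits."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        # Advance by one base so overlapping occurrences are not skipped.
        start = sequence.find(motif, start + 1)
    return hits


# Idealized T3G repeat: five copies of the unit TTTG (an assumed length).
t3g_repeat = "TTTG" * 5
sites = find_overlapping(t3g_repeat, "TGTTTGT")
print(sites)
```

In this idealized repeat the TGTTTGT sites recur every four bases, one per TTTG unit, whereas the forkhead consensus motif TGTTTAC does not occur in it at all.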


Subjects
DNA; Forkhead Transcription Factors; Microsatellite Repeats; Base Sequence; Consensus Sequence; Cryoelectron Microscopy; DNA/chemistry; DNA/genetics; DNA/metabolism; DNA/ultrastructure; Forkhead Transcription Factors/chemistry; Forkhead Transcription Factors/metabolism; Forkhead Transcription Factors/ultrastructure; Microsatellite Repeats/genetics; Mutation; Nucleotide Motifs; Protein Domains; Protein Multimerization; T-Lymphocytes, Regulatory/metabolism
3.
Nature; 600(7887): 164-169, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34789875

ABSTRACT

In the clades of animals that diverged from the bony fish, a group of Mas-related G-protein-coupled receptors (MRGPRs) evolved that have an active role in itch and allergic signals1,2. As an MRGPR, MRGPRX2 is known to sense basic secretagogues (agents that promote secretion) and is involved in itch signals and eliciting pseudoallergic reactions3-6. MRGPRX2 has been targeted by drug development efforts to prevent the side effects induced by certain drugs or to treat allergic diseases. Here we report a set of cryo-electron microscopy structures of the MRGPRX2-Gi1 trimer in complex with polycationic compound 48/80 or with inflammatory peptides. The structures of the MRGPRX2-Gi1 complex exhibited shallow, solvent-exposed ligand-binding pockets. We identified key common structural features of MRGPRX2 and describe a consensus motif for peptidic allergens. Beneath the ligand-binding pocket, the unusual kink formation at transmembrane domain 6 (TM6) and the replacement of the general toggle switch from Trp6.48 to Gly6.48 (superscript annotations as per Ballesteros-Weinstein nomenclature) suggest a distinct activation process. We characterized the interfaces of MRGPRX2 and the Gi trimer, and mapped the residues associated with key single-nucleotide polymorphisms on both the ligand and G-protein interfaces of MRGPRX2. Collectively, our results provide a structural basis for the sensing of cationic allergens by MRGPRX2, potentially facilitating the rational design of therapies to prevent unwanted pseudoallergic reactions.


Subjects
Nerve Tissue Proteins/chemistry; Nerve Tissue Proteins/metabolism; Pruritus/metabolism; Receptors, G-Protein-Coupled/chemistry; Receptors, G-Protein-Coupled/metabolism; Receptors, Neuropeptide/chemistry; Receptors, Neuropeptide/metabolism; Allergens/immunology; Amino Acid Motifs; Amino Acid Sequence; Binding Sites; Consensus Sequence; Cryoelectron Microscopy; GTP-Binding Protein alpha Subunits, Gi-Go/metabolism; GTP-Binding Protein alpha Subunits, Gq-G11/metabolism; Humans; Models, Molecular; Nerve Tissue Proteins/immunology; Nerve Tissue Proteins/ultrastructure; Receptors, G-Protein-Coupled/immunology; Receptors, G-Protein-Coupled/ultrastructure; Receptors, Neuropeptide/immunology; Receptors, Neuropeptide/ultrastructure
4.
Mol Cell; 73(6): 1232-1242.e4, 2019 Mar 21.
Article in English | MEDLINE | ID: mdl-30765194

ABSTRACT

The C-terminal domain (CTD) of RNA polymerase II (Pol II) is composed of repeats of the consensus YSPTSPS and is an essential binding scaffold for transcription-associated factors. Metazoan CTDs have well-conserved lengths and sequence compositions arising from the evolution of divergent motifs, features thought to be essential for development. On the contrary, we show that a truncated CTD composed solely of YSPTSPS repeats supports Drosophila viability but that a CTD with enough YSPTSPS repeats to match the length of the wild-type Drosophila CTD is defective. Furthermore, a fluorescently tagged CTD lacking the rest of Pol II dynamically enters transcription compartments, indicating that the CTD functions as a signal sequence. However, CTDs with too many YSPTSPS repeats are more prone to localize to static nuclear foci separate from the chromosomes. We propose that the sequence complexity of the CTD offsets aberrant behavior caused by excessive repetitive sequences without compromising its targeting function.


Subjects
Amino Acid Motifs; Consensus Sequence; Drosophila Proteins/metabolism; Drosophila melanogaster/enzymology; RNA Polymerase II/metabolism; Repetitive Sequences, Amino Acid; Salivary Glands/enzymology; Animals; Animals, Genetically Modified; Drosophila Proteins/chemistry; Drosophila Proteins/genetics; Drosophila melanogaster/embryology; Drosophila melanogaster/genetics; Gene Expression Regulation, Developmental; Mutation; Protein Domains; RNA Polymerase II/chemistry; RNA Polymerase II/genetics; Salivary Glands/embryology; Transcription, Genetic; Transcriptional Activation
5.
Proc Natl Acad Sci U S A; 121(3): e2312029121, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38194446

ABSTRACT

Understanding natural protein evolution and designing novel proteins are motivating interest in development of high-throughput methods to explore large sequence spaces. In this work, we demonstrate the application of multisite λ dynamics (MSλD), a rigorous free energy simulation method, and chemical denaturation experiments to quantify evolutionary selection pressure from sequence-stability relationships and to address questions of design. This study examines a mesophilic phylogenetic clade of ribonuclease H (RNase H), furthering its extensive characterization in earlier studies, focusing on E. coli RNase H (ecRNH) and a more stable consensus sequence (AncCcons) differing at 15 positions. The stabilities of 32,768 chimeras between these two sequences were computed using the MSλD framework. The most stable and least stable chimeras were predicted and tested along with several other sequences, revealing a designed chimera with approximately the same stability increase as AncCcons, but requiring only half the mutations. Comparing the computed stabilities with experiment for 12 sequences reveals a Pearson correlation of 0.86 and root mean squared error of 1.18 kcal/mol, an unprecedented level of accuracy well beyond less rigorous computational design methods. We then quantified selection pressure using a simple evolutionary model in which sequences are selected according to the Boltzmann factor of their stability. Selection temperatures from 110 to 168 K are estimated in three ways by comparing experimental and computational results to evolutionary models. These estimates indicate selection pressure is high, which has implications for evolutionary dynamics and for the accuracy required for design, and suggests accurate high-throughput computational methods like MSλD may enable more effective protein design.
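
Two quantities in this abstract can be made concrete with a short sketch. It is not the MSλD workflow; it only shows why 15 differing positions give 32,768 chimeras, and what selecting sequences "according to the Boltzmann factor of their stability" can look like numerically. The sign convention (more negative folding free energy means more stable), the function names, and the example free energies are assumptions for illustration.

```python
import math

# Gas constant in kcal/(mol*K), matching the kcal/mol units used in the abstract.
R_KCAL = 0.0019872


def count_chimeras(n_sites: int) -> int:
    """Each differing site takes the residue of one of two parents: 2**n options."""
    return 2 ** n_sites


def boltzmann_weights(folding_dg: list[float], t_sel: float) -> list[float]:
    """Normalized selection weights proportional to exp(-dG / (R * T_sel)).

    folding_dg: hypothetical folding free energies in kcal/mol
                (assumed convention: more negative means more stable).
    t_sel:      selection temperature in kelvin.
    """
    lowest = min(folding_dg)
    # Shift by the minimum so every exponent is <= 0 and cannot overflow.
    raw = [math.exp(-(dg - lowest) / (R_KCAL * t_sel)) for dg in folding_dg]
    total = sum(raw)
    return [w / total for w in raw]


print(count_chimeras(15))
example_dg = [-10.0, -9.0, -8.0]  # made-up values, not data from the study
cold = boltzmann_weights(example_dg, 110.0)
warm = boltzmann_weights(example_dg, 168.0)
print(cold, warm)
```

In this toy model, a lower selection temperature concentrates more of the weight on the most stable sequence.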


Subjects
Escherichia coli; Ribonuclease H; Escherichia coli/genetics; Phylogeny; Computer Simulation; Consensus Sequence; Ribonuclease H/genetics
6.
Brief Bioinform; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38920083

ABSTRACT

This study proposes a novel approach to studying severe acute respiratory syndrome coronavirus 2 virus mutations through sequencing data comparison. Traditional consensus-based methods, which focus on the most common nucleotide at each position, might overlook or obscure the presence of low-frequency variants. Our method, in contrast, retains all sequenced nucleotides at each position, forming a genomic matrix. Utilizing simulated short reads from genomes with specified mutations, we contrasted our genomic matrix approach with the consensus sequence method. Our matrix methodology, across multiple simulated datasets, accurately reflected the known mutations with an average accuracy improvement of 20% over the consensus method. In real-world tests using data from GISAID and NCBI-SRA, our approach demonstrated an increase in reliability by reducing the error margin by approximately 15%. The genomic matrix approach offers a more accurate representation of the viral genomic diversity, thereby providing superior insights into virus evolution and epidemiology.
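
The contrast drawn in this abstract can be illustrated with a toy example. The sketch below is not the authors' implementation; it assumes reads that are already aligned and of equal length, and the function names are made up for the example. It keeps a per-position count of every observed nucleotide (a simple stand-in for the "genomic matrix") and derives the majority-rule consensus from it.

```python
from collections import Counter


def count_matrix(aligned_reads: list[str]) -> list[Counter]:
    """Per-position nucleotide counts for aligned, equal-length reads."""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    return [Counter(read[i] for read in aligned_reads) for i in range(length)]


def consensus(matrix: list[Counter]) -> str:
    """Most common base per position; ties go to the alphabetically first base."""
    return "".join(max(sorted(col), key=col.__getitem__) for col in matrix)


def minor_variants(matrix: list[Counter]) -> list[tuple[int, str, float]]:
    """(position, base, frequency) for every non-consensus base observed."""
    cons = consensus(matrix)
    found = []
    for pos, col in enumerate(matrix):
        depth = sum(col.values())
        for base in sorted(col):
            if base != cons[pos]:
                found.append((pos, base, col[base] / depth))
    return found


# Nine reads carry G at position 2; one read carries a low-frequency T.
reads = ["ACGT"] * 9 + ["ACTT"]
matrix = count_matrix(reads)
print(consensus(matrix))
print(minor_variants(matrix))
```

The consensus string alone reports only G at position 2, while the count matrix still records the T seen in one of ten reads.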


Subjects
COVID-19; Genome, Viral; Phylogeny; SARS-CoV-2; SARS-CoV-2/genetics; Humans; COVID-19/virology; COVID-19/epidemiology; Mutation; Consensus Sequence; Genetic Variation
7.
Nature; 583(7818): 729-736, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32728250

ABSTRACT

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits1, but it remains challenging to distinguish variants that affect regulatory function2. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin3-6. However, only a small fraction of such sites have been precisely resolved on the human genome sequence6. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions1,7 is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


Subjects
DNA Footprinting/standards; Genome, Human/genetics; Transcription Factors/metabolism; Consensus Sequence; DNA/genetics; DNA/metabolism; Deoxyribonuclease I/metabolism; Genetics, Population; Genome-Wide Association Study; Humans; Models, Molecular; Polymorphism, Single Nucleotide; Regulatory Sequences, Nucleic Acid/genetics
8.
Nature; 585(7825): 459-463, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32908305

ABSTRACT

The RNA polymerase II (Pol II) core promoter is the strategic site of convergence of the signals that lead to the initiation of DNA transcription1-5, but the downstream core promoter in humans has been difficult to understand1-3. Here we analyse the human Pol II core promoter and use machine learning to generate predictive models for the downstream core promoter region (DPR) and the TATA box. We developed a method termed HARPE (high-throughput analysis of randomized promoter elements) to create hundreds of thousands of DPR (or TATA box) variants, each with known transcriptional strength. We then analysed the HARPE data by support vector regression (SVR) to provide comprehensive models for the sequence motifs, and found that the SVR-based approach is more effective than a consensus-based method for predicting transcriptional activity. These results show that the DPR is a functionally important core promoter element that is widely used in human promoters. Notably, there appears to be a duality between the DPR and the TATA box, as many promoters contain one or the other element. More broadly, these findings show that functional DNA motifs can be identified by machine learning analysis of a comprehensive set of sequence variants.


Subjects
Consensus Sequence/genetics; Gene Expression Regulation/genetics; Promoter Regions, Genetic/genetics; RNA Polymerase II/metabolism; Support Vector Machine; Transcription, Genetic; Base Sequence; Cells/metabolism; Computer Simulation; Datasets as Topic; HeLa Cells; High-Throughput Nucleotide Sequencing; Humans; Models, Genetic; Mutagenesis; TATA Box/genetics
9.
Nature; 580(7802): 269-273, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32106218

ABSTRACT

Various species of the intestinal microbiota have been associated with the development of colorectal cancer1,2, but it has not been demonstrated that bacteria have a direct role in the occurrence of oncogenic mutations. Escherichia coli can carry the pathogenicity island pks, which encodes a set of enzymes that synthesize colibactin3. This compound is believed to alkylate DNA on adenine residues4,5 and induces double-strand breaks in cultured cells3. Here we expose human intestinal organoids to genotoxic pks+ E. coli by repeated luminal injection over five months. Whole-genome sequencing of clonal organoids before and after this exposure revealed a distinct mutational signature that was absent from organoids injected with isogenic pks-mutant bacteria. The same mutational signature was detected in a subset of 5,876 human cancer genomes from two independent cohorts, predominantly in colorectal cancer. Our study describes a distinct mutational signature in colorectal cancer and implies that the underlying mutational process results directly from past exposure to bacteria carrying the colibactin-producing pks pathogenicity island.


Subjects
Colorectal Neoplasms/genetics; Colorectal Neoplasms/microbiology; Escherichia coli/genetics; Escherichia coli/pathogenicity; Genomic Islands/genetics; Mutagenesis; Mutation; Coculture Techniques; Cohort Studies; Consensus Sequence; DNA Damage; Gastrointestinal Microbiome; Humans; Organoids/cytology; Organoids/metabolism; Organoids/microbiology; Peptides/genetics; Polyketides
10.
Nature; 578(7794): 311-316, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31996847

ABSTRACT

PIWI-interacting RNAs (piRNAs) of between approximately 24 and 31 nucleotides in length guide PIWI proteins to silence transposons in animal gonads, thereby ensuring fertility1. In the biogenesis of piRNAs, PIWI proteins are first loaded with 5'-monophosphorylated RNA fragments called pre-pre-piRNAs, which then undergo endonucleolytic cleavage to produce pre-piRNAs1,2. Subsequently, the 3'-ends of pre-piRNAs are trimmed by the exonuclease Trimmer (PNLDC1 in mouse)3-6 and 2'-O-methylated by the methyltransferase Hen1 (HENMT1 in mouse)7-9, generating mature piRNAs. It is assumed that the endonuclease Zucchini (MitoPLD in mouse) is a major enzyme catalysing the cleavage of pre-pre-piRNAs into pre-piRNAs10-13. However, direct evidence for this model is lacking, and how pre-piRNAs are generated remains unclear. Here, to analyse pre-piRNA production, we established a Trimmer-knockout silkworm cell line and derived a cell-free system that faithfully recapitulates Zucchini-mediated cleavage of PIWI-loaded pre-pre-piRNAs. We found that pre-piRNAs are generated by parallel Zucchini-dependent and -independent mechanisms. Cleavage by Zucchini occurs at previously unrecognized consensus motifs on pre-pre-piRNAs, requires the RNA helicase Armitage, and is accompanied by 2'-O-methylation of pre-piRNAs. By contrast, slicing of pre-pre-piRNAs with weak Zucchini motifs is achieved by downstream complementary piRNAs, producing pre-piRNAs without 2'-O-methylation. Regardless of the endonucleolytic mechanism, pre-piRNAs are matured by Trimmer and Hen1. Our findings highlight multiplexed processing of piRNA precursors that supports robust and flexible piRNA biogenesis.


Subjects
Amino Acid Motifs; Consensus Sequence; Insect Proteins/chemistry; Insect Proteins/metabolism; Mitochondrial Proteins/chemistry; Mitochondrial Proteins/metabolism; Phospholipase D/chemistry; Phospholipase D/metabolism; RNA, Small Interfering/biosynthesis; Adenosine Triphosphate/metabolism; Animals; Base Sequence; Bombyx; Cell Line; Cell-Free System; Gene Knockout Techniques; Insect Proteins/genetics; Methylation; Mice; RNA Helicases/metabolism
11.
Mol Cell; 72(3): 482-495.e7, 2018 Nov 01.
Article in English | MEDLINE | ID: mdl-30388410

ABSTRACT

Productive splicing of human precursor messenger RNAs (pre-mRNAs) requires the correct selection of authentic splice sites (SS) from the large pool of potential SS. Although SS consensus sequence and splicing regulatory proteins are known to influence SS usage, the mechanisms ensuring the effective suppression of cryptic SS are insufficiently explored. Here, we find that many aberrant exonic SS are efficiently silenced by the exon junction complex (EJC), a multi-protein complex that is deposited on spliced mRNA near the exon-exon junction. Upon depletion of EJC proteins, cryptic SS are de-repressed, leading to the mis-splicing of a broad set of mRNAs. Mechanistically, the EJC-mediated recruitment of the splicing regulator RNPS1 inhibits cryptic 5'SS usage, while the deposition of the EJC core directly masks reconstituted 3'SS, thereby precluding transcript disintegration. Thus, the EJC protects the transcriptome of mammalian cells from inadvertent loss of exonic sequences and safeguards the expression of intact, full-length mRNAs.


Subjects
Alternative Splicing/physiology; Exons/physiology; RNA Splice Sites/physiology; Consensus Sequence/genetics; DEAD-box RNA Helicases/metabolism; Eukaryotic Initiation Factor-4A/metabolism; HeLa Cells; Humans; Introns; RNA Precursors/physiology; RNA Splicing/physiology; RNA, Messenger/genetics; RNA-Binding Proteins/metabolism; Ribonucleoproteins/metabolism; Transcriptome/genetics
12.
Nucleic Acids Res; 52(15): 9247-9266, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-38943346

ABSTRACT

Classification of introns, which is crucial to understanding their evolution and splicing, has historically been binary and has resulted in the naming of major and minor introns that are spliced by their namesake spliceosome. However, a broad range of intron consensus sequences exist, leading us to here reclassify introns as minor, minor-like, hybrid, major-like, major and non-canonical introns in 263 species across six eukaryotic supergroups. Through intron orthology analysis, we discovered that minor-like introns are a transitory node for intron conversion across evolution. Despite close resemblance of their consensus sequences to minor introns, these introns possess an AG dinucleotide at the -1 and -2 position of the 5' splice site, a salient feature of major introns. Through combined analysis of CoLa-seq, CLIP-seq for major and minor spliceosome components, and RNAseq from samples in which the minor spliceosome is inhibited we found that minor-like introns are also an intermediate class from a splicing mechanism perspective. Importantly, this analysis has provided insight into the sequence elements that have evolved to make minor-like introns amenable to recognition by both minor and major spliceosome components. We hope that this revised intron classification provides a new framework to study intron evolution and splicing.


Subjects
Evolution, Molecular; Introns; RNA Splicing; Spliceosomes; Introns/genetics; Spliceosomes/genetics; Humans; RNA Splice Sites; Animals; Consensus Sequence; Eukaryota/genetics; Eukaryota/classification; Base Sequence
13.
Proc Natl Acad Sci U S A; 120(29): e2220762120, 2023 Jul 18.
Article in English | MEDLINE | ID: mdl-37432995

ABSTRACT

Large datasets contribute new insights to subjects formerly investigated by exemplars. We used coevolution data to create a large, high-quality database of transmembrane β-barrels (TMBB). By applying simple feature detection on generated evolutionary contact maps, our method (IsItABarrel) achieves 95.88% balanced accuracy when discriminating among protein classes. Moreover, comparison with IsItABarrel revealed a high rate of false positives in previous TMBB algorithms. In addition to being more accurate than previous datasets, our database (available online) contains 1,938,936 bacterial TMBB proteins from 38 phyla, respectively, 17 and 2.2 times larger than the previous sets TMBB-DB and OMPdb. We anticipate that due to its quality and size, the database will serve as a useful resource where high-quality TMBB sequence data are required. We found that TMBBs can be divided into 11 types, three of which have not been previously reported. We find tremendous variance in proteome percentage among TMBB-containing organisms with some using 6.79% of their proteome for TMBBs and others using as little as 0.27% of their proteome. The distribution of the lengths of the TMBBs is suggestive of previously hypothesized duplication events. In addition, we find that the C-terminal β-signal varies among different classes of bacteria though its consensus sequence is LGLGYRF. However, this β-signal is only characteristic of prototypical TMBBs. The ten non-prototypical barrel types have other C-terminal motifs, and it remains to be determined if these alternative motifs facilitate TMBB insertion or perform any other signaling function.
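
For readers unfamiliar with the metric quoted in this abstract, balanced accuracy is the mean of the per-class recalls, so that a rare class counts as much as a common one. The sketch below uses made-up labels, not IsItABarrel output, and the function name is an assumption.

```python
def balanced_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Mean recall over the classes that appear in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the examples whose true label is this class.
        idx = [i for i, label in enumerate(y_true) if label == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: four barrels, two non-barrels.
truth = ["barrel"] * 4 + ["other"] * 2
guess = ["barrel", "barrel", "barrel", "other", "other", "other"]
print(balanced_accuracy(truth, guess))
```

Here the recall is 3/4 for the "barrel" class and 2/2 for the "other" class, giving a balanced accuracy of 0.875, whereas plain accuracy would be 5/6.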


Subjects
Algorithms; Proteome; Humans; Bacterial Proteins/genetics; Biological Evolution; Consensus Sequence
14.
Genes Dev; 31(1): 1-2, 2017 Jan 01.
Article in English | MEDLINE | ID: mdl-28130343

ABSTRACT

Transcription by RNA polymerase II (Pol II) is dictated in part by core promoter elements, which are DNA sequences flanking the transcription start site (TSS) that help direct the proper initiation of transcription. Taking advantage of recent advances in genome-wide sequencing approaches, Vo ngoc and colleagues (pp. 6-11) identified transcripts with focused sites of initiation and found that many were transcribed from promoters containing a new consensus sequence for the human initiator (Inr) core promoter element.


Subjects
Promoter Regions, Genetic; Transcription Initiation Site; Base Sequence; Consensus Sequence; Humans; RNA Polymerase II/genetics; TATA Box; Transcription, Genetic
15.
Biochemistry; 63(3): 348-354, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38206322

ABSTRACT

Proteins' extraordinary performance in recognition and catalysis has led to their use in a range of applications. However, proteins obtained from natural sources are oftentimes not suitable for direct use in industrial or diagnostic setups. Natural proteins, evolved to optimally perform a task in physiological conditions, usually lack the stability required to be used in harsher conditions. Therefore, the alteration of the stability of proteins is commonly pursued in protein engineering studies. Here, we achieved a substantial thermal stabilization of a bacterial Zn(II)-dependent phospholipase C by consensus sequence design. We retrieved and analyzed sequenced homologues from different sources, selecting a subset of examples for expression and characterization. A non-natural consensus sequence showed the highest stability and activity among those tested. Comparison of the stability parameters of this stabilized mutant and other natural variants bearing similar mutations allows us to pinpoint the sites most likely to be responsible for the enhancement. Point mutations in these sites alter the unfolding process of the consensus sequence. We show that the stabilized version of the protein retains full activity even in harsh oil degumming conditions, making it suitable for industrial applications.


Subjects
Proteins; Zinc; Amino Acid Sequence; Proteins/metabolism; Mutation; Consensus Sequence
16.
BMC Genomics; 25(1): 109, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267856

ABSTRACT

BACKGROUND: Despite the many cheap and fast ways to generate genomic data, accurate genome assembly is still a problem, with repeats in particular being vastly underrepresented and often misassembled. As short reads in low coverage are already sufficient to represent the repeat landscape of any given genome, many read-clustering algorithms have been put forward that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensuses be derived from unassembled genomes? RESULTS: Here, we combine methods from repeat identification and genome assembly to derive these robust consensuses. We test several use cases, such as (1) consensus building from clustered short reads of non-model genomes, (2) from genome-wide amplification setups, and (3) specific repeat-centred questions, such as the linked vs. unlinked arrangement of ribosomal genes. In all our use cases, the derived consensuses are robust and representative. To evaluate overall performance, we compare our high-fidelity repeat consensuses to RepeatExplorer2-derived contigs and check whether they represent real transposable elements as found in long reads. Our results demonstrate that it is possible to generate useful, reliable and trustworthy consensuses from short reads by combining read-clustering and genome assembly methods in an automatable way. CONCLUSION: We anticipate that our workflow opens the way towards more efficient and less manual repeat characterization and annotation, benefitting all genome studies, but especially those of non-model organisms.


Subjects
Algorithms; DNA Transposable Elements; Consensus Sequence; Cluster Analysis; Genomics
17.
Vet Res; 55(1): 28, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38449049

ABSTRACT

The prevalence of porcine reproductive and respiratory syndrome virus 1 (PRRSV1) isolates has continued to increase in Chinese swine herds in recent years. However, no effective control strategy is available for PRRSV1 infection in China. In this study, we generated the first infectious cDNA clone (rHLJB1) of a Chinese PRRSV1 isolate and subsequently used it as a backbone to construct an ORF2-6 chimeric virus (ORF2-6-CON). This virus contained a synthesized consensus sequence of the PRRSV1 ORF2-6 gene encoding all the envelope proteins. The ORF2-6 consensus sequence shared > 90% nucleotide similarity with four representative strains (Amervac, BJEU06-1, HKEU16 and NMEU09-1) of PRRSV1 in China. ORF2-6-CON had replication efficacy similar to that of the backbone rHLJB1 virus in primary alveolar macrophages (PAMs) and exhibited cell tropism in Marc-145 cells. Piglet inoculation and challenge studies indicated that ORF2-6-CON is not pathogenic to piglets and can induce enhanced cross-protection against a heterologous SD1291 isolate. Notably, ORF2-6-CON inoculation induced higher levels of heterologous neutralizing antibodies (nAbs) against SD1291 than rHLJB1 inoculation, which was concurrent with a higher percentage of T follicular helper (Tfh) cells in tracheobronchial lymph nodes (TBLNs), providing the first clue that porcine Tfh cells are correlated with heterologous PRRSV nAb responses. The number of SD1291-strain-specific IFNγ-secreting cells was similar in ORF2-6-CON-inoculated and rHLJB1-inoculated pigs. Overall, our findings support that the Marc-145-adapted ORF2-6-CON can trigger Tfh cell and heterologous nAb responses to confer improved cross-protection and may serve as a candidate strain for the development of a cross-protective PRRSV1 vaccine.


Subjects
Porcine respiratory and reproductive syndrome virus; Animals; Swine; Porcine respiratory and reproductive syndrome virus/genetics; T Follicular Helper Cells; Antibodies, Neutralizing; China; Consensus Sequence
18.
Nucleic Acids Res; 50(D1): D371-D379, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34761274

ABSTRACT

Previous studies on enhancers and their target genes were largely based on bulk samples that represent 'average' regulatory activities from a large population of millions of cells, masking the heterogeneity and important effects from the sub-populations. In recent years, single-cell sequencing technology has enabled the profiling of open chromatin accessibility at the single-cell level (scATAC-seq), which can be used to annotate the enhancers and promoters in specific cell types. A comprehensive resource is highly desirable for exploring how the enhancers regulate the target genes at the single-cell level. Hence, we designed a single-cell database scEnhancer (http://enhanceratlas.net/scenhancer/), covering 14 527 776 enhancers and 63 658 600 enhancer-gene interactions from 1 196 906 single cells across 775 tissue/cell types in three species. An unsupervised learning method was employed to sort and combine tens or hundreds of single cells in each tissue/cell type to obtain the consensus enhancers. In addition, we utilized a cis-regulatory network algorithm to identify the enhancer-gene connections. Finally, we provided a user-friendly platform with seven useful modules to search, visualize, and browse the enhancers/genes. This database will facilitate the research community towards a functional analysis of enhancers at the single-cell level.


Subjects
Databases, Genetic , Enhancer Elements, Genetic , Single-Cell Analysis/methods , Software , Unsupervised Machine Learning , Animals , Cell Lineage/genetics , Chromatin/chemistry , Chromatin/metabolism , Consensus Sequence , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Genetic Heterogeneity , Humans , Internet , Mice , Molecular Sequence Annotation , Organ Specificity , Promoter Regions, Genetic
19.
Int J Mol Sci ; 25(3)2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38338947

ABSTRACT

The extended cleavage specificities of two hematopoietic serine proteases originating from the ray-finned fish, the spotted gar (Lepisosteus oculatus), have been characterized using substrate phage display. The preference for particular amino acids at and surrounding the cleavage site was further validated using a panel of recombinant substrates. For one of the enzymes, the gar granzyme G, a strict preference for the aromatic amino acid Tyr was observed at the cleavable P1 position. A set of recombinant substrates showed that gar granzyme G had a high selectivity for Tyr, cleaved after Phe with lower activity, and did not cleave after Trp. In contrast, the second enzyme, gar DDN1, showed a high preference for Leu in the P1 position of substrates. This latter enzyme also showed a high preference for Pro in the P2 position and Arg in both P4 and P5 positions. The selectivity for the two Arg residues in positions P4 and P5 suggests a highly specific substrate selectivity of this enzyme. The screening of the gar proteome with the consensus sequences obtained by substrate phage display for these two proteases resulted in a very diverse set of potential targets. Due to this diversity, a clear candidate for a specific immune function of these two enzymes cannot yet be identified. Antisera developed against the recombinant gar enzymes were used to study their tissue distribution. Tissue sections from juvenile fish showed the expression of both proteases in cells in Peyer's patch-like structures in the intestinal region, indicating they may be expressed in T or NK cells. However, due to the lack of antibodies to specific surface markers in the gar, it has not been possible to specify the exact cellular origin. A marked difference in abundance was observed for the two proteases, with gar DDN1 expressed at higher levels than gar granzyme G. However, both appear to be expressed in the same or similar cells, having a lymphocyte-like appearance.


Subjects
Fishes , Serine Proteases , Animals , Serine Proteases/genetics , Granzymes , Endopeptidases , Consensus Sequence , Substrate Specificity
20.
J Biol Chem ; 298(8): 102129, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35700824

ABSTRACT

Epidermal growth factor-like domains (EGFDs) have important functions in cell-cell signaling. Both secreted and cell surface human EGFDs are subject to extensive modifications, including aspartate and asparagine residue C3-hydroxylations catalyzed by the 2-oxoglutarate oxygenase aspartate/asparagine-β-hydroxylase (AspH). Although genetic studies show AspH is important in human biology, studies on its physiological roles have been limited by incomplete knowledge of its substrates. Here, we redefine the consensus sequence requirements for AspH-catalyzed EGFD hydroxylation based on combined analysis of proteomic mass spectrometric data and mass spectrometry-based assays with isolated AspH and peptide substrates. We provide cellular and biochemical evidence that the preferred site of EGFD hydroxylation is embedded within a disulfide-bridged macrocycle formed of 10 amino acid residues. This definition enabled the identification of previously unassigned hydroxylation sites in three EGFDs of human fibulins as AspH substrates. A non-EGFD-containing protein, lymphocyte antigen-6/plasminogen activator urokinase receptor domain containing protein 6B (LYPD6B), was shown to be a substrate for isolated AspH, but we did not observe evidence for LYPD6B hydroxylation in cells. AspH-catalyzed hydroxylation of fibulins is of particular interest given their important roles in extracellular matrix dynamics. In conclusion, these results lead to a revision of the consensus substrate requirements for AspH and expand the range of observed and potential AspH-catalyzed hydroxylation in cells, which will enable future study of the biological roles of AspH.


Subjects
Consensus Sequence , Epidermal Growth Factor , Proteomics , Antigens, Ly/metabolism , Asparagine/metabolism , Aspartic Acid/metabolism , Epidermal Growth Factor/metabolism , Humans , Hydroxylation