Results 1 - 20 of 30
1.
Nucleic Acids Res ; 52(D1): D426-D433, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37933852

ABSTRACT

The DescribePROT database of amino acid-level descriptors of protein structures and functions has been substantially expanded since its release in 2020. This expansion includes a substantial increase in the size, scope, and quality of the underlying data, the addition of experimental structural information, the inclusion of new data download options, and an upgraded graphical interface. DescribePROT currently covers 19 structural and functional descriptors for proteins in 273 reference proteomes generated by 11 accurate and complementary predictive tools. Users can search our resource in multiple ways, interact with the data using the graphical interface, and download data at various scales including individual proteins, entire proteomes, and the whole database. The annotations in DescribePROT are useful for a broad spectrum of studies that include investigations of protein structure and function, development and validation of predictive tools, and efforts to understand the molecular underpinnings of diseases and to develop therapeutics. DescribePROT can be freely accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.


Subjects
Amino Acids , Proteome , Proteome/chemistry , Databases, Factual
2.
Nucleic Acids Res ; 51(W1): W343-W349, 2023 07 05.
Article in English | MEDLINE | ID: mdl-37178004

ABSTRACT

Predicting protein localization and understanding its mechanisms are critical in biology and pathology. In this context, we propose a new web application of MULocDeep with improved performance, result interpretation, and visualization. By transferring the original model into species-specific models, MULocDeep achieved competitive prediction performance at the subcellular level against other state-of-the-art methods. It uniquely provides a comprehensive localization prediction at the suborganellar level. Besides prediction, our web service quantifies the contribution of single amino acids to localization for individual proteins; for a group of proteins, common motifs or potential targeting-related regions can be derived. Furthermore, the visualizations of targeting mechanism analyses can be downloaded for publication-ready figures. The MULocDeep web service is available at https://www.mu-loc.org/.


Subjects
Proteins , Software , Amino Acids/metabolism , Computational Biology/methods , Protein Transport , Proteins/chemistry , Internet
3.
Nucleic Acids Res ; 50(D1): D333-D339, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34551440

ABSTRACT

Resolving the spatial distribution of the transcriptome at a subcellular level can increase our understanding of biology and diseases. To facilitate studies of biological functions and molecular mechanisms in the transcriptome, we updated RNALocate, a resource for RNA subcellular localization analysis that is freely accessible at http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/. Compared to RNALocate v1.0, the new features in version 2.0 include (i) expansion of the data sources and the coverage of species; (ii) incorporation and integration of RNA-seq datasets containing information about subcellular localization; (iii) addition and reorganization of RNA information (RNA subcellular localization conditions and descriptive figures for method, RNA homology information, RNA interaction and ncRNA disease information) and (iv) three additional prediction tools: DM3Loc, iLoc-lncRNA and iLoc-mRNA. Overall, RNALocate v2.0 provides a comprehensive RNA subcellular localization resource for researchers to deconvolute the highly complex architecture of the cell.


Subjects
Databases, Nucleic Acid , RNA, Untranslated/genetics , Software , Transcriptome , Animals , Base Sequence , Cell Compartmentation , Datasets as Topic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Gene Expression Regulation , Gene Ontology , Humans , Internet , Mice , Molecular Sequence Annotation , RNA, Untranslated/classification , RNA, Untranslated/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Sequence Alignment , Sequence Homology, Nucleic Acid , Subcellular Fractions/chemistry , Subcellular Fractions/metabolism , Zebrafish/genetics , Zebrafish/metabolism
4.
Nucleic Acids Res ; 49(W1): W228-W236, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34037802

ABSTRACT

G2PDeep is an open-access web server, which provides a deep-learning framework for quantitative phenotype prediction and discovery of genomics markers. It uses zygosity or single nucleotide polymorphism (SNP) information from plants and animals as the input to predict quantitative phenotype of interest and genomic markers associated with phenotype. It provides a one-stop-shop platform for researchers to create deep-learning models through an interactive web interface and train these models with uploaded data, using high-performance computing resources plugged at the backend. G2PDeep also provides a series of informative interfaces to monitor the training process and compare the performance among the trained models. The trained models can then be deployed automatically. The quantitative phenotype and genomic markers are predicted using a user-selected trained model and the results are visualized. Our state-of-the-art model has been benchmarked and demonstrated competitive performance in quantitative phenotype predictions by other researchers. In addition, the server integrates the soybean nested association mapping (SoyNAM) dataset with five phenotypes, including grain yield, height, moisture, oil, and protein. A publicly available dataset for seed protein and oil content has also been integrated into the server. The G2PDeep server is publicly available at http://g2pdeep.org. The Python-based deep-learning model is available at https://github.com/shuaizengMU/G2PDeep_model.


Subjects
Genetic Markers , Phenotype , Software , Deep Learning , Genomics , Internet , Polymorphism, Single Nucleotide , Glycine max/genetics
5.
Nucleic Acids Res ; 49(8): e46, 2021 05 07.
Article in English | MEDLINE | ID: mdl-33503258

ABSTRACT

Subcellular localization of messenger RNAs (mRNAs), as a prevalent mechanism, gives precise and efficient control for the translation process. There is mounting evidence for the important roles of this process in a variety of cellular events. Computational methods for mRNA subcellular localization prediction provide a useful approach for studying mRNA functions. However, few computational methods were designed for mRNA subcellular localization prediction and their performance has room for improvement. In particular, there is still no tool available for predicting the localization of mRNAs that have multiple localization annotations. In this paper, we propose a multi-head self-attention method, DM3Loc, for multi-label mRNA subcellular localization prediction. Evaluation results show that DM3Loc outperforms existing methods and tools in general. Furthermore, DM3Loc has the interpretation ability to analyze RNA-binding protein motifs and key signals on mRNAs for subcellular localization. Our analyses found hundreds of instances of mRNA isoform-specific subcellular localizations and many significantly enriched gene functions for mRNAs in different subcellular localizations.


Subjects
Computational Biology/methods , Neural Networks, Computer , RNA, Messenger/metabolism , Subcellular Fractions/metabolism , Cell Membrane/genetics , Cell Membrane/metabolism , Cell Nucleus/genetics , Cell Nucleus/metabolism , Cytosol/metabolism , Databases, Genetic , Databases, Protein , Endoplasmic Reticulum/genetics , Endoplasmic Reticulum/metabolism , Exosomes/genetics , Exosomes/metabolism , Gene Ontology , Humans , Proteomics , RNA, Messenger/genetics , Ribosomes/genetics , Ribosomes/metabolism , Transcriptome/genetics
6.
Molecules ; 28(19)2023 Sep 25.
Article in English | MEDLINE | ID: mdl-37836636

ABSTRACT

Interactions between proteins and ions are essential for various biological functions like structural stability, metabolism, and signal transport. Given that more than half of all proteins bind to ions, it is becoming crucial to identify ion-binding sites. The accurate identification of protein-ion binding sites helps us to understand proteins' biological functions and plays a significant role in drug discovery. While several computational approaches have been proposed, this remains a challenging problem due to the small size and high versatility of metals and acid radicals. In this study, we propose IonPred, a sequence-based approach that employs ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) to predict ion-binding sites using only raw protein sequences. We successfully fine-tuned our pretrained model to predict the binding sites for nine metal ions (Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, and K+) and four acid radical ion ligands (CO32-, SO42-, PO43-, NO2-). IonPred surpassed six current state-of-the-art tools by over 44.65% and 28.46%, respectively, in the F1 score and MCC when compared on an independent test dataset. Our method is more computationally efficient than existing tools, producing prediction results for a hundred sequences for a specific ion in under ten minutes.


Subjects
Metals , Proteins , Ligands , Proteins/chemistry , Binding Sites , Protein Binding , Metals/chemistry , Ions/chemistry
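
Note on the metrics named in the abstract above: the F1 score and the Matthews correlation coefficient (MCC) can both be computed from the four cells of a binary confusion matrix (binding vs. non-binding residues). The following is a minimal sketch of the standard definitions, not the IonPred evaluation code; the function name and the example counts are illustrative assumptions and do not come from the paper.

```python
import math


def f1_and_mcc(tp, fp, tn, fn):
    """Return (F1, MCC) for a binary confusion matrix.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives. Undefined ratios
    (zero denominators) are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Hypothetical per-residue counts for one ion type (illustrative only).
f1, mcc = f1_and_mcc(tp=40, fp=10, tn=930, fn=20)
print(f"F1={f1:.3f}  MCC={mcc:.3f}")
```

Because binding residues are typically a small minority of all residues, MCC is often reported next to F1: it also accounts for true negatives, so a predictor cannot score well simply by favouring one class.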
7.
Nucleic Acids Res ; 48(W1): W140-W146, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32324217

ABSTRACT

MusiteDeep is an online resource providing a deep-learning framework for protein post-translational modification (PTM) site prediction and visualization. The predictor only uses protein sequences as input and no complex features are needed, which results in a real-time prediction for a large number of proteins. It takes less than three minutes to predict for 1000 sequences per PTM type. The output is presented at the amino acid level for the user-selected PTM types. The framework has been benchmarked by other researchers and has demonstrated competitive performance in PTM site prediction. In this webserver, we updated the previous framework by utilizing more advanced ensemble techniques, and providing prediction and visualization for multiple PTMs simultaneously for users to analyze potential PTM cross-talks directly. Besides prediction, users can interactively review the predicted PTM sites in the context of known PTM annotations and protein 3D structures through homology-based search. In addition, the server maintains a local database providing pre-processed PTM annotations from UniProt/Swiss-Prot for users to download. This database will be updated every three months. The MusiteDeep server is available at https://www.musite.net. The stand-alone tools for locally using MusiteDeep are available at https://github.com/duolinwang/MusiteDeep_web.


Subjects
Deep Learning , Protein Processing, Post-Translational , Software , Computer Graphics , Internet , Models, Molecular , Protein Conformation , Proteins/chemistry , Sequence Analysis, Protein
8.
Bioinformatics ; 36(1): 169-176, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31168616

ABSTRACT

MOTIVATION: As large amounts of biological data continue to be rapidly generated, a major focus of bioinformatics research has been aimed toward integrating these data to identify active pathways or modules under certain experimental conditions or phenotypes. Although biologically significant modules can often be detected globally by many existing methods, it is often hard to interpret or make use of the results toward pathway model generation and testing. RESULTS: To address this gap, we have developed the IMPRes algorithm, a new step-wise active pathway detection method using a dynamic programming approach. IMPRes takes advantage of the existing pathway interaction knowledge in Kyoto Encyclopedia of Genes and Genomes. Omics data are then used to assign penalties to genes, interactions and pathways. Finally, starting from one or multiple seed genes, a shortest path algorithm is applied to detect downstream pathways that best explain the gene expression data. Since dynamic programming enables the detection one step at a time, it is easy for researchers to trace the pathways, which may lead to more accurate drug design and more effective treatment strategies. The evaluation experiments conducted on three yeast datasets have shown that IMPRes can achieve competitive or better performance than other state-of-the-art methods. Furthermore, a case study on a human lung cancer dataset was performed and we provided several insights on genes and mechanisms involved in lung cancer, which had not been discovered before. AVAILABILITY AND IMPLEMENTATION: The IMPRes visualization tool is available via web server at http://digbio.missouri.edu/impres. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling , Models, Genetic , Software , Algorithms , Gene Expression Profiling/methods , Humans
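
Note on the method described in record 8 above: the abstract says penalties are assigned to genes and interactions, and that a shortest path algorithm is then run from one or more seed genes. The sketch below illustrates that general idea with plain Dijkstra search over a small directed network. It is not the IMPRes implementation: the use of Dijkstra, the cost definition (edge penalty plus the penalty of the gene being entered), and the gene names and penalty values are all assumptions made for illustration.

```python
import heapq


def lowest_penalty_paths(edges, node_penalty, seed):
    """Dijkstra search from a seed gene.

    edges: dict mapping gene -> list of (target_gene, edge_penalty).
    node_penalty: dict mapping gene -> penalty paid when entering it.
    All penalties must be non-negative for Dijkstra to be valid.
    Returns (best_cost, parent) dictionaries.
    """
    best = {seed: 0.0}
    parent = {seed: None}
    heap = [(0.0, seed)]
    while heap:
        cost, gene = heapq.heappop(heap)
        if cost > best.get(gene, float("inf")):
            continue  # stale queue entry
        for target, edge_penalty in edges.get(gene, []):
            new_cost = cost + edge_penalty + node_penalty.get(target, 0.0)
            if new_cost < best.get(target, float("inf")):
                best[target] = new_cost
                parent[target] = gene
                heapq.heappush(heap, (new_cost, target))
    return best, parent


def trace(parent, gene):
    """Walk parent links back to the seed, one step at a time."""
    path = []
    while gene is not None:
        path.append(gene)
        gene = parent[gene]
    return path[::-1]


# Hypothetical toy network: A is the seed; C carries a high gene penalty
# (for example, a gene with little expression change in the omics data).
edges = {"A": [("B", 1.0), ("C", 0.5)], "B": [("D", 0.5)], "C": [("D", 0.5)]}
node_penalty = {"B": 0.1, "C": 2.0, "D": 0.2}
best, parent = lowest_penalty_paths(edges, node_penalty, "A")
print(trace(parent, "D"), round(best["D"], 3))
```

In this toy case the route through B is preferred even though the A to C interaction has the lower edge penalty, because the gene penalty on C outweighs it; the parent links make the selected pathway easy to trace step by step.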
9.
Methods ; 173: 16-23, 2020 02 15.
Article in English | MEDLINE | ID: mdl-31220603

ABSTRACT

Nowadays, large amounts of omics data have been generated and contributed to increasing knowledge about associated biological mechanisms. A new challenge coming along is how to identify the active pathways and extract useful insights from these data with huge background information and noise. Although biologically meaningful modules can often be detected by many existing informatics tools, it is still hard to interpret or make use of the results towards in silico hypothesis generation and testing. To address this gap, we previously developed the IMPRes (Integrative MultiOmics Pathway Resolution) v 1.0 algorithm, a new step-wise active pathway detection method using a dynamic programming approach. This approach enables the network detection one step at a time, making it easy for researchers to trace the pathways, and leading to more accurate drug design and more effective treatment strategies. In this paper, we present IMPRes-Pro, an enhancement to IMPRes v1.0 by integrating proteomics data along with transcriptomics data and constructing a heterogeneous background network. The evaluation experiment conducted on human primary breast cancer dataset has shown the advantage over the original IMPRes v1.0 method. Furthermore, a case study on human metastatic breast cancer dataset was performed and we have provided several insights regarding the selection of optimal therapy strategy. IMPRes-Pro algorithm and visualization tool is available as a web service at http://digbio.missouri.edu/impres.


Subjects
Breast Neoplasms/genetics , Computational Biology/methods , Proteomics/methods , Software , Algorithms , Breast Neoplasms/pathology , Computer Graphics , Computer Simulation , Female , Gene Expression Profiling/methods , Humans
10.
Physiol Genomics ; 52(2): 81-95, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31841397

ABSTRACT

Enhancer of zeste homolog 2 (EZH2) is a histone methyltransferase that suppresses gene expression. Previously, we developed a conditional null model where EZH2 is knocked out in uterus. Deletion of uterine EZH2 increased proliferation of luminal and glandular epithelial cells. Herein, we used RNA-Seq in wild-type (WT) and EZH2 conditional knockout (Ezh2cKO) uteri to obtain mechanistic insights into the gene expression changes that underpin the pathogenesis observed in these mice. Ovariectomized adult Ezh2cKO mice were treated with vehicle (V) or 17β-estradiol (E2; 1 ng/g). Uteri were collected at postnatal day (PND) 75 for RNA-Seq or immunostaining for epithelial proliferation. Weighted gene coexpression network analysis was used to link uterine gene expression patterns and epithelial proliferation. In V-treated mice, 88 transcripts were differentially expressed (DEG) in Ezh2cKO mice, and Bmp5, Crabp2, Lgr5, and Sprr2f were upregulated. E2 treatment resulted in 40 DEG with Krt5, Krt15, Olig3, Crabp1, and Serpinb7 upregulated in Ezh2cKO compared with control mice. Transcript analysis relative to proliferation rates revealed two module eigengenes correlated with epithelial proliferation in WT V vs. Ezh2cKO V and WT E2 vs. Ezh2cKO E2 mice, with a positive relationship in the former and inverse in the latter. Notably, the ESR1, Wnt, and Hippo signaling pathways were among those functionally enriched in Ezh2cKO females. Current results reveal unique gene expression patterns in Ezh2cKO uterus and provide insight into how loss of this critical epigenetic regulator assumingly contributes to uterine abnormalities.


Subjects
Enhancer of Zeste Homolog 2 Protein/genetics , Transcriptome , Uterus/metabolism , Animals , Cell Proliferation , Cluster Analysis , Computational Biology , Enhancer of Zeste Homolog 2 Protein/metabolism , Epigenesis, Genetic , Estradiol/pharmacology , Estrogens/metabolism , Female , Gene Expression Profiling , Gene Expression Regulation , Genotype , Heterozygote , Mice , Mice, Knockout , Phosphatidylinositol 3-Kinases/metabolism , RNA-Seq , Signal Transduction , Up-Regulation , Uterus/abnormalities , Wnt Proteins/metabolism
11.
Bioinformatics ; 35(14): 2386-2394, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30520972

ABSTRACT

MOTIVATION: Computational methods for protein post-translational modification (PTM) site prediction provide a useful approach for studying protein functions. The prediction accuracy of the existing methods has significant room for improvement. A recent deep-learning architecture, Capsule Network (CapsNet), which can characterize the internal hierarchical representation of input data, presents a great opportunity to solve this problem, especially using small training data. RESULTS: We proposed a CapsNet for predicting protein PTM sites, including phosphorylation, N-linked glycosylation, N6-acetyllysine, methyl-arginine, S-palmitoyl-cysteine, pyrrolidone-carboxylic-acid and SUMOylation sites. The CapsNet outperformed the baseline convolutional neural network architecture MusiteDeep and other well-known tools in most cases and provided promising results for practical use, especially in learning from small training data. The capsule length also gives an accurate estimate for the confidence of the PTM prediction. We further demonstrated that the internal capsule features could be trained as a motif detector of phosphorylation sites when no kinase-specific phosphorylation labels were provided. In addition, CapsNet generates robust representations that have strong discriminant power in distinguishing kinase substrates from different kinase families. Our study sheds some light on the recognition mechanism of PTMs and applications of CapsNet on other bioinformatic problems. AVAILABILITY AND IMPLEMENTATION: The codes are free to download from https://github.com/duolinwang/CapsNet_PTM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neural Networks, Computer , Protein Processing, Post-Translational , Glycosylation , Phosphorylation , Proteins , Sumoylation
12.
Acta Pharmacol Sin ; 40(1): 55-63, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30013032

ABSTRACT

Circular RNAs (circRNAs) are emerging species of mRNA splicing products with largely unknown functions. Although several computational pipelines for circRNA identification have been developed, these methods strictly rely on uniquely mapped reads overlapping back-splice junctions (BSJs) and lack approaches to model the statistical significance of the identified circRNAs. Here, we reported a systematic computational approach to identify circRNAs by simultaneously utilizing BSJ overlapping reads and discordant BSJ spanning reads to identify circRNAs. Moreover, we developed a novel procedure to estimate the P-values of the identified circRNAs. A computational cross-validation and experimental validations demonstrated that our method performed favorably compared to existing circRNA detection tools. We created a standalone tool, CircRNAFisher, to implement the method, which might be valuable to computational and experimental scientists studying circRNAs.


Subjects
Computational Biology/methods , RNA/analysis , Sequence Analysis, RNA/methods , Algorithms , Cell Line, Tumor , Fibroblasts/chemistry , Humans , RNA/genetics , RNA/isolation & purification , RNA, Circular
13.
Bioinformatics ; 33(24): 3909-3916, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-29036382

ABSTRACT

MOTIVATION: Computational methods for phosphorylation site prediction play important roles in protein function studies and experimental design. Most existing methods are based on feature extraction, which may result in incomplete or biased features. Deep learning as the cutting-edge machine learning method has the ability to automatically discover complex representations of phosphorylation patterns from the raw sequences, and hence it provides a powerful tool for improvement of phosphorylation site prediction. RESULTS: We present MusiteDeep, the first deep-learning framework for predicting general and kinase-specific phosphorylation sites. MusiteDeep takes raw sequence data as input and uses convolutional neural networks with a novel two-dimensional attention mechanism. It achieves over a 50% relative improvement in the area under the precision-recall curve in general phosphorylation site prediction and obtains competitive results in kinase-specific prediction compared to other well-known tools on the benchmark data. AVAILABILITY AND IMPLEMENTATION: MusiteDeep is provided as an open-source tool available at https://github.com/duolinwang/MusiteDeep. CONTACT: xudong@missouri.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Phosphoproteins/chemistry , Sequence Analysis, Protein/methods , Software , Neural Networks, Computer , Phosphorylation , Protein Kinases/metabolism , Proteins/metabolism
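
Note on the input described in record 13 above: the abstract states that the model takes raw sequence data as input to a convolutional network. A common way to prepare such input, sketched below, is to cut a fixed-length window around each candidate residue (S, T, or Y for phosphorylation) and one-hot encode it. This is an illustrative assumption, not the MusiteDeep preprocessing code: the function names, the padding symbol, and the window size are hypothetical, and the abstract does not state the window length actually used.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for positions beyond the termini


def candidate_windows(sequence, residues="STY", flank=3):
    """Return (1-based position, window) for every candidate residue.

    Each window has length 2 * flank + 1 and is centred on the candidate;
    windows that run past either terminus are padded with PAD.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def one_hot(window):
    """Encode a window as a list of 21-dimensional 0/1 rows.

    Symbols outside the 20 standard residues and PAD (for example X)
    are encoded as an all-zero row.
    """
    alphabet = AMINO_ACIDS + PAD
    return [[1 if ch == symbol else 0 for symbol in alphabet] for ch in window]


# Hypothetical toy sequence, not taken from the paper.
for position, window in candidate_windows("MKSPTA", flank=2):
    print(position, window, len(one_hot(window)))
```

The resulting matrix (window length by 21) is the kind of array a one-dimensional convolutional layer can consume directly, which is what lets such models skip hand-crafted feature extraction.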
14.
Res Sq ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38559017

ABSTRACT

Peptide design, with the goal of identifying peptides possessing unique biological properties, stands as a crucial challenge in peptide-based drug discovery. While traditional and computational methods have made significant strides, they often encounter hurdles due to the complexities and costs of laboratory experiments. Recent advancements in deep learning and Bayesian Optimization have paved the way for innovative research in this domain. In this context, our study presents a novel approach that effectively combines protein structure prediction with Bayesian Optimization for peptide design. By applying carefully designed objective functions, we guide and enhance the optimization trajectory for new peptide sequences. Benchmarked against multiple native structures, our methodology is tailored to generate new peptides to their optimal potential biological properties.

15.
Comput Struct Biotechnol J ; 23: 1786-1795, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38707535

ABSTRACT

The rapid growth of spatially resolved transcriptomics technology provides new perspectives on spatial tissue architecture. Deep learning has been widely applied to derive useful representations for spatial transcriptome analysis. However, effectively integrating spatial multi-modal data remains challenging. Here, we present ConGcR, a contrastive learning-based model for integrating gene expression, spatial location, and tissue morphology for data representation and spatial tissue architecture identification. Graph convolution and ResNet were used as encoders for gene expression with spatial location and histological image inputs, respectively. We further enhanced ConGcR with a graph auto-encoder as ConGaR to better model spatially embedded representations. We validated our models using 16 human brains, four chicken hearts, eight breast tumors, and 30 human lung spatial transcriptomics samples. The results showed that our models generated more effective embeddings for obtaining tissue architectures closer to the ground truth than other methods. Overall, our models not only can improve tissue architecture identification's accuracy but also may provide valuable insights and effective data representation for other tasks in spatial transcriptome analyses.

16.
Nat Rev Bioeng ; 2(2): 136-154, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38576453

ABSTRACT

Denoising diffusion models embody a type of generative artificial intelligence that can be applied in computer vision, natural language processing and bioinformatics. In this Review, we introduce the key concepts and theoretical foundations of three diffusion modelling frameworks (denoising diffusion probabilistic models, noise-conditioned scoring networks and score stochastic differential equations). We then explore their applications in bioinformatics and computational biology, including protein design and generation, drug and small-molecule design, protein-ligand interaction modelling, cryo-electron microscopy image data analysis and single-cell data analysis. Finally, we highlight open-source diffusion model tools and consider the future applications of diffusion models in bioinformatics.

17.
bioRxiv ; 2024 Jan 28.
Article in English | MEDLINE | ID: mdl-37609352

ABSTRACT

Large protein language models (PLMs) present excellent potential to reshape protein research by encoding the amino acid sequences into mathematical and biological meaningful embeddings. However, the lack of crucial 3D structure information in most PLMs restricts the prediction capacity of PLMs in various applications, especially those heavily depending on 3D structures. To address this issue, we introduce S-PLM, a 3D structure-aware PLM utilizing multi-view contrastive learning to align the sequence and 3D structure of a protein in a coordinate space. S-PLM applies Swin-Transformer on AlphaFold-predicted protein structures to embed the structural information and fuses it into sequence-based embedding from ESM2. Additionally, we provide a library of lightweight tuning tools to adapt S-PLM for diverse protein property prediction tasks. Our results demonstrate S-PLM's superior performance over sequence-only PLMs, achieving competitiveness in protein function prediction compared to state-of-the-art methods employing both sequence and structure inputs.

18.
Nat Mach Intell ; 5(4): 337-339, 2023 Apr.
Article in English | MEDLINE | ID: mdl-38260002

ABSTRACT

Predicting whether T-cell receptors bind to specific peptides is a challenging problem as the majority of binding examples in the training data involves only a few peptides. A new approach employs meta-learning to improve predictions for binding to peptides for which no or little binding data exists.

19.
Nat Commun ; 14(1): 964, 2023 02 21.
Article in English | MEDLINE | ID: mdl-36810839

ABSTRACT

Single-cell multi-omics (scMulti-omics) allows the quantification of multiple modalities simultaneously to capture the intricacy of complex molecular mechanisms and cellular heterogeneity. Existing tools cannot effectively infer the active biological networks in diverse cell types and the response of these networks to external stimuli. Here we present DeepMAPS for biological network inference from scMulti-omics. It models scMulti-omics in a heterogeneous graph and learns relations among cells and genes within both local and global contexts in a robust manner using a multi-head graph transformer. Benchmarking results indicate DeepMAPS performs better than existing tools in cell clustering and biological network construction. It also showcases competitive capability in deriving cell-type-specific biological networks in lung tumor leukocyte CITE-seq data and matched diffuse small lymphocytic lymphoma scRNA-seq and scATAC-seq data. In addition, we deploy a DeepMAPS webserver equipped with multiple functionalities and visualizations to improve the usability and reproducibility of scMulti-omics data analysis.


Subjects
Benchmarking , Data Analysis , Reproducibility of Results , Cluster Analysis , Electric Power Supplies , Single-Cell Analysis
20.
Nat Commun ; 14(1): 812, 2023 02 13.
Article in English | MEDLINE | ID: mdl-36781861

ABSTRACT

Unlike PIWI-interacting RNA (piRNA) in other species that mostly target transposable elements (TEs), >80% of piRNAs in adult mammalian testes lack obvious targets. However, mammalian piRNA sequences and piRNA-producing loci evolve more rapidly than the rest of the genome for unknown reasons. Here, through comparative studies of chickens, ducks, mice, and humans, as well as long-read nanopore sequencing on diverse chicken breeds, we find that piRNA loci across amniotes experience: (1) a high local mutation rate of structural variations (SVs, mutations ≥ 50 bp in size); (2) positive selection to suppress young and actively mobilizing TEs commencing at the pachytene stage of meiosis during germ cell development; and (3) negative selection to purge deleterious SV hotspots. Our results indicate that genetic instability at pachytene piRNA loci, while producing certain pathogenic SVs, also protects genome integrity against TE mobilization by driving the formation of rapid-evolving piRNA sequences.


Subjects
Chickens , Germ Cells , Humans , Male , Animals , Mice , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Chickens/genetics , Chickens/metabolism , Germ Cells/metabolism , Testis/metabolism , DNA Transposable Elements/genetics , Piwi-Interacting RNA , Mammals/genetics