Results 1 - 20 of 150
1.
Bioessays ; 46(1): e2300098, 2024 01.
Article in English | MEDLINE | ID: mdl-38018264

ABSTRACT

The evolution and biodiversity of ageing have long fascinated scientists and the public alike. While mammals, including long-lived species such as humans, show a marked ageing process, some species of reptiles and amphibians exhibit very slow and even the absence of ageing phenotypes. How can reptiles and other vertebrates age slower than mammals? Herein, I propose that evolving during the rule of the dinosaurs left a lasting legacy in mammals. For over 100 million years when dinosaurs were the dominant predators, mammals were generally small, nocturnal, and short-lived. My hypothesis is that such a long evolutionary pressure on early mammals for rapid reproduction led to the loss or inactivation of genes and pathways associated with long life. I call this the 'longevity bottleneck hypothesis', which is further supported by the absence in mammals of regenerative traits. Although mammals, such as humans, can evolve long lifespans, they do so under constraints dating to the dinosaur era.


Subjects
Dinosaurs , Longevity , Animals , Aging/physiology , Dinosaurs/physiology , Mammals/physiology , Reptiles , Biological Evolution
2.
Nucleic Acids Res ; 52(D1): D203-D212, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37811871

ABSTRACT

With recent progress in mapping N7-methylguanosine (m7G) RNA methylation sites, tens of thousands of experimentally validated m7G sites have been discovered in various species, shedding light on the significant role of m7G modification in regulating numerous biological processes including disease pathogenesis. An integrated resource that enables the sharing, annotation and customized analysis of m7G data will greatly facilitate m7G studies under various physiological contexts. We previously developed the m7GHub database to host mRNA m7G sites identified in the human transcriptome. Here, we present m7GHub v.2.0, an updated resource for a comprehensive collection of m7G modifications in various types of RNA across multiple species: an m7GDB database containing 430 898 putative m7G sites identified in 23 species, collected from both widely applied next-generation sequencing (NGS) and the emerging Oxford Nanopore direct RNA sequencing (ONT) techniques; an m7GDiseaseDB hosting 156 206 m7G-associated variants (involving addition or removal of an m7G site), including 3238 disease-relevant m7G-SNPs that may function through epitranscriptome disturbance; and two enhanced analysis modules to perform interactive analyses on the collections of m7G sites (m7GFinder) and functional variants (m7GSNPer). We expect that m7GHub v.2.0 should serve as a valuable centralized resource for studying m7G modification. It is freely accessible at: www.rnamd.org/m7GHub2.


Subjects
Nucleic Acid Databases , High-Throughput Nucleotide Sequencing , Post-Transcriptional RNA Processing , Humans , Statistical Data Interpretation , Guanosine/genetics
3.
Nucleic Acids Res ; 52(D1): D900-D908, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37933854

ABSTRACT

Ageing is a complex and multifactorial process. For two decades, the Human Ageing Genomic Resources (HAGR) have aided researchers in the study of various aspects of ageing and its manipulation. Here, we present the key features and recent enhancements of these resources, focusing on its six main databases. One database, GenAge, focuses on genes related to ageing, featuring 307 genes linked to human ageing and 2205 genes associated with longevity and ageing in model organisms. AnAge focuses on ageing, longevity, and life-history across animal species, containing data on 4645 species. DrugAge includes information about 1097 longevity drugs and compounds in model organisms such as mice, rats, flies, worms and yeast. GenDR provides a list of 214 genes associated with the life-extending benefits of dietary restriction in model organisms. CellAge contains a catalogue of 866 genes associated with cellular senescence. The LongevityMap serves as a repository for genetic variants associated with human longevity, encompassing 3144 variants pertaining to 884 genes. Additionally, HAGR provides various tools as well as gene expression signatures of ageing, dietary restriction, and replicative senescence based on meta-analyses. Our databases are integrated, regularly updated, and manually curated by experts. HAGR is freely available online (https://genomics.senescence.info/).


Subjects
Aging , Genetic Databases , Genomics , Animals , Humans , Aging/genetics , Cellular Senescence , Longevity/genetics
4.
Trends Genet ; 38(3): 216-217, 2022 03.
Article in English | MEDLINE | ID: mdl-34756472

ABSTRACT

A PubMed analysis shows that the vast majority of human genes have been studied in the context of cancer. As such, the study of nearly any human gene can be justified based on existing literature by its potential relevance to cancer. Moreover, these results have implications for analyzing and interpreting large-scale analyses.


Subjects
Neoplasms , Humans , Neoplasms/genetics
5.
Nucleic Acids Res ; 51(D1): D145-D158, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36454018

ABSTRACT

Gene co-expression analysis has emerged as a powerful method to provide insights into gene function and regulation. The rapid growth of publicly available RNA-sequencing (RNA-seq) data has created opportunities for researchers to employ this abundant data to help decipher the complexity and biology of genomes. Co-expression networks have proven effective for inferring the relationship between the genes, for gene prioritization and for assigning function to poorly annotated genes based on their co-expressed partners. To facilitate such analyses we previously created an online co-expression tool for humans and mice entitled GeneFriends. To continue providing a valuable tool to the scientific community, we have now updated the GeneFriends database and website. Here, we present the new version of GeneFriends, which includes gene and transcript co-expression networks based on RNA-seq data from 46 475 human and 34 322 mouse samples. The new database also encompasses tissue-specific gene co-expression networks for 20 human and 21 mouse tissues, dataset-specific gene co-expression maps based on TCGA and GTEx projects and gene co-expression networks for seven additional model organisms (fruit fly, zebrafish, worm, rat, yeast, cow and chicken). GeneFriends is freely available at http://www.genefriends.org/.


Subjects
Genetic Databases , Gene Expression Profiling , Gene Regulatory Networks , Animals , Humans , RNA , RNA Sequence Analysis
6.
Nucleic Acids Res ; 51(D1): D106-D116, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382409

ABSTRACT

With advanced technologies to map RNA modifications, our understanding of them has been revolutionized, and they are seen to be far more widespread and important than previously thought. Current next-generation sequencing (NGS)-based modification profiling methods are blind to RNA modifications and thus require selective chemical treatment or antibody immunoprecipitation methods for particular modification types. They also face the problem of short read length, isoform ambiguities, biases and artifacts. Direct RNA sequencing (DRS) technologies, commercialized by Oxford Nanopore Technologies (ONT), enable the direct interrogation of any given modification present in individual transcripts and promise to address the limitations of previous NGS-based methods. Here, we present the first ONT-based database of quantitative RNA modification profiles, DirectRMDB, which includes 16 types of modification and a total of 904,712 modification sites in 25 species identified from 39 independent studies. In addition to standard functions adopted by existing databases, such as gene annotations and post-transcriptional association analysis, we provide a fresh view of RNA modifications, which enables exploration of the epitranscriptome in an isoform-specific manner. The DirectRMDB database is freely available at: http://www.rnamd.org/directRMDB/.


Subjects
High-Throughput Nucleotide Sequencing , Post-Transcriptional RNA Processing , RNA Sequence Analysis , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation , Protein Isoforms , RNA/genetics , RNA Sequence Analysis/methods , Nucleic Acid Databases
7.
Nucleic Acids Res ; 51(D1): D1388-D1396, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36062570

ABSTRACT

Recent advances in epitranscriptomics have unveiled functional associations between RNA modifications (RMs) and multiple human diseases, but distinguishing the functional or disease-related single nucleotide variants (SNVs) from the majority of 'silent' variants remains a major challenge. We previously developed the RMDisease database for unveiling the association between genetic variants and RMs concerning human disease pathogenesis. In this work, we present RMDisease v2.0, an updated database with expanded coverage. Using deep learning models and from 873 819 experimentally validated RM sites, we identified a total of 1 366 252 RM-associated variants that may affect (add or remove an RM site) 16 different types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G, A-to-I, ac4C, Am, Cm, Um, Gm, hm5C, D and f5C) in 20 organisms (human, mouse, rat, zebrafish, maize, fruit fly, yeast, fission yeast, Arabidopsis, rice, chicken, goat, sheep, pig, cow, rhesus monkey, tomato, chimpanzee, green monkey and SARS-CoV-2). Among them, 14 749 disease- and 2441 trait-associated genetic variants may function via the perturbation of epitranscriptomic markers. RMDisease v2.0 should serve as a useful resource for studying the genetic drivers of phenotypes that lie within the epitranscriptome layer circuitry, and is freely accessible at: www.rnamd.org/rmdisease2.


Subjects
Factual Databases , Post-Transcriptional RNA Processing , Animals , Humans , Phenotype , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Epigenomics
8.
Nucleic Acids Res ; 50(D1): D196-D203, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34986603

ABSTRACT

5-Methylcytosine (m5C) is one of the most prevalent covalent modifications on RNA. It is known to regulate a broad variety of RNA functions, including nuclear export, RNA stability and translation. Here, we present m5C-Atlas, a database for comprehensive collection and annotation of RNA 5-methylcytosine. The database contains 166 540 m5C sites in 13 species identified from 5 base-resolution epitranscriptome profiling technologies. Moreover, condition-specific methylation levels are quantified from 351 RNA bisulfite sequencing samples gathered from 22 different studies via an integrative pipeline. The database also presents several novel features, such as the evolutionary conservation of an m5C locus, its association with SNPs, and any relevance to RNA secondary structure. All m5C-Atlas data are accessible through a user-friendly interface, in which the m5C epitranscriptomes can be freely explored, shared, and annotated with putative post-transcriptional mechanisms (e.g. RBP intermolecular interaction with RNA, microRNA interaction and splicing sites). Together, these resources offer unprecedented opportunities for exploring m5C epitranscriptomes. The m5C-Atlas database is freely accessible at https://www.xjtlu.edu.cn/biologicalsciences/m5c-atlas.


Subjects
Genetic Databases , Epigenome/genetics , Software , Transcriptome/genetics , 5-Methylcytosine/chemistry , 5-Methylcytosine/metabolism , Humans , MicroRNAs/genetics , Single Nucleotide Polymorphism/genetics , Post-Transcriptional RNA Processing/genetics , RNA Sequence Analysis
9.
Nucleic Acids Res ; 50(18): 10290-10310, 2022 10 14.
Article in English | MEDLINE | ID: mdl-36155798

ABSTRACT

As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts such as 3'UTR and intron, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m6A prediction models based on geographic information alone can achieve comparable performances to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes we developed have exhibited strong interpretability, and are applicable to not only m6A but also N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.


Subjects
Deep Learning , Long Noncoding RNA , 3' Untranslated Regions , Methylation , Protein Isoforms/genetics , RNA/genetics , RNA/metabolism , Messenger RNA/genetics
10.
Proc Natl Acad Sci U S A ; 118(7)2021 02 16.
Article in English | MEDLINE | ID: mdl-33574059

ABSTRACT

Ecological flexibility, extended lifespans, and large brains have long intrigued evolutionary biologists, and comparative genomics offers an efficient and effective tool for generating new insights into the evolution of such traits. Studies of capuchin monkeys are particularly well situated to shed light on the selective pressures and genetic underpinnings of local adaptation to diverse habitats, longevity, and brain development. Distributed widely across Central and South America, they are inventive and extractive foragers, known for their sensorimotor intelligence. Capuchins have among the largest relative brain size of any monkey and a lifespan that exceeds 50 y, despite their small (3 to 5 kg) body size. We assemble and annotate a de novo reference genome for Cebus imitator. Through high-depth sequencing of DNA derived from blood, various tissues, and feces via fluorescence-activated cell sorting (fecalFACS) to isolate monkey epithelial cells, we compared genomes of capuchin populations from tropical dry forests and lowland rainforests and identified population divergence in genes involved in water balance, kidney function, and metabolism. Through a comparative genomics approach spanning a wide diversity of mammals, we identified genes under positive selection associated with longevity and brain development. Additionally, we provide a technological advancement in the use of noninvasive genomics for studies of free-ranging mammals. Our intra- and interspecific comparative study of capuchin genomics provides insights into processes underlying local adaptation to diverse and physiologically challenging environments, as well as the molecular basis of brain evolution and longevity.


Subjects
Physiological Adaptation , Brain/growth & development , Cebus/genetics , Genome , Longevity/genetics , Animals , Molecular Evolution , Flow Cytometry/methods , Forests , Genomics/methods
11.
BMC Genomics ; 24(1): 644, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37884865

ABSTRACT

INTRODUCTION: Understanding changes in cell identity in cancer and ageing is of great importance. In this work, we analyzed how gene expression changes in human tissues are associated with tissue specificity during cancer and ageing using transcriptome data from TCGA and GTEx. RESULTS: We found significant downregulation of tissue-specific genes during ageing in 40% of the tissues analyzed, which suggests loss of tissue identity with age. For most cancer types, we have noted a consistent pattern of downregulation in genes that are specific to the tissue from which the tumor originated. Moreover, we observed in cancer an activation of genes not usually expressed in the tissue of origin as well as an upregulation of genes specific to other tissues. These patterns in cancer were associated with patient survival. The age of the patient, however, did not influence these patterns. CONCLUSION: We identified loss of cellular identity in 40% of the tissues analysed during human ageing, and a clear pattern in cancer, where during tumorigenesis cells express genes specific to other organs while suppressing the expression of genes from their original tissue. The loss of cellular identity observed in cancer is associated with prognosis and is not influenced by age, suggesting that it is a crucial stage in carcinogenesis.


Subjects
Neoplasms , Transcriptome , Humans , Aging/genetics , Neoplasms/genetics , Gene Expression Profiling , Carcinogenesis/genetics
12.
Mol Biol Evol ; 39(2)2022 02 03.
Article in English | MEDLINE | ID: mdl-34971383

ABSTRACT

Within primates, the great apes are outliers both in terms of body size and lifespan, since they include the largest and longest-lived species in the order. Yet, the molecular bases underlying such features are poorly understood. Here, we leveraged an integrated approach to investigate multiple sources of molecular variation across primates, focusing on over 10,000 genes, including approximately 1,500 previously associated with lifespan, and approximately 9,000 additional genes for which an association with longevity has never been suggested. We analyzed dN/dS rates, positive selection, gene expression (RNA-seq), and gene regulation (ChIP-seq). By analyzing the correlation between dN/dS, maximum lifespan, and body mass, we identified 276 genes whose rate of evolution positively correlates with maximum lifespan in primates. Further, we identified five genes, important for tumor suppression, adaptive immunity, metastasis, and inflammation, under positive selection exclusively in the great ape lineage. RNA-seq data, generated from the liver of six species representing all the primate lineages, revealed that 8% of approximately 1,500 genes previously associated with longevity are differentially expressed in apes relative to other primates. Importantly, by integrating RNA-seq with ChIP-seq for H3K27ac (which marks active enhancers), we show that the differentially expressed longevity genes are significantly more likely than expected to be located near a novel "ape-specific" enhancer. Moreover, these particular ape-specific enhancers are enriched for young transposable elements, and specifically SINE-Vntr-Alus. In summary, we demonstrate that multiple evolutionary forces have contributed to the evolution of lifespan and body size in primates.


Subjects
Hominidae , Longevity , Animals , Molecular Evolution , Hominidae/genetics , Longevity/genetics , Primates/genetics , Nucleic Acid Regulatory Sequences
13.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33993206

ABSTRACT

Motivation: N6-methyladenosine (m6A) is the most prevalent RNA modification on mRNAs and lncRNAs. Evidence increasingly demonstrates its crucial importance in essential molecular mechanisms and various diseases. With recent advances in sequencing techniques, tens of thousands of m6A sites are identified in a typical high-throughput experiment, posing a key challenge to distinguish the functional m6A sites from the remaining 'passenger' (or 'silent') sites. Results: We performed a comparative conservation analysis of the human and mouse m6A epitranscriptomes at single site resolution. A novel scoring framework, ConsRM, was devised to quantitatively measure the degree of conservation of individual m6A sites. ConsRM integrates multiple information sources and a positive-unlabeled learning framework, which integrated genomic and sequence features to trace subtle hints of epitranscriptome layer conservation. With a series of validation experiments in mouse, fly and zebrafish, we showed that ConsRM outperformed well-adopted conservation scores (phastCons and phyloP) in distinguishing the conserved and unconserved m6A sites. Additionally, the m6A sites with a higher ConsRM score are more likely to be functionally important. An online database was developed containing the conservation metrics of 177 998 distinct human m6A sites to support conservation analysis and functional prioritization of individual m6A sites. It is freely accessible at: https://www.xjtlu.edu.cn/biologicalsciences/con.


Subjects
Post-Transcriptional RNA Processing , Messenger RNA/genetics , RNA Sequence Analysis , Software , Transcriptome , Animals , Humans , Mice , Messenger RNA/biosynthesis , Zebrafish
14.
Stem Cells ; 40(1): 35-48, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35511867

ABSTRACT

DNA damage repair (DDR) is a safeguard for genome integrity maintenance. Increasing DDR efficiency could increase the yield of induced pluripotent stem cells (iPSC) upon reprogramming from somatic cells. The epigenetic mechanisms governing DDR during iPSC reprogramming are not completely understood. Our goal was to evaluate the splicing isoforms of histone variant macroH2A1, macroH2A1.1, and macroH2A1.2, as potential regulators of DDR during iPSC reprogramming. GFP-Trap one-step isolation of mtagGFP-macroH2A1.1 or mtagGFP-macroH2A1.2 fusion proteins from overexpressing human cell lines, followed by liquid chromatography-tandem mass spectrometry analysis, uncovered macroH2A1.1 exclusive interaction with Poly-ADP Ribose Polymerase 1 (PARP1) and X-ray cross-complementing protein 1 (XRCC1). MacroH2A1.1 overexpression in U2OS-GFP reporter cells enhanced specifically nonhomologous end joining (NHEJ) repair pathway, while macroH2A1.1 knock-out (KO) mice showed an impaired DDR capacity. The exclusive interaction of macroH2A1.1, but not macroH2A1.2, with PARP1/XRCC1, was confirmed in human umbilical vein endothelial cells (HUVEC) undergoing reprogramming into iPSC through episomal vectors. In HUVEC, macroH2A1.1 overexpression activated transcriptional programs that enhanced DDR and reprogramming. Consistently, macroH2A1.1 but not macroH2A1.2 overexpression improved iPSC reprogramming. We propose the macroH2A1 splicing isoform macroH2A1.1 as a promising epigenetic target to improve iPSC genome stability and therapeutic potential.


Subjects
Histones , Induced Pluripotent Stem Cells , Animals , DNA , DNA Repair , Endothelial Cells/metabolism , Histones/metabolism , Humans , Induced Pluripotent Stem Cells/metabolism , Mice , X-ray Repair Cross Complementing Protein 1/genetics , X-ray Repair Cross Complementing Protein 1/metabolism
15.
Methods ; 203: 378-382, 2022 07.
Article in English | MEDLINE | ID: mdl-34245870

ABSTRACT

The primary sequences of DNA, RNA and protein have been used as the dominant information source of existing machine learning tools, especially for contexts not fully explored by wet-experimental approaches. Since molecular markers are profoundly orchestrated in living organisms, those markers that cannot be unambiguously recovered from the primary sequence often help to predict other biological events. To the best of our knowledge, there is no current tool to build and deploy machine learning models that consider genomic evidence. We therefore developed the WHISTLE server, the first machine learning platform based on genomic coordinates. It features convenient covariate extraction and model web deployment with 46 distinct genomic features integrated along with the conventional sequence features. We showed that, when predicting m6A sites from the SRAMP project, the model integrating genomic features substantially outperformed those based on only sequence features. The WHISTLE server should be a useful tool for studying biological attributes specifically associated with genomic coordinates, and is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/whi2.


Subjects
Machine Learning , RNA , Computational Biology , Genomics , RNA/genetics , RNA/metabolism , RNA Sequence Analysis
16.
Nucleic Acids Res ; 49(D1): D1396-D1404, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33010174

ABSTRACT

Deciphering the biological impacts of millions of single nucleotide variants remains a major challenge. Recent studies suggest that RNA modifications play versatile roles in essential biological mechanisms, and are closely related to the progression of various diseases including multiple cancers. To comprehensively unveil the association between disease-associated variants and their epitranscriptome disturbance, we built RMDisease, a database of genetic variants that can affect RNA modifications. By integrating the prediction results of 18 different RNA modification prediction tools and also 303,426 experimentally-validated RNA modification sites, RMDisease identified a total of 202,307 human SNPs that may affect (add or remove) sites of eight types of RNA modifications (m6A, m5C, m1A, m5U, Ψ, m6Am, m7G and Nm). These include 4,289 disease-associated variants that may imply disease pathogenesis functioning at the epitranscriptome layer. These SNPs were further annotated with essential information such as post-transcriptional regulations (sites for miRNA binding, interaction with RNA-binding proteins and alternative splicing) revealing putative regulatory circuits. A convenient graphical user interface was constructed to support the query, exploration and download of the relevant information. RMDisease should make a useful resource for studying the epitranscriptome impact of genetic variants via multiple RNA modifications with emphasis on their potential disease relevance. RMDisease is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/rmd.


Subjects
Genetic Databases , Genetic Epigenesis , Neoplastic Gene Expression Regulation , Neoplasms/genetics , Post-Transcriptional RNA Processing , Neoplasm RNA/genetics , Alternative Splicing , Humans , Internet , MicroRNAs/genetics , MicroRNAs/metabolism , Molecular Sequence Annotation , Neoplasms/metabolism , Neoplasms/pathology , Single Nucleotide Polymorphism , Neoplasm RNA/classification , Neoplasm RNA/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Software , Transcriptome
17.
Nucleic Acids Res ; 49(D1): D134-D143, 2021 01 08.
Article in English | MEDLINE | ID: mdl-32821938

ABSTRACT

N6-Methyladenosine (m6A) is the most prevalent RNA modification on mRNAs and lncRNAs. It plays a pivotal role during various biological processes and disease pathogenesis. We present here a comprehensive knowledgebase, m6A-Atlas, for unraveling the m6A epitranscriptome. Compared to existing databases, m6A-Atlas features a high-confidence collection of 442 162 reliable m6A sites identified from seven base-resolution technologies and the quantitative (rather than binary) epitranscriptome profiles estimated from 1363 high-throughput sequencing samples. It also offers novel features, such as the conservation of m6A sites among seven vertebrate species (including human, mouse and chimp), the m6A epitranscriptomes of 10 virus species (including HIV, KSHV and DENV), the putative biological functions of individual m6A sites predicted from epitranscriptome data, and the potential pathogenesis of m6A sites inferred from disease-associated genetic mutations that can directly destroy m6A directing sequence motifs. A user-friendly graphical user interface was constructed to support the query, visualization and sharing of the m6A epitranscriptomes annotated with sites specifying their interaction with post-transcriptional machinery (RBP-binding, microRNA interaction and splicing sites) and interactively display the landscape of multiple RNA modifications. These resources provide fresh opportunities for unraveling the m6A epitranscriptomes. m6A-Atlas is freely accessible at: www.xjtlu.edu.cn/biologicalsciences/atlas.


Subjects
Adenosine/analogs & derivatives , Knowledge Bases , MicroRNAs/genetics , Long Noncoding RNA/genetics , Messenger RNA/genetics , Transcriptome , Adenosine/metabolism , Animals , Arabidopsis/genetics , Arabidopsis/metabolism , Atlases as Topic , Datasets as Topic , Dengue Virus/genetics , Dengue Virus/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , HIV/genetics , HIV/metabolism , Human Herpesvirus 8/genetics , Human Herpesvirus 8/metabolism , Humans , Mice , MicroRNAs/metabolism , Pan troglodytes/genetics , Pan troglodytes/metabolism , Long Noncoding RNA/metabolism , Messenger RNA/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Swine , Zebrafish
18.
BMC Bioinformatics ; 23(1): 10, 2022 Jan 04.
Article in English | MEDLINE | ID: mdl-34983372

ABSTRACT

BACKGROUND: Dietary restriction (DR) is the most studied pro-longevity intervention; however, a complete understanding of its underlying mechanisms remains elusive, and new research directions may emerge from the identification of novel DR-related genes and DR-related genetic features. RESULTS: This work used a Machine Learning (ML) approach to classify ageing-related genes as DR-related or NotDR-related using 9 different types of predictive features: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein-Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype Tissue Expression (GTEx) expression features, GeneFriends co-expression features and protein sequence descriptors. Our findings suggested that features biased towards curated knowledge (i.e. GO terms and biological pathways) had the greatest predictive power, while unbiased features (mainly gene expression and co-expression data) had the least predictive power. Moreover, a combination of all the feature types diminished the predictive power compared to predictions based on curated knowledge. Feature importance analysis on the two most predictive classifiers mostly corroborated existing knowledge and supported recent findings linking DR to the Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) signalling pathway and G protein-coupled receptors (GPCR). We then used the two strongest combinations of feature type and ML algorithm to predict DR-relatedness among ageing-related genes currently lacking DR-related annotations in the data, resulting in a set of promising candidate DR-related genes (GOT2, GOT1, TSC1, CTH, GCLM, IRS2 and SESN2) whose predicted DR-relatedness remains to be validated in future wet-lab experiments. CONCLUSIONS: This work demonstrated the strong potential of ML-based techniques to identify DR-associated features as our findings are consistent with literature and recent discoveries. Although the inference of new DR-related mechanistic findings based solely on GO terms and biological pathways was limited due to their knowledge-driven nature, the predictive power of these two feature types remained useful as it allowed inferring new promising candidate DR-related genes.


Subjects
Algorithms , Machine Learning , Gene Ontology , Longevity/genetics
19.
Brief Bioinform ; 21(3): 803-814, 2020 05 21.
Article in English | MEDLINE | ID: mdl-30895300

ABSTRACT

Biologists very often use enrichment methods based on statistical hypothesis tests to identify gene properties that are significantly over-represented in a given set of genes of interest, by comparison with a 'background' set of genes. These enrichment methods, although based on rigorous statistical foundations, are not always the best single option to identify patterns in biological data. In many cases, one can also use classification algorithms from the machine-learning field. Unlike enrichment methods, classification algorithms are designed to maximize measures of predictive performance and are capable of analysing combinations of gene properties, instead of one property at a time. In practice, however, the majority of studies use either enrichment or classification methods (rather than both), and there is a lack of literature discussing the pros and cons of both types of method. The goal of this paper is to compare and contrast enrichment and classification methods, offering two contributions. First, we discuss the (to some extent complementary) advantages and disadvantages of both types of methods for identifying gene properties that discriminate between gene classes. Second, we provide a set of high-level recommendations for using enrichment and classification methods. Overall, by highlighting the strengths and the weaknesses of both types of methods we argue that both should be used in bioinformatics analyses.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Machine Learning , Algorithms
20.
Cell Mol Life Sci ; 78(9): 4365-4376, 2021 May.
Article in English | MEDLINE | ID: mdl-33625522

ABSTRACT

The C1ORF112 gene initially drew attention when it was found to be strongly co-expressed with several genes previously associated with cancer and implicated in DNA repair and cell cycle regulation, such as RAD51 and the BRCA genes. The molecular functions of C1ORF112 remain poorly understood, yet several studies have uncovered clues as to its potential functions. Here, we review the current knowledge on C1ORF112 biology, its evolutionary history, possible functions, and its potential relevance to cancer. C1ORF112 is conserved throughout eukaryotes, from plants to humans, and is very highly conserved in primates. Protein models suggest that C1ORF112 is an alpha-helical protein. Interestingly, homozygous knockout mice are not viable, suggesting an essential role for C1ORF112 in mammalian development. Gene expression data show that, among human tissues, C1ORF112 is highly expressed in the testes and overexpressed in various cancers when compared to healthy tissues. C1ORF112 has also been shown to have altered levels of expression in some tumours with mutant TP53. Recent screens associate C1ORF112 with DNA replication and reveal possible links to DNA damage repair pathways, including the Fanconi anaemia pathway and homologous recombination. These insights provide important avenues for future research in our efforts to understand the functions and potential disease relevance of C1ORF112.


Subjects
Biological Evolution , DNA Damage , DNA Repair , DNA Replication , Open Reading Frames/genetics , Animals , Humans , Male , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/pathology , Protein Interaction Maps , Testis/metabolism