1.
Nature ; 593(7860): 602-606, 2021 05.
Article in English | MEDLINE | ID: mdl-33953397

ABSTRACT

MicroRNAs (miRNAs) have essential functions during embryonic development, and their dysregulation causes cancer [1,2]. Altered global miRNA abundance is found in different tissues and tumours, which implies that precise control of miRNA dosage is important [1,3,4], but the underlying mechanism(s) of this control remain unknown. The protein complex Microprocessor, which comprises one DROSHA and two DGCR8 proteins, is essential for miRNA biogenesis [5-7]. Here we identify a developmentally regulated miRNA dosage control mechanism that involves alternative transcription initiation (ATI) of DGCR8. ATI occurs downstream of a stem-loop in DGCR8 mRNA to bypass an autoregulatory feedback loop during mouse embryonic stem (mES) cell differentiation. Deletion of the stem-loop causes imbalanced DGCR8:DROSHA protein stoichiometry that drives irreversible Microprocessor aggregation, reduced primary miRNA processing, decreased mature miRNA abundance, and widespread de-repression of lipid metabolic mRNA targets. Although global miRNA dosage control is not essential for mES cells to exit from pluripotency, its dysregulation alters lipid metabolic pathways and interferes with embryonic development by disrupting germ layer specification in vitro and in vivo. This miRNA dosage control mechanism is conserved in humans. Our results identify a promoter switch that balances Microprocessor autoregulation and aggregation to precisely control global miRNA dosage and govern stem cell fate decisions during early embryonic development.


Subject(s)
Gene Dosage; Germ Layers/metabolism; MicroRNAs/genetics; RNA-Binding Proteins/genetics; Ribonuclease III/genetics; Animals; Gene Expression Regulation, Developmental; Hep G2 Cells; Humans; K562 Cells; Lipid Metabolism/genetics; Mice; Promoter Regions, Genetic; Transcription Initiation, Genetic
2.
Mol Biol Evol ; 40(5), 2023 05 02.
Article in English | MEDLINE | ID: mdl-37140205

ABSTRACT

Gene loss is a prevalent source of genetic variation in genome evolution. Calling loss events effectively and efficiently is a critical step for systematically characterizing their functional and phylogenetic profiles genome wide. Here, we developed a novel pipeline integrating orthologous inference and genome alignment. Interestingly, we identified 33 gene loss events that give rise to evolutionarily novel long noncoding RNAs (lncRNAs) that show distinct expression features and could be associated with various functions related to growth, development, immunity, and reproduction, suggesting loss relics as a potential source of functional lncRNAs in humans. Our data also demonstrated that the rates of protein gene loss are variable among different lineages with distinct functional biases.


Subject(s)
RNA, Long Noncoding; Humans; RNA, Long Noncoding/genetics; Gene Expression Profiling; Phylogeny; Genome
3.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34849565

ABSTRACT

Gene transcription and protein translation are two key steps of the 'central dogma.' It is still a major challenge to quantitatively deconvolute the factors contributing to the coding ability of transcripts in mammals. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in the human genome. In addition to predicting the experimentally confirmed coding abundance from sequence and transcription features with high accuracy, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts whose coding ability varies across distinct cell types (i.e. context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.


Subject(s)
Models, Biological; Protein Biosynthesis; RNA; Transcription, Genetic; Animals; Genome, Human; Humans; Mammals/genetics; Mammals/metabolism; RNA/metabolism; RNA, Long Noncoding/genetics; Ribosomes/genetics; Ribosomes/metabolism; Transcription, Genetic/genetics
4.
Nucleic Acids Res ; 48(W1): W230-W238, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32406920

ABSTRACT

With the abundant mammalian lncRNAs identified recently, a comprehensive annotation resource for these novel lncRNAs is an urgent need. Since its first release in November 2016, AnnoLnc has been the only online server for comprehensively annotating novel human lncRNAs on-the-fly. Here, with significant updates to multiple annotation modules, backend datasets and the code base, AnnoLnc2 continues the effort to provide the scientific community with a one-stop online portal for systematically annotating novel human and mouse lncRNAs with a comprehensive functional spectrum covering sequences, structure, expression, regulation, genetic association and evolution. In response to numerous requests from multiple users, a standalone package is also provided for large-scale offline analysis. We believe that updated AnnoLnc2 (http://annolnc.gao-lab.org/) will help both computational and bench biologists identify lncRNA functions and investigate underlying mechanisms.


Subject(s)
Molecular Sequence Annotation; RNA, Long Noncoding/chemistry; RNA, Long Noncoding/metabolism; Software; Animals; Evolution, Molecular; Gene Expression Regulation; Humans; Mice; RNA, Long Noncoding/genetics
5.
Nucleic Acids Res ; 48(D1): D1104-D1113, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701126

ABSTRACT

With the goal of charting plant transcriptional regulatory maps (i.e. transcription factors (TFs), cis-elements and interactions between them), we have upgraded the TF-centred database PlantTFDB (http://planttfdb.cbi.pku.edu.cn/) to a plant regulatory data and analysis platform PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) over the past three years. In this version, we updated the annotations for the previously collected TFs and set up a new section, 'extended TF repertoires' (TFext), to allow users prompt access to the TF repertoires of newly sequenced species. In addition to our regular TF updates, we are dedicated to updating the data on cis-elements and functional interactions between TFs and cis-elements. We established genome-wide conservation landscapes for 63 representative plants and then developed an algorithm, FunTFBS, to screen for functional regulatory elements and interactions by coupling the base-varied binding affinities of TFs with the evolutionary footprints on their binding sites. Using the FunTFBS algorithm and the conservation landscapes, we further identified over 20 million functional TF binding sites (TFBSs) and two million functional interactions for 21 346 TFs, charting the functional regulatory maps of these 63 plants. These resources are publicly available at PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) and a cloud-based mirror (http://plantregmap.gao-lab.org/), providing the plant research community with valuable resources for decoding plant transcriptional regulatory systems.


Subject(s)
Computational Biology/methods; Databases, Genetic; Gene Expression Regulation, Plant; Plants/genetics; Transcription, Genetic; Binding Sites; Chromosome Mapping; Evolution, Molecular; Molecular Sequence Annotation; Phylogeny; Plants/metabolism; Protein Binding; Transcription Factors/metabolism; Web Browser
6.
Nucleic Acids Res ; 45(D1): D1040-D1045, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924042

ABSTRACT

With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from the literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal, PlantRegMap (http://plantregmap.cbi.pku.edu.cn), for users to access the regulation resources and analysis tools conveniently.


Subject(s)
Databases, Genetic; Gene Expression Regulation, Plant; Gene Regulatory Networks; Plants/genetics; Plants/metabolism; Transcription Factors/metabolism; Binding Sites; Computational Biology/methods; Evolution, Molecular; Genomics/methods; Molecular Sequence Annotation; Nucleotide Motifs; Protein Binding; Web Browser; Workflow
7.
Nucleic Acids Res ; 45(W1): W12-W16, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28521017

ABSTRACT

With advances in next-generation sequencing technologies, numerous novel transcripts in a large number of organisms have been identified. With the goal of fast, accurate assessment of the coding ability of RNA transcripts, we upgraded the coding potential calculator CPC1 to CPC2. CPC2 runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts. Moreover, the model of CPC2 is species-neutral, making it feasible for ever-growing non-model organism transcriptomes. A mobile-friendly web server, as well as a downloadable standalone package, is freely available at http://cpc2.cbi.pku.edu.cn.


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, RNA/methods; Software; Algorithms; Animals; Gene Expression Profiling; Humans; Internet; Mice; RNA, Long Noncoding/chemistry; RNA, Messenger/chemistry; RNA, Small Untranslated/chemistry
8.
Nucleic Acids Res ; 43(W1): W85-90, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25977299

ABSTRACT

In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while producing fewer false-positive predictions. As an exclusive feature, ZCURVE 3.0 includes a post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use through its web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions.


Subject(s)
Genes, Archaeal; Genes, Bacterial; Software; Algorithms; Genes, Essential; Genome, Archaeal; Genome, Bacterial; Internet
9.
BMC Genomics ; 17(Suppl 13): 1023, 2016 12 22.
Article in English | MEDLINE | ID: mdl-28155723

ABSTRACT

BACKGROUND: The temporal and spatial-specific expression pattern of a transcript in multiple tissues and cell types can provide key clues about its function. While several gene atlases are available online as pre-computed databases for known gene models, it is still challenging to obtain expression profiles for previously uncharacterized (i.e. novel) transcripts efficiently. RESULTS: Here we developed LocExpress, a web server for efficiently estimating expression of novel transcripts across multiple tissues and cell types in human (20 normal tissues/cell types and 14 cell lines) as well as in mouse (24 normal tissues/cell types and nine cell lines). As a wrapper to an RNA-Seq quantification algorithm, LocExpress efficiently reduces the time cost by making abundance estimation calls increasingly within the minimum spanning bundle region of input transcripts. For a given novel gene model, this local context-oriented strategy allows LocExpress to estimate its FPKMs in hundreds of samples within minutes on a standard Linux box, making an online web server possible. CONCLUSIONS: To the best of our knowledge, LocExpress is the only web server to provide nearly real-time expression estimation for novel transcripts in common tissues and cell types. The server is publicly available at http://loc-express.cbi.pku.edu.cn.
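
The FPKM unit mentioned above can be illustrated with a minimal sketch of the standard definition (fragments per kilobase of transcript per million mapped fragments). This is not LocExpress code, and it does not reproduce the quantification algorithm that LocExpress wraps; the function name and the example counts are hypothetical.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Standard FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a 2,000 bp transcript
# in a library of 25 million mapped fragments.
example_fpkm = fpkm(500, 2000, 25_000_000)
print(example_fpkm)
```

Because the value is normalized by both transcript length and library size, estimates for one transcript can be compared across samples of different sequencing depth.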


Subject(s)
Databases, Nucleic Acid; Gene Expression Profiling/methods; Software; Web Browser; Algorithms; Animals; Humans; Mice; Transcription, Genetic; Transcriptome; User-Computer Interface; Workflow
10.
PNAS Nexus ; 2(5): pgad141, 2023 May.
Article in English | MEDLINE | ID: mdl-37181047

ABSTRACT

A plant can be thought of as a colony comprising numerous growth buds, each developing to its own rhythm. Such lack of synchrony impedes efforts to describe core principles of plant morphogenesis, dissect the underlying mechanisms, and identify regulators. Here, we use the minimalist known angiosperm to overcome this challenge and provide a model system for plant morphogenesis. We present a detailed morphological description of the monocot Wolffia australiana, as well as high-quality genome information. Further, we developed the plant-on-chip culture system and demonstrate the application of advanced technologies such as single-nucleus RNA-sequencing, protein structure prediction, and gene editing. We provide proof-of-concept examples that illustrate how W. australiana can decipher the core regulatory mechanisms of plant morphogenesis.

11.
Cell Host Microbe ; 30(8): 1124-1138.e8, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35908550

ABSTRACT

Constitutive activation of plant immunity is detrimental to plant growth and development. Here, we uncover the role of a long non-coding RNA (lncRNA) in fine-tuning the balance of plant immunity and growth. We find that a lncRNA termed salicylic acid biogenesis controller 1 (SABC1) suppresses immunity and promotes growth in healthy plants. SABC1 recruits the polycomb repressive complex 2 to its neighboring gene NAC3, which encodes a NAC transcription factor, to decrease NAC3 transcription via H3K27me3. NAC3 activates the transcription of isochorismate synthase 1 (ICS1), a key enzyme catalyzing salicylic acid (SA) biosynthesis. SABC1 thus represses SA production and plant immunity by decreasing NAC3 and ICS1 transcription. Upon pathogen infection, SABC1 is downregulated to derepress plant resistance to bacteria and viruses. Together, our findings reveal lncRNA SABC1 as a molecular switch in balancing plant defense and growth by modulating SA biosynthesis.


Subject(s)
Arabidopsis Proteins; Arabidopsis; RNA, Long Noncoding; Arabidopsis/microbiology; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; Gene Expression Regulation, Plant; Plant Diseases; Plant Immunity/physiology; Plants/genetics; RNA, Long Noncoding/genetics; Salicylic Acid
12.
Methods Mol Biol ; 2254: 111-131, 2021.
Article in English | MEDLINE | ID: mdl-33326073

ABSTRACT

While more than a hundred thousand long noncoding RNAs (lncRNAs) have been identified in the human genome, their biological functions and regulation are largely elusive. Here we present AnnoLnc, a one-stop online annotation portal for human lncRNAs (http://annolnc1.gao-lab.org/). As the first (and the most comprehensive) Web server to provide on-the-fly annotation for novel human lncRNAs, AnnoLnc exploits more than 700 data sources to annotate input lncRNAs systematically, spanning genomic location, secondary structure, expression patterns, coexpression-based functional annotation, transcriptional regulation, miRNA interaction, protein interaction, genetic association, and evolution. Moreover, in addition to a user-friendly Web interface, AnnoLnc can also be integrated into existing pipelines through either a set of JSON-based web service APIs or a stand-alone version for Linux servers.


Subject(s)
Molecular Sequence Annotation/methods; RNA, Long Noncoding/genetics; Databases, Nucleic Acid; Gene Expression Profiling; Gene Expression Regulation; Humans; Internet; Software; User-Computer Interface
13.
Nat Commun ; 11(1): 3458, 2020 07 10.
Article in English | MEDLINE | ID: mdl-32651388

ABSTRACT

Single-cell RNA-seq (scRNA-seq) is being used widely to resolve cellular heterogeneity. With the rapid accumulation of public scRNA-seq data, an effective and efficient cell-querying method is critical for the utilization of the existing annotations to curate newly sequenced cells. Such a querying method should be based on an accurate cell-to-cell similarity measure, and capable of handling batch effects properly. Herein, we present Cell BLAST, an accurate and robust cell-querying method built on a neural network-based generative model and a customized cell-to-cell similarity metric. Through extensive benchmarks and case studies, we demonstrate the effectiveness of Cell BLAST in annotating discrete cell types and continuous cell differentiation potential, as well as identifying novel cell types. Powered by a well-curated reference database and a user-friendly Web server, Cell BLAST provides the one-stop solution for real-world scRNA-seq cell querying and annotation.


Subject(s)
RNA-Seq/methods; Software; Algorithms; Machine Learning; Transcriptome/genetics
14.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30339214

ABSTRACT

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with strong genetic contributions. To provide a comprehensive resource for the genetic evidence of ASD, we have updated the Autism KnowledgeBase (AutismKB) to version 2.0. AutismKB 2.0 integrates multiscale genetic data on 1379 genes, 5420 copy number variations and structural variations, 11 669 single-nucleotide variations or small insertions/deletions (SNVs/indels) and 172 linkage regions. In particular, AutismKB 2.0 highlights 5669 de novo SNVs/indels due to their significant contribution to ASD genetics and includes 789 mosaic variants due to their recently discovered contributions to ASD pathogenesis. The genes and variants are annotated extensively with genetic evidence and clinical evidence. To help users fully understand the functional consequences of SNVs and small indels, we provide comprehensive predictions of pathogenicity with iFish, SIFT, Polyphen, etc. To improve the user experience, the new version incorporates multiple query methods, including simple query, advanced query and batch query. It also integrates two analytical tools to help users perform downstream analyses: a gene ranking tool and an enrichment analysis tool, KOBAS. AutismKB 2.0 is freely available and can be a valuable resource for researchers.


Subject(s)
Autism Spectrum Disorder/genetics; Knowledge Bases; Genetic Predisposition to Disease; Humans; Internet; Molecular Sequence Annotation; User-Computer Interface
15.
Zhonghua Jie He He Hu Xi Za Zhi ; 29(1): 31-4, 2006 Jan.
Article in Chinese | MEDLINE | ID: mdl-16638298

ABSTRACT

OBJECTIVE: To explore the application of serum surface-enhanced laser desorption/ionization (SELDI) marker patterns in distinguishing non-small cell lung cancer patients from healthy people by protein chip technology. METHODS: One hundred and sixty-three serum samples (123 from patients with lung cancer and 40 from healthy persons) were randomly divided into a training set [94 cases: 53 non-small cell lung cancer (NSCLC), 21 small cell lung cancer and 20 healthy persons] and a blinded test set (69 cases), and were analyzed by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS). Five protein peaks at 11,493, 6,429, 8,245, 5,336 and 2,536 were automatically chosen for system training and the development of a decision classification tree model (marker pattern). The accuracy of the model was tested with the blinded test set (an independent set of masked serum samples from 49 patients with NSCLC and 20 healthy persons). RESULTS: The model differentiated the patients with NSCLC from the healthy people with a sensitivity of 95.9% (71/74) and a specificity of 90.0% (18/20) in the training set, and with a sensitivity of 83.7% and a specificity of 80.0% in the blinded set. CONCLUSION: The SELDI-TOF-MS technique can correctly distinguish NSCLC patients from healthy people, and it has the potential for the development of a screening test for the detection of NSCLC.
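
The training-set percentages reported above follow directly from the counts given in parentheses. The short sketch below only re-derives them as an arithmetic check; the function and variable names are illustrative and do not come from the study. The blinded-set counts are not stated in the abstract, so they are not reconstructed here.

```python
def sensitivity(true_positives: int, diseased_total: int) -> float:
    """Fraction of diseased samples that the model classifies as diseased."""
    return true_positives / diseased_total


def specificity(true_negatives: int, healthy_total: int) -> float:
    """Fraction of healthy samples that the model classifies as healthy."""
    return true_negatives / healthy_total


# Counts as reported for the training set: 71/74 and 18/20.
train_sensitivity = sensitivity(71, 74)
train_specificity = specificity(18, 20)

print(f"training sensitivity: {train_sensitivity:.1%}")  # 95.9%
print(f"training specificity: {train_specificity:.1%}")  # 90.0%
```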


Subject(s)
Carcinoma, Non-Small-Cell Lung/blood; Lung Neoplasms/blood; Neoplasm Proteins/blood; Protein Array Analysis; Adult; Aged; Biomarkers, Tumor/blood; Carcinoma, Non-Small-Cell Lung/pathology; Case-Control Studies; Female; Humans; Lung Neoplasms/pathology; Male; Middle Aged; Neoplasm Staging; Protein Array Analysis/methods; Proteomics; Sensitivity and Specificity; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization