1.
Mol Biol Evol ; 40(5)2023 05 02.
Article in English | MEDLINE | ID: mdl-37140205

ABSTRACT

Gene loss is a prevalent source of genetic variation in genome evolution. Calling loss events effectively and efficiently is a critical step for systematically characterizing their functional and phylogenetic profiles genome wide. Here, we developed a novel pipeline integrating orthologous inference and genome alignment. Interestingly, we identified 33 gene loss events that give rise to evolutionarily novel long noncoding RNAs (lncRNAs) that show distinct expression features and could be associated with various functions related to growth, development, immunity, and reproduction, suggesting loss relics as a potential source of functional lncRNAs in humans. Our data also demonstrated that the rates of protein gene loss are variable among different lineages with distinct functional biases.


Subjects
Long Noncoding RNA, Humans, Long Noncoding RNA/genetics, Gene Expression Profiling, Phylogeny, Genome
2.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34849565

ABSTRACT

Gene transcription and protein translation are two key steps of the 'central dogma.' It is still a major challenge to quantitatively deconvolute factors contributing to the coding ability of transcripts in mammals. Here, we propose ribosome calculator (RiboCalc) for quantitatively modeling the coding ability of RNAs in the human genome. In addition to effectively predicting the experimentally confirmed coding abundance via sequence and transcription features with high accuracy, RiboCalc provides interpretable parameters with biological information. Large-scale analysis further revealed a number of transcripts whose coding ability varies across distinct types of cells (i.e. context-dependent coding transcripts), suggesting that, contrary to conventional wisdom, a transcript's coding ability should be modeled as a continuous spectrum with a context-dependent nature.


Subjects
Biological Models, Protein Biosynthesis, RNA, Genetic Transcription, Animals, Human Genome, Humans, Mammals/genetics, Mammals/metabolism, RNA/metabolism, Long Noncoding RNA/genetics, Ribosomes/genetics, Ribosomes/metabolism, Genetic Transcription/genetics
3.
Nucleic Acids Res ; 48(W1): W230-W238, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32406920

ABSTRACT

With the abundant mammalian lncRNAs identified recently, a comprehensive annotation resource for these novel lncRNAs is an urgent need. Since its first release in November 2016, AnnoLnc has been the only online server for comprehensively annotating novel human lncRNAs on-the-fly. Here, with significant updates to multiple annotation modules, backend datasets and the code base, AnnoLnc2 continues the effort to provide the scientific community with a one-stop online portal for systematically annotating novel human and mouse lncRNAs with a comprehensive functional spectrum covering sequences, structure, expression, regulation, genetic association and evolution. In response to numerous requests from multiple users, a standalone package is also provided for large-scale offline analysis. We believe that updated AnnoLnc2 (http://annolnc.gao-lab.org/) will help both computational and bench biologists identify lncRNA functions and investigate underlying mechanisms.


Subjects
Molecular Sequence Annotation, Long Noncoding RNA/chemistry, Long Noncoding RNA/metabolism, Software, Animals, Molecular Evolution, Gene Expression Regulation, Humans, Mice, Long Noncoding RNA/genetics
4.
Nucleic Acids Res ; 48(D1): D1104-D1113, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31701126

ABSTRACT

With the goal of charting plant transcriptional regulatory maps (i.e. transcription factors (TFs), cis-elements and interactions between them), we have upgraded the TF-centred database PlantTFDB (http://planttfdb.cbi.pku.edu.cn/) to a plant regulatory data and analysis platform PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) over the past three years. In this version, we updated the annotations for the previously collected TFs and set up a new section, 'extended TF repertoires' (TFext), to allow users prompt access to the TF repertoires of newly sequenced species. In addition to our regular TF updates, we are dedicated to updating the data on cis-elements and functional interactions between TFs and cis-elements. We established genome-wide conservation landscapes for 63 representative plants and then developed an algorithm, FunTFBS, to screen for functional regulatory elements and interactions by coupling the base-varied binding affinities of TFs with the evolutionary footprints on their binding sites. Using the FunTFBS algorithm and the conservation landscapes, we further identified over 20 million functional TF binding sites (TFBSs) and two million functional interactions for 21 346 TFs, charting the functional regulatory maps of these 63 plants. These resources are publicly available at PlantRegMap (http://plantregmap.cbi.pku.edu.cn/) and a cloud-based mirror (http://plantregmap.gao-lab.org/), providing the plant research community with valuable resources for decoding plant transcriptional regulatory systems.


Subjects
Computational Biology/methods, Genetic Databases, Plant Gene Expression Regulation, Plants/genetics, Genetic Transcription, Binding Sites, Chromosome Mapping, Molecular Evolution, Molecular Sequence Annotation, Phylogeny, Plants/metabolism, Protein Binding, Transcription Factors/metabolism, Web Browser
5.
Nucleic Acids Res ; 45(W1): W12-W16, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28521017

ABSTRACT

With advances in next-generation sequencing technologies, numerous novel transcripts in a large number of organisms have been identified. With the goal of fast, accurate assessment of the coding ability of RNA transcripts, we upgraded the coding potential calculator CPC1 to CPC2. CPC2 runs ∼1000 times faster than CPC1 and exhibits superior accuracy compared with CPC1, especially for long non-coding transcripts. Moreover, the model of CPC2 is species-neutral, making it feasible for ever-growing non-model organism transcriptomes. A mobile-friendly web server, as well as a downloadable standalone package, is freely available at http://cpc2.cbi.pku.edu.cn.


Subjects
High-Throughput Nucleotide Sequencing/methods, RNA Sequence Analysis/methods, Software, Algorithms, Animals, Gene Expression Profiling, Humans, Internet, Mice, Long Noncoding RNA/chemistry, Messenger RNA/chemistry, Small Untranslated RNA/chemistry
6.
Nucleic Acids Res ; 45(D1): D1040-D1045, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27924042

ABSTRACT

With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation that provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal, PlantRegMap (http://plantregmap.cbi.pku.edu.cn), for users to access the regulation resource and analysis tools conveniently.


Subjects
Genetic Databases, Plant Gene Expression Regulation, Gene Regulatory Networks, Plants/genetics, Plants/metabolism, Transcription Factors/metabolism, Binding Sites, Computational Biology/methods, Molecular Evolution, Genomics/methods, Molecular Sequence Annotation, Nucleotide Motifs, Protein Binding, Web Browser, Workflow
7.
Nucleic Acids Res ; 43(W1): W85-90, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25977299

ABSTRACT

In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also producing fewer false-positive predictions. As an exclusive feature, ZCURVE 3.0 includes a post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will be widely used through its web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions.


Subjects
Archaeal Genes, Bacterial Genes, Software, Algorithms, Essential Genes, Archaeal Genome, Bacterial Genome, Internet
8.
PNAS Nexus ; 2(5): pgad141, 2023 May.
Article in English | MEDLINE | ID: mdl-37181047

ABSTRACT

A plant can be thought of as a colony comprising numerous growth buds, each developing to its own rhythm. Such lack of synchrony impedes efforts to describe core principles of plant morphogenesis, dissect the underlying mechanisms, and identify regulators. Here, we use the minimalist known angiosperm to overcome this challenge and provide a model system for plant morphogenesis. We present a detailed morphological description of the monocot Wolffia australiana, as well as high-quality genome information. Further, we developed the plant-on-chip culture system and demonstrate the application of advanced technologies such as single-nucleus RNA-sequencing, protein structure prediction, and gene editing. We provide proof-of-concept examples that illustrate how W. australiana can decipher the core regulatory mechanisms of plant morphogenesis.

9.
Methods Mol Biol ; 2254: 111-131, 2021.
Article in English | MEDLINE | ID: mdl-33326073

ABSTRACT

While more than a hundred thousand long noncoding RNAs (lncRNAs) have been identified in the human genome, their biological functions and regulation are largely elusive. Here we present AnnoLnc, a one-stop online annotation portal for human lncRNAs (http://annolnc1.gao-lab.org/). As the first (and the most comprehensive) Web server to provide on-the-fly annotation for novel human lncRNAs, AnnoLnc exploits more than 700 data sources to annotate input lncRNAs systematically, spanning genomic location, secondary structure, expression patterns, coexpression-based functional annotation, transcriptional regulation, miRNA interaction, protein interaction, genetic association, and evolution. Moreover, in addition to a user-friendly Web interface, AnnoLnc can also be integrated into existing pipelines through either a set of JSON-based web service APIs or a stand-alone version for Linux servers.


Subjects
Molecular Sequence Annotation/methods, Long Noncoding RNA/genetics, Nucleic Acid Databases, Gene Expression Profiling, Gene Expression Regulation, Humans, Internet, Software, User-Computer Interface
10.
Nat Commun ; 11(1): 3458, 2020 07 10.
Article in English | MEDLINE | ID: mdl-32651388

ABSTRACT

Single-cell RNA-seq (scRNA-seq) is being used widely to resolve cellular heterogeneity. With the rapid accumulation of public scRNA-seq data, an effective and efficient cell-querying method is critical for the utilization of the existing annotations to curate newly sequenced cells. Such a querying method should be based on an accurate cell-to-cell similarity measure, and capable of handling batch effects properly. Herein, we present Cell BLAST, an accurate and robust cell-querying method built on a neural network-based generative model and a customized cell-to-cell similarity metric. Through extensive benchmarks and case studies, we demonstrate the effectiveness of Cell BLAST in annotating discrete cell types and continuous cell differentiation potential, as well as identifying novel cell types. Powered by a well-curated reference database and a user-friendly Web server, Cell BLAST provides the one-stop solution for real-world scRNA-seq cell querying and annotation.


Subjects
RNA-Seq/methods, Software, Algorithms, Machine Learning, Transcriptome/genetics
11.
Zhonghua Jie He He Hu Xi Za Zhi ; 29(1): 31-4, 2006 Jan.
Article in Chinese | MEDLINE | ID: mdl-16638298

ABSTRACT

OBJECTIVE: To explore the application of serum surface-enhanced laser desorption/ionization (SELDI) marker patterns in distinguishing non-small cell lung cancer patients from healthy people by protein chip technology. METHODS: One hundred and sixty-three serum samples (123 patients with lung cancer and 40 healthy persons) were randomly divided into a training set [94 cases: 53 non-small cell lung cancer (NSCLC), 21 small cell lung cancer and 20 healthy persons] and a blinded test set (69 cases), and were analyzed by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS). Five protein peaks at 11,493, 6,429, 8,245, 5,336 and 2,536 were automatically chosen for the system training and the development of a decision classification tree model (marker pattern). The accuracy of the model was tested with the blinded test set (an independent set of masked serum samples from 49 patients with NSCLC and 20 healthy persons). RESULTS: The model differentiated the patients with NSCLC from the healthy people with a sensitivity of 95.9% (71/74) and a specificity of 90.0% (18/20) in the training set, and with a sensitivity of 83.7% and a specificity of 80.0% in the blinded set. CONCLUSION: The SELDI-TOF-MS technique can correctly distinguish NSCLC patients from healthy people, and it has the potential for the development of a screening test for the detection of NSCLC.
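
The training-set percentages in the RESULTS follow directly from the counts given in parentheses (71/74 and 18/20). As a minimal sketch for checking that arithmetic, the snippet below recomputes them; the helper name `rate` is hypothetical and not part of the study, and the blinded-set counts are not reported in the abstract, so they are not reproduced here.

```python
def rate(hits: int, total: int) -> float:
    """Return hits/total as a percentage, rounded to one decimal place.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * hits / total, 1)


# Counts as reported for the training set in the abstract.
sensitivity = rate(71, 74)  # correctly classified patients / all patients
specificity = rate(18, 20)  # correctly classified healthy / all healthy

print(f"sensitivity = {sensitivity}%")  # 95.9%
print(f"specificity = {specificity}%")  # 90.0%
```

Sensitivity is the share of patient samples the model flags as positive, and specificity is the share of healthy samples it flags as negative, so each is a simple ratio of correct calls to the size of the corresponding group.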


Subjects
Non-Small-Cell Lung Carcinoma/blood, Lung Neoplasms/blood, Neoplasm Proteins/blood, Protein Array Analysis, Adult, Aged, Tumor Biomarkers/blood, Non-Small-Cell Lung Carcinoma/pathology, Case-Control Studies, Female, Humans, Lung Neoplasms/pathology, Male, Middle Aged, Neoplasm Staging, Protein Array Analysis/methods, Proteomics, Sensitivity and Specificity, Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry