Results 1 - 14 of 14
1.
BMC Bioinformatics ; 21(1): 331, 2020 Jul 23.
Article in English | MEDLINE | ID: mdl-32703148

ABSTRACT

BACKGROUND: A number of simulators have been developed for emulating next-generation sequencing data by incorporating known errors such as base substitutions and indels. However, their practicality may be limited by functional and runtime constraints. In particular, positional and genomic contextual information is not effectively used to characterize base substitution patterns, and the positional and contextual differences in Phred quality scores have not been fully investigated. Thus, a more effective and efficient bioinformatics tool is required. RESULTS: Here, we introduce a novel tool, SimuSCoP, to reliably emulate complex DNA sequencing data. The base substitution patterns and the statistical behavior of quality scores in Illumina sequencing data are fully explored and integrated into the simulation model for reliably emulating datasets for different applications. In addition, an integrated and easy-to-use pipeline is employed in SimuSCoP to facilitate end-to-end simulation of complex samples, and high runtime efficiency is achieved by implementing the tool to run with multithreading and low memory consumption. These features enable SimuSCoP to achieve substantial improvements in reliability, functionality, practicality and runtime efficiency. The tool is comprehensively evaluated in multiple aspects, including consistency of profiles and simulation of genomic variations and complex tumor samples, and the results demonstrate the advantages of SimuSCoP over existing tools. CONCLUSIONS: SimuSCoP, a new bioinformatics tool, is developed to learn informative profiles from real sequencing data and reliably mimic complex data by introducing various genomic variations. We believe that the presented work will catalyse new development of downstream bioinformatics methods for analyzing sequencing data.


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Computer Simulation; Genomics/methods; Reproducibility of Results
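The Phred quality scores mentioned in the abstract above follow a standard definition, Q = -10 log10(p), where p is the estimated probability that a base call is wrong. The sketch below only illustrates that definition; it is not taken from SimuSCoP. The function names are hypothetical, and the ASCII offset of 33 assumes the common Phred+33 FASTQ encoding.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a base-call error probability (0 < p <= 1) to a Phred score."""
    return -10 * math.log10(p)


def decode_quality_string(qual: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into integer Phred scores.

    The default offset of 33 is an assumption (Phred+33 encoding).
    """
    return [ord(ch) - offset for ch in qual]


if __name__ == "__main__":
    # A score of 30 corresponds to one expected error in 1000 base calls.
    print(phred_to_error_prob(30))
    print(decode_quality_string("I5!"))
```

A simulator that models position-dependent quality would typically work with per-cycle distributions of these scores rather than with single values; that modelling step is outside this sketch.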
2.
Bioinformatics ; 33(20): 3289-3291, 2017 Oct 15.
Article in English | MEDLINE | ID: mdl-28177064

ABSTRACT

SUMMARY: Next-generation sequencing has been widely applied to understand the complexity of non-coding RNAs (ncRNAs) in the last decades. Here, we present CPSS 2.0, an updated version of CPSS 1.0 for small RNA sequencing data analysis, with the following improvements: (i) a substantial increase of supported species from 10 to 48; (ii) improved strategies applied to detect ncRNAs; (iii) more ncRNAs can be detected and profiled, such as lncRNA and circRNA; (iv) identification of differentially expressed ncRNAs among multiple samples; (v) enhanced visualization interface containing graphs and charts in detailed analysis results. The new version of CPSS is an efficient bioinformatics tool for users in non-coding RNA research. AVAILABILITY AND IMPLEMENTATION: CPSS 2.0 is implemented in PHP + Perl + R and can be freely accessed at http://114.214.166.79/cpss2.0/. CONTACT: zyuanwei@ustc.edu.cn or qshi@ustc.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing/methods; RNA, Untranslated/genetics; Sequence Analysis, RNA/methods; Software; Animals; Computational Biology/methods; Eukaryota/genetics; Eukaryota/metabolism; Gene Expression Regulation; Humans
3.
Nucleic Acids Res ; 44(W1): W166-75, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27179030

ABSTRACT

Small RNA (sRNA) sequencing technology has revealed that microRNAs (miRNAs) frequently vary from their canonical sequences, generating multiple variants: the isoforms of miRNAs (isomiRs). However, an integrated tool to precisely detect and systematically annotate isomiRs from sRNA sequencing data is still in great demand. Here, we present an online tool, DeAnnIso (Detection and Annotation of IsomiRs from sRNA sequencing data). DeAnnIso can detect all the isomiRs in an uploaded sample, and can extract the differentially expressed isomiRs from paired or multiple samples. Once isomiR detection is complete, detailed annotation information, including isomiR expression, isomiR classification, SNPs in miRNAs and tissue-specific isomiR expression, is provided to users. Furthermore, DeAnnIso provides a comprehensive module of target analysis and enrichment analysis for the selected isomiRs. Taken together, DeAnnIso is convenient for users to screen for isomiRs of interest and useful for further functional studies. The server is implemented in PHP + Perl + R and available to all users for free at: http://mcg.ustc.edu.cn/bsc/deanniso/ and http://mcg2.ustc.edu.cn/bsc/deanniso/.


Subjects
MicroRNAs/genetics; Plants/genetics; RNA Isoforms/genetics; RNA, Small Cytoplasmic/genetics; RNA, Small Nuclear/genetics; Software; Animals; Computer Graphics; High-Throughput Nucleotide Sequencing; Humans; Internet; MicroRNAs/classification; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; RNA Isoforms/classification
4.
BMC Bioinformatics ; 18(1): 436, 2017 Oct 03.
Article in English | MEDLINE | ID: mdl-28974218

ABSTRACT

BACKGROUND: Copy number variations (CNVs) are the main genetic structural variations in cancer genomes. Detecting CNVs in exome regions is an efficient and cost-effective way to identify cancer-associated genes. Many tools have been developed for this purpose, yet they lack reliability because of a high false negative rate, which is intrinsically caused by exonic bias. RESULTS: To provide an alternative option, we report Anaconda, a comprehensive pipeline that allows flexible integration of multiple CNV-calling methods and systematic annotation of CNVs in the analysis of WES data. With a single command, Anaconda can generate CNV detection results from up to four CNV-detecting tools. Combined with comprehensive annotation analysis of genes involved in shared CNV regions, Anaconda is able to deliver a more reliable and useful report to assist CNV-associated cancer research. CONCLUSION: The Anaconda package and manual can be freely accessed at http://mcg.ustc.edu.cn/bsc/ANACONDA/ .


Subjects
Algorithms; DNA Copy Number Variations/genetics; Databases, Genetic; Exome Sequencing; Exome/genetics; Molecular Sequence Annotation; Neoplasms/genetics; Automation; Exons/genetics; Humans; Reproducibility of Results
5.
Bioinformatics ; 32(13): 2069-71, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27153728

ABSTRACT

UNLABELLED: Next-Generation Sequencing (NGS) technology has revealed that microRNAs (miRNAs) frequently differ from their corresponding mature reference sequences, generating multiple variants: the isoforms of miRNAs (isomiRs). These isomiRs mainly originate via imprecise and alternative cleavage during pre-miRNA processing and via post-transcriptional modifications that influence miRNA stability, sub-cellular localization and target selection. Although several tools for the identification of isomiRs have been reported, no bioinformatics resource dedicated to gathering isomiRs from public NGS data and providing functional analysis of these isomiRs is available to date. Thus, a free online database, IsomiR Bank, has been created to integrate isomiRs detected by our previously published algorithm CPSS. In total, 2727 samples (small RNA NGS data downloaded from ArrayExpress) from eight species (Arabidopsis thaliana, Drosophila melanogaster, Danio rerio, Homo sapiens, Mus musculus, Oryza sativa, Solanum lycopersicum and Zea mays) are analyzed. At present, 308 919 isomiRs from 4706 mature miRNAs are collected in IsomiR Bank. In addition, IsomiR Bank provides target prediction and enrichment analysis to evaluate the effects of isomiRs on target selection. AVAILABILITY AND IMPLEMENTATION: IsomiR Bank is implemented in PHP/PERL + MySQL + R and can be freely accessed at http://mcg.ustc.edu.cn/bsc/isomir/. CONTACT: aoli@ustc.edu.cn or qshi@ustc.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods; Databases, Protein; MicroRNAs/genetics; Algorithms; Animals; High-Throughput Nucleotide Sequencing; Humans; Plants/genetics
6.
Nucleic Acids Res ; 43(W1): W289-94, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-26013811

ABSTRACT

With the decrease in costs, whole-exome sequencing (WES) has become a very popular and powerful tool for the identification of genetic variants underlying human diseases. However, integrated tools to precisely detect and systematically annotate copy number variations (CNVs) from WES data are still in great demand. Here, we present an online tool, DeAnnCNV (Detection and Annotation of Copy Number Variations from WES data), to meet the current demands of WES users. Once the user submits the file generated from WES data by an in-house tool that can be downloaded from our server, DeAnnCNV can detect CNVs in each sample and extract the CNVs shared among multiple samples. DeAnnCNV also provides additional supporting information for the detected CNVs and associated genes to help users find potential candidates for further experimental study. The web server is implemented in PHP + Perl + MATLAB and is available online to all users for free at http://mcg.ustc.edu.cn/db/cnv/.


Subjects
DNA Copy Number Variations; Exome; High-Throughput Nucleotide Sequencing; Software; Humans; Infertility, Male/genetics; Internet; Male
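The abstract above does not describe how DeAnnCNV extracts CNVs shared among samples. One simple way to picture the idea is interval intersection: a region counts as shared if every sample has a call covering it. The sketch below makes several assumptions that are not from the source: calls are (chromosome, start, end) tuples with half-open coordinates, gain versus loss is ignored, and the function names are hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end); an assumption.
Region = Tuple[str, int, int]


def intersect_two(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the overlapping parts of two lists of CNV calls."""
    shared = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # keep only non-empty overlaps
                shared.append((chrom_a, start, end))
    return sorted(shared)


def shared_regions(samples: List[List[Region]]) -> List[Region]:
    """Return the regions covered by a CNV call in every sample."""
    if not samples:
        return []
    shared = samples[0]
    for calls in samples[1:]:
        shared = intersect_two(shared, calls)
    return shared


if __name__ == "__main__":
    sample_a = [("chr1", 100, 500), ("chr2", 50, 80)]
    sample_b = [("chr1", 300, 700)]
    sample_c = [("chr1", 250, 400), ("chr2", 60, 70)]
    print(shared_regions([sample_a, sample_b, sample_c]))
```

The nested loop is quadratic in the number of calls, which is acceptable for an illustration; a sweep over sorted intervals would scale better on real call sets.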
7.
Nucleic Acids Res ; 41(Database issue): D1055-62, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193286

ABSTRACT

Human infertility affects 10-15% of couples, and half of these cases are attributed to the male partner. Abnormal spermatogenesis is a major cause of male infertility. Characterizing the genes involved in spermatogenesis is fundamental to understanding the mechanisms underlying this biological process and to developing treatments for male infertility. Although many genes have been implicated in spermatogenesis, no dedicated bioinformatic resource for spermatogenesis is available. We have developed such a database, SpermatogenesisOnline 1.0 (http://mcg.ustc.edu.cn/sdap1/spermgenes/), using manual curation from 30 233 articles published before 1 May 2012. It provides detailed information for 1666 genes reported to participate in spermatogenesis in 37 organisms. Based on the analysis of these genes, we developed an algorithm, the Greed AUC Stepwise (GAS) model, which predicted 762 genes to participate in spermatogenesis (GAS probability >0.5) based on genome-wide transcriptional data in Mus musculus testis from the ArrayExpress database. These predicted and experimentally verified genes were annotated, with several identical spermatogenesis-related GO terms being enriched for both classes. Furthermore, protein-protein interaction analysis indicates direct interactions of predicted genes with the experimentally verified ones, which supports the reliability of GAS. The strategy (manual curation and data mining) used to develop SpermatogenesisOnline 1.0 can be easily extended to other biological processes.


Subjects
Databases, Genetic; Spermatogenesis/genetics; Animals; Cattle; Data Mining; Genomics; Humans; Internet; Male; Mice; Molecular Sequence Annotation; Rats; Transcriptome; User-Computer Interface
8.
Bioinformatics ; 28(14): 1925-7, 2012 Jul 15.
Article in English | MEDLINE | ID: mdl-22576177

ABSTRACT

UNLABELLED: Next generation sequencing (NGS) techniques have been widely used to document the small ribonucleic acids (RNAs) implicated in a variety of biological, physiological and pathological processes. An integrated computational tool is needed for handling and analysing the enormous datasets generated by small RNA deep sequencing. Herein, we present a novel web server, CPSS (a computational platform for the analysis of small RNA deep sequencing data), designed to completely annotate and functionally analyse microRNAs (miRNAs) from NGS data on one platform with a single data submission. Small RNA NGS data can be submitted to this server, and analysis results are returned in two parts: (i) annotation analysis, which provides the most comprehensive analysis of the small RNA transcriptome, including length distribution and genome mapping of sequencing reads, small RNA quantification, prediction of novel miRNAs, identification of differentially expressed miRNAs, piwi-interacting RNAs and other non-coding small RNAs between paired samples, and detection of miRNA editing and modifications; and (ii) functional analysis, including prediction of miRNA targeted genes by multiple tools, enrichment of gene ontology terms, signalling pathway involvement and protein-protein interaction analysis for the predicted genes. CPSS, a ready-to-use web server that integrates most functions of currently available bioinformatics tools, provides all the information wanted by the majority of users from small RNA deep sequencing datasets. AVAILABILITY: CPSS is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/db/cpss/index.html or http://mcg.ustc.edu.cn/sdap1/cpss/index.html.


Subjects
High-Throughput Nucleotide Sequencing/methods; MicroRNAs/genetics; Sequence Analysis, RNA/methods; Animals; Chromosome Mapping; Computational Biology/methods; Humans; Internet; Mice; Transcriptome
9.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29961820

ABSTRACT

PIWI-interacting RNAs (piRNAs) are essential for transcriptional and post-transcriptional regulation of transposons and coding genes in the germline. With the development of sequencing technologies, length variations of piRNAs have been identified in several species. However, the extent to which piRNA isoforms exist, and whether these isoforms are functionally distinct from canonical piRNAs, remain uncharacterized. Through data mining of 2154 small RNA sequencing datasets from four species (Homo sapiens, Mus musculus, Danio rerio and Drosophila melanogaster), we have identified 8 749 139 piRNA isoforms from 175 454 canonical piRNAs, and classified them on the basis of variations at the 5' or 3' end via alignment of the isoforms with the canonical sequence. We thus established a database named IsopiRBank. Each isoform has detailed annotation as follows: normalized expression data, classification, spatiotemporal expression data and genome origin. Users can also select isoforms of interest for further analysis, including target prediction and enrichment analysis. Taken together, IsopiRBank is an interactive database that aims to present the first integrated resource of piRNA isoforms and to broaden research on piRNA biology. IsopiRBank can be accessed at http://mcg.ustc.edu.cn/bsc/isopir/index.html without any registration or login requirement. Database URL: http://mcg.ustc.edu.cn/bsc/isopir/index.html.


Subjects
Databases, Genetic; RNA, Small Interfering/genetics; Animals; Base Sequence; Humans; Internet; Molecular Sequence Annotation; Protein Isoforms/genetics; User-Computer Interface
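The abstract above says isoforms are classified by variation at the 5' or 3' end after alignment with the canonical sequence, but gives no further detail. As a rough illustration of that idea only, the sketch below compares genomic coordinates of an isoform with those of its canonical piRNA. Everything here is an assumption rather than IsopiRBank's method: the function name, the inclusive coordinates, the plus-strand orientation and the label wording are all hypothetical.

```python
from typing import Tuple


def classify_isoform(canon_start: int, canon_end: int,
                     iso_start: int, iso_end: int) -> Tuple[int, int, str]:
    """Classify an isoform by its end offsets relative to the canonical piRNA.

    Coordinates are inclusive and assumed to lie on the plus strand.
    A positive offset means the isoform extends beyond the canonical end;
    a negative offset means it is trimmed.
    """
    five_prime = canon_start - iso_start
    three_prime = iso_end - canon_end
    if five_prime == 0 and three_prime == 0:
        label = "canonical"
    elif three_prime == 0:
        label = "5' variant"
    elif five_prime == 0:
        label = "3' variant"
    else:
        label = "5' and 3' variant"
    return five_prime, three_prime, label


if __name__ == "__main__":
    # Isoform that is two nucleotides longer at the 3' end.
    print(classify_isoform(100, 125, 100, 127))
    # Isoform trimmed by one nucleotide at both ends.
    print(classify_isoform(100, 125, 101, 124))
```

For reads on the minus strand the roles of start and end would swap, which this sketch deliberately leaves out.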
10.
Database (Oxford) ; 2017(1), 2017 01 01.
Article in English | MEDLINE | ID: mdl-28365721

ABSTRACT

Penicillium expansum, the causal agent of blue mold, is one of the most prevalent post-harvest pathogens, infecting a wide range of crops after harvest. In response, crops have evolved various defense systems to protect themselves against this and other pathogens. Penicillium-crop interaction is a multifaceted process mediated by pathogen- and host-derived proteins. Identification and characterization of the inter-species protein-protein interactions (PPIs) are fundamental to elucidating the molecular mechanisms underlying infection processes between P. expansum and plant crops. Here, we have developed PCPPI, the Penicillium-Crop Protein-Protein Interactions database, which is constructed based on the experimentally determined orthologous interactions in pathogen-plant systems and available domain-domain interactions (DDIs) in each PPI. Thus far, it stores information on 9911 proteins, 439 904 interactions and seven host species: apple, kiwifruit, maize, pear, rice, strawberry and tomato. Further analysis through gene ontology (GO) annotation indicated that proteins with more interacting partners tend to perform essential functions. Significantly, semantic statistics of the GO terms also provided strong support for the accuracy of our predicted interactions in PCPPI. We believe that the PCPPI datasets, which are freely available to the research community, will facilitate the study of pathogen-crop interactions. Database URL: http://bdg.hfut.edu.cn/pcppi/index.html.


Subjects
Crops, Agricultural/genetics; Databases, Protein; Fungal Proteins/genetics; Penicillium/genetics; Plant Proteins/genetics; Crops, Agricultural/metabolism; Crops, Agricultural/microbiology; Fungal Proteins/metabolism; Penicillium/metabolism; Plant Proteins/metabolism; Protein Domains
11.
Sci Rep ; 6: 25047, 2016 04 28.
Article in English | MEDLINE | ID: mdl-27121261

ABSTRACT

Protein-protein interactions (PPIs) are involved in almost all biological processes and form the basis of the entire interactomics systems of living organisms. Identification and characterization of these interactions are fundamental to elucidating the molecular mechanisms of signal transduction and metabolic pathways at both the cellular and systemic levels. Although a number of experimental and computational studies have been performed on model organisms, the studies exploring and investigating PPIs in tomatoes remain lacking. Here, we developed a Predicted Tomato Interactome Resource (PTIR), based on experimentally determined orthologous interactions in six model organisms. The reliability of individual PPIs was also evaluated by shared gene ontology (GO) terms, co-evolution, co-expression, co-localization and available domain-domain interactions (DDIs). Currently, the PTIR covers 357,946 non-redundant PPIs among 10,626 proteins, including 12,291 high-confidence, 226,553 medium-confidence, and 119,102 low-confidence interactions. These interactions are expected to cover 30.6% of the entire tomato proteome and possess a reasonable distribution. In addition, ten randomly selected PPIs were verified using yeast two-hybrid (Y2H) screening or a bimolecular fluorescence complementation (BiFC) assay. The PTIR was constructed and implemented as a dedicated database and is available at http://bdg.hfut.edu.cn/ptir/index.html without registration.


Subjects
Computational Biology/methods; Databases, Genetic; Protein Interaction Maps; Solanum lycopersicum/genetics
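The confidence tiers reported for PTIR in the abstract above can be checked with simple arithmetic: the high-, medium- and low-confidence counts should add up to the stated total of 357,946 non-redundant PPIs. The short script below does that check and prints each tier's share of the total. Only the four numbers come from the abstract; the variable names are illustrative.

```python
# Interaction counts per confidence tier, as stated in the PTIR abstract.
counts = {"high": 12_291, "medium": 226_553, "low": 119_102}
reported_total = 357_946

total = sum(counts.values())
# Fraction of all predicted interactions that falls in each tier.
shares = {level: n / total for level, n in counts.items()}

if __name__ == "__main__":
    print("tiers sum to reported total:", total == reported_total)
    for level, n in counts.items():
        print(f"{level}: {n} interactions ({shares[level]:.1%} of total)")
```

The tiers do sum to the reported total, so the three confidence classes partition the full set of predicted interactions.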
12.
Article in English | MEDLINE | ID: mdl-26656885

ABSTRACT

The Kiwifruit Information Resource (KIR) is dedicated to maintaining and integrating comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing and new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. A total of 9547 new transcripts are detected and 21 132 old transcripts are changed. In the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are freely available to the research community and can facilitate a broad range of kiwifruit research topics. Database URL: http://bdg.hfut.edu.cn/kir/index.html.


Subjects
Actinidia/genetics; Fruit/genetics; Genome, Plant; Genomics; Chromosomes, Plant/genetics; Databases, Genetic; Expressed Sequence Tags; Search Engine; Sequence Homology, Nucleic Acid; User-Computer Interface
13.
Article in English | MEDLINE | ID: mdl-25725058

ABSTRACT

Fruits represent a unique growth stage in the life cycle of higher plants. They provide essential nutrients and have beneficial effects on human health. Characterizing the genes involved in fruit development and ripening is fundamental to understanding the biological process and improving horticultural crops. Although numerous characterized genes participate in regulating fruit development and ripening at different stages, no dedicated bioinformatic resource for fruit development and ripening is available. In this study, we have developed such a database, FR database 1.0, using manual curation from 38 423 articles published before 1 April 2014, and integrating protein interactomes and several transcriptome datasets. It provides detailed information for 904 genes derived from 53 organisms reported to participate in fleshy fruit development and ripening. Genes from climacteric and non-climacteric fruits are also annotated, with several interesting Gene Ontology (GO) terms being enriched for these two gene sets and seven ethylene-related GO terms found only in the climacteric fruit group. Furthermore, protein-protein interaction analysis integrating information from FR database presents a possible functional network that affects fleshy fruit size formation. Collectively, FR database will be a valuable platform for comprehensive understanding and future experiments in fruit biology. Database URL: http://www.fruitech.org/


Subjects
Databases, Genetic; Fruit; Gene Expression Regulation, Plant/physiology; Genes, Plant/physiology; Plants; Transcriptome/physiology; Fruit/genetics; Fruit/metabolism; Humans; Plants/genetics; Plants/metabolism
14.
Database (Oxford) ; 2015: bav036, 2015.
Article in English | MEDLINE | ID: mdl-25931457

ABSTRACT

Folliculogenesis is an important part of ovarian function as it provides the oocytes for female reproductive life. Characterizing genes/proteins involved in folliculogenesis is fundamental to understanding the mechanisms associated with this biological function and to curing the diseases associated with folliculogenesis. A large number of genes/proteins associated with folliculogenesis have been identified from different species. However, no dedicated public resource is currently available for folliculogenesis-related genes/proteins that are validated by experiments. Here, we report a database, 'Follicle Online', that provides the experimentally validated gene/protein map of folliculogenesis in a number of species. Follicle Online is a web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome). The literature was manually curated from more than 43,000 articles published up to 1 March 2014. The Follicle Online database is implemented in PHP + MySQL + JavaScript, and this user-friendly web application provides access to the stored data. In summary, we have developed a centralized database that provides users with comprehensive information about genes/proteins involved in folliculogenesis. This database can be accessed freely and all the stored data can be viewed without any registration. Database URL: http://mcg.ustc.edu.cn/sdap1/follicle/index.php


Subjects
Databases, Genetic; Online Systems; Ovarian Follicle/metabolism; Ovulation; Polycystic Ovary Syndrome; Primary Ovarian Insufficiency; Animals; Cattle; Female; Humans; Mice; Ovarian Follicle/pathology; Polycystic Ovary Syndrome/genetics; Polycystic Ovary Syndrome/metabolism; Primary Ovarian Insufficiency/genetics; Primary Ovarian Insufficiency/metabolism; Rats