ABSTRACT
Analyzing all features of small non-coding RNA sequencing data can be demanding and challenging. To facilitate this process, we developed miRMaster. After the analysis of over 125 000 human samples and 1.5 trillion human small RNA reads over 4 years, we present miRMaster 2 with a wide range of updates and new features. We extended our reference data sets so that miRMaster 2 now supports the analysis of eight species (e.g. human, mouse, chicken, dog, cow) and 10 non-coding RNA classes (e.g. microRNAs, piRNAs, tRNAs, rRNAs, circRNAs). We also incorporated new downstream analysis modules, such as batch effect analysis and sample embeddings using UMAP, and updated the annotation databases included by default (miRBase, Ensembl, GtRNAdb). To accommodate the increasing popularity of single-cell small-RNA sequencing data, we incorporated a module for unique molecular identifier (UMI) processing. Further, the output tables and graphics have been improved based on user feedback, and new output formats that emerged in the community are now supported (e.g. miRGFF3). Finally, we integrated differential expression analysis with the miRNA enrichment analysis tool miEAA. miRMaster is freely available at https://www.ccb.uni-saarland.de/mirmaster2.
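As a rough illustration of what UMI processing involves, the sketch below collapses reads that share both sequence and UMI, since such reads are usually treated as PCR duplicates of one original molecule. It is not miRMaster's implementation: the function name `collapse_umis` and the input layout are assumptions, and UMI sequencing errors are ignored.

```python
from collections import Counter


def collapse_umis(reads):
    """Count distinct UMIs per read sequence.

    reads: iterable of (sequence, umi) tuples, one per sequenced read.
    Returns a Counter mapping each sequence to its number of distinct UMIs.
    (Hypothetical helper; exact-match collapsing only.)
    """
    unique_pairs = set(reads)  # identical (sequence, UMI) pairs collapse to one
    return Counter(seq for seq, _umi in unique_pairs)


reads = [
    ("TGAGGTAGTAGGTTGTATAGTT", "ACGT"),
    ("TGAGGTAGTAGGTTGTATAGTT", "ACGT"),  # same sequence and UMI: duplicate
    ("TGAGGTAGTAGGTTGTATAGTT", "TTAG"),
    ("TAGCTTATCAGACTGATGTTGA", "ACGT"),
]
counts = collapse_umis(reads)
```

Production pipelines typically also merge UMIs that differ by one mismatch; that step is left out here to keep the example short.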
Subjects
Small Untranslated RNA/chemistry , RNA Sequence Analysis/methods , Animals , Cattle , Dementia/genetics , Dogs , Humans , Mice , MicroRNAs , Small Untranslated RNA/metabolism , Rats , Software
ABSTRACT
MicroRNAs are regulators of gene expression. A widespread, yet not validated, assumption is that the targetome of miRNAs is non-randomly distributed across the transcriptome and that targets share functional pathways. We developed a computational and experimental strategy termed high-throughput miRNA interaction reporter assay (HiTmIR) to facilitate the validation of target pathways. First, targets and target pathways are predicted and prioritized by computational means to increase the specificity and positive predictive value. Second, the novel webtool miRTaH facilitates guided designs of reporter assay constructs at scale. Third, automated and standardized reporter assays are performed. We evaluated HiTmIR using miR-34a-5p, for which TNF- and TGFB-signaling and Parkinson's Disease (PD)-related categories were identified, and repeated the pipeline for miR-7-5p. HiTmIR validated 58.9% of the target genes for miR-34a-5p and 46.7% for miR-7-5p. We confirmed the targeting by measuring the endogenous protein levels of targets in a neuronal cell model. The standardized positive and negative targets are collected in the new miRATBase database, representing a resource for training or benchmarking new target predictors. Applied to 88 target predictors with different confidence scores, TargetScan 7.2 and miRanda outperformed other tools. Our experiments demonstrate the efficiency of HiTmIR and provide evidence for an orchestrated miRNA-gene targeting.
Subjects
Gene Expression Regulation/genetics , High-Throughput Screening Assays , MicroRNAs/genetics , 1-Methyl-4-phenylpyridinium , 3' Untranslated Regions , Cell Line , Tumor Cell Line , Reporter Genes , Humans , Mesencephalon/cytology , Neuroblastoma/pathology , Neurons/metabolism , Parkinson Disease/genetics , Predictive Value of Tests , Sensitivity and Specificity , Signal Transduction , Transcriptome , Transforming Growth Factor beta/physiology , Tumor Necrosis Factor-alpha/physiology
ABSTRACT
MOTIVATION: Since the initial discovery of microRNAs as post-transcriptional, regulatory key players in the 1990s, a total of 2656 mature microRNAs have been publicly described for Homo sapiens. As the discovery of new miRNAs is still ongoing, target identification remains an essential and challenging step preceding functional annotation analysis. One key challenge for researchers seems to be the selection of the most appropriate tool out of the larger multiverse of published solutions for a given research study set-up. RESULTS: In this review we collectively describe the field of in silico target prediction over time and point out long-standing principles as well as recent developments. By compiling a catalog of characteristics of 98 prediction methods and identifying common and exclusive traits, we signpost a simplified mechanism to address the problem of application selection. Going further, we devised interpretation strategies for common types of output as generated by frequently used computational methods. To this end, our work specifically aims to make prospective users aware of common mistakes and practical questions that arise during the application of target prediction tools. AVAILABILITY: An interactive implementation of our recommendations, including materials shown in the manuscript, is freely available at https://www.ccb.uni-saarland.de/mtguide.
Subjects
Computational Biology , Computer Simulation , Gene Expression Regulation , MicroRNAs , Computational Biology/methods , Prospective Studies , Software
ABSTRACT
Since the initial release of miRPathDB, tremendous progress has been made in the field of microRNA (miRNA) research. New miRNA reference databases have emerged, a vast amount of new miRNA candidates has been discovered and the number of experimentally validated target genes has increased considerably. Hence, the demand for a major upgrade of miRPathDB, including extended analysis functionality and intuitive visualizations of query results has emerged. Here, we present the novel release 2.0 of the miRNA Pathway Dictionary Database (miRPathDB) that is freely accessible at https://mpd.bioinf.uni-sb.de/. miRPathDB 2.0 comes with a ten-fold increase of pre-processed data. In total, the updated database provides putative associations between 27 452 (candidate) miRNAs, 28 352 targets and 16 833 pathways for Homo sapiens, as well as interactions of 1978 miRNAs, 24 898 targets and 6511 functional categories for Mus musculus. Additionally, we analyzed publications citing miRPathDB to identify common use-cases and further extensions. Based on this evaluation, we added new functionality for interactive visualizations and down-stream analyses of bulk queries. In summary, the updated version of miRPathDB, with its new custom-tailored features, is one of the most comprehensive and advanced resources for miRNAs and their target pathways.
Subjects
Nucleic Acid Databases , Gene Expression Regulation , MicroRNAs/metabolism , Animals , Humans , Mice , User-Computer Interface
ABSTRACT
Arm selection, the preferential expression of a 3' or 5' mature microRNA (miRNA), is a highly dynamic and tissue-specific process. Time-dependent expression shifts or switches between the arms are also relevant for human diseases. We present miRSwitch, a web server to facilitate the analysis and interpretation of arm selection events. Our species-independent tool evaluates pre-processed small non-coding RNA sequencing (sncRNA-seq) data, i.e. expression matrices or output files from miRNA quantification tools (miRDeep2, miRMaster, sRNAbench). miRSwitch highlights potential changes in the distribution of mature miRNAs from the same precursor. Group comparisons from one or several user-provided annotations (e.g. disease states) are possible. Results can be dynamically adjusted by choosing from a continuous range of highly specific to very sensitive parameters. Users can compare potential arm shifts in the provided data to a human reference map of pre-computed arm shift frequencies. We created this map from 46 tissues and 30 521 samples. As case studies we present novel arm shift information in an Alzheimer's disease biomarker data set and from a comparison of tissues in Homo sapiens and Mus musculus. In summary, miRSwitch offers a broad range of customized arm switch analyses along with comprehensive visualizations, and is freely available at: https://www.ccb.uni-saarland.de/mirswitch/.
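One simple way to quantify arm selection from an expression matrix is the fraction of a precursor's reads that come from the 5p arm, compared between two sample groups. The sketch below illustrates that idea only; the function names and the choice of a mean difference are assumptions and do not describe how miRSwitch scores arm shifts.

```python
def arm_fraction_5p(count_5p, count_3p):
    """Fraction of a precursor's mature reads that come from the 5p arm.

    Returns None when neither arm is expressed (fraction undefined).
    """
    total = count_5p + count_3p
    if total == 0:
        return None
    return count_5p / total


def arm_shift(group_a, group_b):
    """Difference in mean 5p fraction between two groups (b minus a).

    Each group is a list of (count_5p, count_3p) pairs, one per sample.
    Samples without any expression are skipped. (Hypothetical helper.)
    """
    def mean_fraction(group):
        fractions = [f for f in (arm_fraction_5p(*s) for s in group) if f is not None]
        return sum(fractions) / len(fractions) if fractions else None

    mean_a, mean_b = mean_fraction(group_a), mean_fraction(group_b)
    if mean_a is None or mean_b is None:
        return None
    return mean_b - mean_a


controls = [(90, 10), (80, 20)]   # 5p arm dominates
patients = [(20, 80), (30, 70)]   # 3p arm dominates
shift = arm_shift(controls, patients)
```

A strongly negative value, as in this toy example, would indicate a move from 5p dominance towards 3p dominance between the groups.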
Subjects
MicroRNAs/metabolism , Software , Alzheimer Disease/genetics , Animals , Humans , Mice , MicroRNAs/chemistry , RNA Precursors/metabolism , RNA Sequence Analysis
ABSTRACT
Gene set enrichment analysis has become one of the most frequently used applications in molecular biology research. Originally developed for gene sets, the same statistical principles are now available for all omics types. In 2016, we published the miRNA enrichment analysis and annotation tool (miEAA) for human precursor and mature miRNAs. Here, we present miEAA 2.0, supporting miRNA input from ten frequently investigated organisms. To facilitate inclusion of miEAA in workflow systems, we implemented an Application Programming Interface (API). Users can perform miRNA set enrichment analysis using either the web-interface, a dedicated Python package, or custom remote clients. Moreover, the number of category sets was raised by an order of magnitude. We implemented novel categories like annotation confidence level or localisation in biological compartments. In combination with the miRBase miRNA-version and miRNA-to-precursor converters, miEAA supports research settings where older releases of miRBase are in use. The web server also offers novel comprehensive visualizations such as heatmaps and running sum curves with background distributions. We demonstrate the new features with case studies for human kidney cancer, a biomarker study on Parkinson's disease from the PPMI cohort, and a mouse model for breast cancer. The tool is freely accessible at: https://www.ccb.uni-saarland.de/mieaa2.
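The running sum curves mentioned above can be illustrated with an unweighted Kolmogorov-Smirnov-style statistic: walking down a ranked miRNA list, the sum rises at every category member and falls otherwise. This is a generic sketch under that assumption; it does not call the miEAA API, and `running_sum` is a hypothetical helper.

```python
def running_sum(ranked, category):
    """Unweighted running-sum curve for a ranked list and a category set.

    Each category member adds (n - hits); every other entry subtracts hits,
    so the curve always ends at zero. Returns one value per list position.
    """
    n = len(ranked)
    hits = sum(1 for item in ranked if item in category)
    if hits == 0 or hits == n:
        return [0] * n  # no enrichment signal can be computed
    step_up, step_down = n - hits, hits
    curve, total = [], 0
    for item in ranked:
        total += step_up if item in category else -step_down
        curve.append(total)
    return curve


ranked_mirnas = ["miR-a", "miR-b", "miR-c", "miR-d"]   # placeholder names
category = {"miR-a", "miR-b"}
curve = running_sum(ranked_mirnas, category)
enrichment_score = max(curve, key=abs)  # largest deviation from zero
```

Category members concentrated at the top of the list push the curve to a high positive peak, which is what the plotted curve makes visible.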
Subjects
MicroRNAs/metabolism , Software , Animals , Biomarkers , Breast Neoplasms/genetics , Renal Cell Carcinoma/genetics , Disease Progression , Female , Humans , Kidney Neoplasms/genetics , Mice , Parkinson Disease/genetics , Workflow
ABSTRACT
Modern precision medicine comprises the knowledge and understanding of individual differences in the genomic sequence of patients to provide tailor-made treatments. Regularly, such variants are considered in coding regions only, and their effects are predicted based on their impact on the amino acid sequence of expressed proteins. However, assessing the effects of variants in noncoding elements, in particular microRNAs (miRNAs) and their binding sites, is important as well, as a single miRNA can influence the expression patterns of many genes at the same time. To analyze the effects of variants in miRNAs and their target sites, several databases storing variant impact predictions have been published. In this review, we will compare the core functionalities and features of these databases and discuss the importance of up-to-date data resources in the context of web applications. Finally, we will outline some recommendations for future developments in the field.
Subjects
Genetic Databases , MicroRNAs/genetics , Single Nucleotide Polymorphism , Binding Sites , Humans , MicroRNAs/metabolism
ABSTRACT
The study of bacterial isolates or communities requires the analysis of the plasmids they contain in order to provide an extensive characterization of the organisms. Plasmids harboring resistance and virulence factors are of special interest as they contribute to the dissemination of antibiotic resistance. As the number of newly sequenced bacterial genomes is growing, a comprehensive resource is required that allows users to browse and filter the available plasmids and to perform sequence analyses. Here, we present PLSDB, a resource containing 13 789 plasmid records collected from the NCBI nucleotide database. The web server provides an interactive view of all obtained plasmids with additional meta information such as sequence characteristics, sample-related information and taxonomy. Moreover, nucleotide sequence data can be uploaded to search for short nucleotide sequences (e.g. specific genes) in the plasmids, to compare a given plasmid to the records in the collection or to determine whether a sample contains one or more of the known plasmids (containment analysis). The resource is freely accessible at https://ccb-microbe.cs.uni-saarland.de/plsdb/.
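The idea behind a containment analysis can be sketched as the fraction of a plasmid's k-mers that also occur in the sample. The code below is a toy version under that assumption: it uses exact k-mer sets, ignores reverse complements and sketching, and is not the method PLSDB runs; `kmers` and `containment` are hypothetical names.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(plasmid, sample, k=4):
    """Fraction of the plasmid's k-mers that are present in the sample.

    1.0 means every plasmid k-mer was found; 0.0 means none were.
    """
    plasmid_kmers = kmers(plasmid, k)
    if not plasmid_kmers:
        return 0.0  # plasmid shorter than k
    shared = plasmid_kmers & kmers(sample, k)
    return len(shared) / len(plasmid_kmers)


score_present = containment("ACGTAC", "TTACGTACGG")
score_absent = containment("AAAAAA", "CCCCCC")
```

Unlike a symmetric similarity such as the Jaccard index, this score stays high when a small plasmid is embedded in a much larger sample, which is the situation the containment question is about.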
Subjects
Computational Biology/methods , Bacterial DNA , Genetic Databases , Plasmids/genetics , Molecular Sequence Annotation , Software , User-Computer Interface , Web Browser
ABSTRACT
While the number of human miRNA candidates continuously increases, only a few of them are completely characterized and experimentally validated. Toward determining the total number of true miRNAs, we employed a combined in silico high- and experimental low-throughput validation strategy. We collected 28 866 human small RNA sequencing data sets containing 363.7 billion sequencing reads and excluded falsely annotated and low quality data. Our high-throughput analysis identified 65% of 24 127 mature miRNA candidates as likely false-positives. Using northern blotting, we experimentally validated miRBase entries and novel miRNA candidates. By exogenous overexpression of 108 precursors that encode 205 mature miRNAs, we confirmed 68.5% of the miRBase entries with the confirmation rate going up to 94.4% for the high-confidence entries and 18.3% of the novel miRNA candidates. Analyzing endogenous miRNAs, we verified the expression of 8 miRNAs in 12 different human cell lines. In total, we extrapolated 2300 true human mature miRNAs, 1115 of which are currently annotated in miRBase V22. The experimentally validated miRNAs will contribute to revising targetomes hypothesized by utilizing falsely annotated miRNAs.
Subjects
Computer Simulation , MicroRNAs/analysis , MicroRNAs/genetics , RNA Sequence Analysis , Northern Blotting , Cell Line , Datasets as Topic , False Positive Reactions , Humans , MicroRNAs/isolation & purification , Molecular Sequence Annotation , RNA Precursors/analysis , RNA Precursors/genetics , Reproducibility of Results
ABSTRACT
The repertoire of small noncoding RNAs (sncRNAs), particularly miRNAs, in animals is considered to be evolutionarily conserved. Studies on sncRNAs are often largely based on homology-based information, relying on genomic sequence similarity and excluding actual expression data. To obtain information on sncRNA expression (including miRNAs, snoRNAs, YRNAs and tRNAs), we performed low-input-volume next-generation sequencing of 500 pg of RNA from 21 animals at two German zoological gardens. Notably, none of the species under investigation were previously annotated in any miRNA reference database. Sequencing was performed on blood cells as they are amongst the most accessible, stable and abundant sources of the different sncRNA classes. We evaluated and compared the composition and nature of sncRNAs across the different species by computational approaches. While the distribution of sncRNAs in the different RNA classes varied significantly, general evolutionary patterns were maintained. In particular, miRNA sequences and expression were found to be even more conserved than previously assumed. To make the results available for other researchers, all data, including expression profiles at the species and family levels, and different tools for viewing, filtering and searching the data are freely available in the online resource ASRA (Animal sncRNA Atlas) at https://www.ccb.uni-saarland.de/asra/.
Subjects
Zoo Animals/genetics , Cell-Free Nucleic Acids/genetics , Computational Biology , Small Untranslated RNA/genetics , Animals , Cell-Free Nucleic Acids/classification , Genome/genetics , Germany , MicroRNAs/genetics , Small Nucleolar RNA/genetics , Small Untranslated RNA/classification , Transfer RNA/genetics
ABSTRACT
Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data, thereby relying uncritically on the available taxonomic information. Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were evaluated using a data set of 200 Staphylococcus aureus isolates. CLARK and Kraken (with k = 31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated the highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 in terms of runtime. We conclude that the application of nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
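The F-measure reported above combines precision and recall into a single score. The sketch below assumes the balanced F1 form (harmonic mean of precision and recall); the abstract does not state which variant or which averaging over taxa was used, so this is an illustration rather than the study's exact computation.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from classification counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = 2 * precision * recall / (precision + recall).
    Returns 0.0 whenever a denominator would be zero.
    """
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy counts for one taxon: 8 correct calls, 2 wrong calls, 2 missed isolates.
score = f_measure(true_pos=8, false_pos=2, false_neg=2)
```

A mean F-measure as quoted in the abstract would then be an average of such per-class scores, under whatever weighting the authors chose.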
Subjects
Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Bacterial Genome , Gram-Negative Bacteria/genetics , Gram-Negative Bacteria/metabolism , Proteome , Whole Genome Sequencing/methods , Gram-Negative Bacteria/isolation & purification , Humans
ABSTRACT
MOTIVATION: Breast cancer is the second leading cause of cancer death among women. Tumors, even of the same histopathological subtype, exhibit a high genotypic diversity that impedes therapy stratification and that hence must be accounted for in the treatment decision-making process. RESULTS: Here, we present ClinOmicsTrailbc, a comprehensive visual analytics tool for breast cancer decision support that provides a holistic assessment of standard-of-care targeted drugs, candidates for drug repositioning and immunotherapeutic approaches. To this end, our tool analyzes and visualizes clinical markers and (epi-)genomics and transcriptomics datasets to identify and evaluate the tumor's main driver mutations, the tumor mutational burden, activity patterns of core cancer-relevant pathways, drug-specific biomarkers, the status of molecular drug targets and pharmacogenomic influences. In order to demonstrate ClinOmicsTrailbc's rich functionality, we present three case studies highlighting various ways in which ClinOmicsTrailbc can support breast cancer precision medicine. ClinOmicsTrailbc is a powerful integrated visual analytics tool for breast cancer research in general and for therapy stratification in particular, assisting oncologists to find the best possible treatment options for their breast cancer patients based on actionable, evidence-based results. AVAILABILITY AND IMPLEMENTATION: ClinOmicsTrailbc can be freely accessed at https://clinomicstrail.bioinf.uni-sb.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Breast Neoplasms , Breast , Computational Biology , Female , Genomics , Humans , Precision Medicine
ABSTRACT
BACKGROUND: The use of electronic cigarettes (ECIGs) is increasing, but the impact of ECIG-vapor on cellular processes like inflammation or host defense is less well understood. The aim of the present study was to compare the acute effects of traditional cigarettes (TCIGs) and ECIG-exposure on host defense, inflammation, and cellular activation of cell lines and primary differentiated human airway epithelial cells (pHBE). METHODS: We exposed pHBEs and several cell lines to TCIG-smoke or ECIG-vapor. Epithelial host defense and barrier integrity were determined. The transcriptome of airway epithelial cells was compared by gene expression array analysis. Gene interaction networks were constructed and differential gene expression over all groups was analyzed. The expression of several candidate genes was validated by qRT-PCR. RESULTS: Bacterial killing, barrier integrity and the expression of antimicrobial peptides were not affected by ECIG-vapor compared to control samples. In contrast, TCIGs significantly impaired host defense and reduced barrier integrity. Furthermore, ECIG-exposure significantly induced IL-8 secretion from Calu-3 cells but had no effect on NCI-H292 or primary cells. The gene expression based on array analysis distinguished TCIG-exposed cells from ECIG and room air-exposed samples. CONCLUSION: The transcriptome patterns of host defense and inflammatory genes are significantly distinct between ECIG-exposed and TCIG-treated cells. The overall effects of ECIGs on epithelial cells are smaller than those of TCIGs, and ECIG-vapor does not affect host defense. Nevertheless, although acute exposure to ECIG-vapor induces inflammation and the expression of S100 proteins, long-term in vivo data are needed to evaluate the chronic effects of ECIG use.
Subjects
Cigarette Smoking/adverse effects , Electronic Nicotine Delivery Systems , Inflammation Mediators/metabolism , Respiratory Mucosa/metabolism , Tobacco Smoke Pollution/adverse effects , Vaping/adverse effects , Tumor Cell Line , Cultured Cells , Humans , Inflammation Mediators/agonists , Respiratory Mucosa/drug effects
ABSTRACT
MicroRNAs are regulators of gene expression and may be key markers in liquid biopsy. Early diagnosis is an effective means to increase patients' overall survival. We generated genome-wide miRNA profiles from serum of patients and controls from the population-based Janus Serum Bank (JSB) and analysed them by bioinformatics and artificial intelligence approaches. JSB contains sera from 318,628 originally healthy persons, more than 96,000 of whom developed cancer. We selected 210 serum samples from patients with lung, colon or breast cancer at three time points prior to diagnosis (up to 32 years prior to diagnosis, with a median interval of 5 years between time points), one time point after diagnosis and from individually matched controls. The controls were matched on age and year of all pre-diagnostic sampling time points for the corresponding case. Using ANOVA, we report 70 significantly deregulated markers (adjusted p-value < 0.05). The driver for the significance was the diagnostic time point (miR-575, miR-6821-5p, miR-630 with adjusted p-values < 10^-10). Further, 91 miRNAs were differentially expressed in pre-diagnostic samples as compared to controls (nominal p < 0.05). Self-organizing maps (SOMs) indicated the largest effects in lung cancer samples, while breast cancer samples showed the least pronounced changes. SOMs also highlighted cancer- and time-point-specific miRNA dys-regulation. Intriguingly, a detailed breakdown of the results highlighted that 51% of all miRNAs were highly specific, either for a time point or a cancer entity. Pathway analysis highlighted 12 pathways including Hippo signalling and ABC transporters. Our results indicate that tumours may be indicated by serum miRNAs decades prior to the clinical manifestation.
Subjects
Tumor Biomarkers , Circulating MicroRNA , Computational Biology/methods , MicroRNAs/genetics , Neoplasms/diagnosis , Neoplasms/genetics , Artificial Intelligence , Early Detection of Cancer , Humans , Liquid Biopsy/methods , Liquid Biopsy/standards , Neoplasms/blood
ABSTRACT
Web repositories for almost all 'omics' types have been generated, detailing the repertoire of representatives across different tissues or cell types. A logical next step is the combination of these valuable sources. With IMOTA (interactive multi omics tissue atlas), we developed a database that includes 23 725 relations between miRNAs and 23 tissues, 310 932 relations between mRNAs and the same tissues as well as 63 043 relations between proteins and the 23 tissues in Homo sapiens. IMOTA also contains data on tissue-specific interactions, e.g. information on 331 413 miRNA and target gene pairs that are jointly expressed in the considered tissues. Intuitive filter and visualization techniques make it possible to answer various questions with minimal effort. These include rather general questions but also requests specific to genes, miRNAs or proteins. An example of a general task could be 'identify all miRNAs, genes and proteins in the lung that are highly expressed and where experimental evidence proves that the miRNAs target the genes'. An example of a specific request for a gene and a miRNA could be 'In which tissues are miR-34c and its target gene BCL2 expressed?'. The IMOTA repository is freely available online at https://ccb-web.cs.uni-saarland.de/imota/.
Subjects
Computational Biology , Genetic Databases , Gene Expression Regulation , MicroRNAs/genetics , Epigenomics , Forecasting , Human Genome , High-Throughput Screening Assays , Humans , Organ Specificity
ABSTRACT
The continuous increase of available biological data as a consequence of modern high-throughput technologies poses new challenges for analysis techniques and database applications. Especially for miRNAs, one class of small non-coding RNAs, many algorithms have been developed to predict new candidates from next-generation sequencing data. While the number of publications describing novel miRNA candidates keeps steadily increasing, the current gold standard database for miRNAs, miRBase, has not been updated since June 2014. As a result, publications describing new miRNA candidates in the last three to five years might have a substantial, unnoticed overlap of candidates. With miRCarta we implemented a database to collect novel miRNA candidates and augment the information provided by miRBase. In the first stage, miRCarta is intended to be a highly sensitive collection of potential miRNA candidates with a high degree of analysis functionality, annotations and details on each miRNA. Besides the full content of miRBase, we added 12,857 human miRNA precursors to miRCarta. Users can match their own predictions to the entries of miRCarta to reduce potential redundancies in their studies. miRCarta provides the most comprehensive collection of human miRNAs and miRNA candidates to form a basis for further refinement and validation studies. The database is freely accessible at https://mircarta.cs.uni-saarland.de/.
Subjects
Nucleic Acid Databases , MicroRNAs/genetics , Animals , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/chemistry , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA Precursors/chemistry , RNA Precursors/genetics , RNA Sequence Analysis
ABSTRACT
MicroRNAs (miRNAs) have recently received a significant amount of attention due to their remarkable influence on post-transcriptional gene regulation. In this study, we aim to provide a catalogue of miRNAs present in spermatozoa, seminal plasma and testicular tissue. Expression profiles of miRNAs in spermatozoa and seminal plasma of 16 proven fertile men and testicular tissue of eight men with morphologically and/or histologically confirmed obstructive azoospermia were determined by microarray and RT-qPCR in combination with bioinformatics analyses. A total of 123, 156 and 133 miRNAs were consistently detected in spermatozoa, seminal plasma and testicular tissue, respectively. Sixty-four miRNAs were shared across all sample types. Based on the miRNA expression levels in each group, correlation analysis showed moderate-to-strong correlations within the spermatozoa and seminal plasma samples and a wider range of correlations within the testicular tissue samples. The target genes of known miRNAs appeared to be involved in a wide range of biological processes related to reproduction, development and differentiation of germ cells. Our results suggest that there is a certain similarity between spermatozoa and seminal plasma for the relative miRNA expression changes with respect to testicular tissue and provide an overview of the miRNAs present in each sample type.
Subjects
Azoospermia/genetics , Fertility , MicroRNAs/genetics , Semen/metabolism , Spermatozoa/metabolism , Testis/metabolism , Adult , Azoospermia/metabolism , Case-Control Studies , Humans , Male , Oligonucleotide Array Sequence Analysis , Real-Time Polymerase Chain Reaction , Testis/chemistry , Young Adult
ABSTRACT
BACKGROUND: In many research disciplines, ordered lists are compared. One example is to compare a subset of all significant genes or proteins in a primary study to those in a replication study. Often, the tops of the lists are compared using Venn diagrams, or more precisely Euler diagrams (set diagrams showing logical relations between a finite collection of different sets). If different cohort sizes, different techniques or different algorithms for evaluation were applied, a direct comparison of significant genes with a fixed threshold can, however, be misleading, and approaches comparing lists would be more appropriate. RESULTS: We developed DynaVenn, a web-based tool that incrementally creates all possible subsets from two or three ordered lists and computes for each combination a p-value for the overlap. Correspondingly, dynamic Venn diagrams are generated as graphical representations. Additionally, an animation is generated showing how the most significant overlap is reached by backtracking. We demonstrate the improved performance of DynaVenn over an arbitrary cut-off approach on an Alzheimer's Disease biomarker set. CONCLUSION: DynaVenn combines the calculation of the most significant overlap of different cohorts with an intuitive visualization of the results. It is freely available as a web service at http://www.ccb.uni-saarland.de/dynavenn.
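The incremental comparison described above can be sketched for two lists: every pair of top-i and top-j prefixes is tested, and the pair with the smallest overlap p-value is kept. The abstract does not name the test, so the hypergeometric tail used here is an assumption, as are the function names and the `universe` parameter (the number of items that could appear in either list).

```python
from math import comb


def overlap_p(universe, n1, n2, overlap):
    """P(X >= overlap) when n2 items are drawn from `universe` items,
    n1 of which are marked (hypergeometric upper tail)."""
    tail = sum(
        comb(n1, k) * comb(universe - n1, n2 - k)
        for k in range(overlap, min(n1, n2) + 1)
    )
    return tail / comb(universe, n2)


def best_prefix_overlap(list_a, list_b, universe):
    """Scan all prefix pairs; return (p_value, i, j, shared) of the best one."""
    best = None
    for i in range(1, len(list_a) + 1):
        top_a = set(list_a[:i])
        for j in range(1, len(list_b) + 1):
            shared = len(top_a & set(list_b[:j]))
            p = overlap_p(universe, i, j, shared)
            if best is None or p < best[0]:
                best = (p, i, j, shared)
    return best


primary = ["g1", "g2", "g3", "g4"]       # ranked genes, primary study
replication = ["g2", "g1", "g9", "g8"]   # ranked genes, replication study
p_value, i, j, shared = best_prefix_overlap(primary, replication, universe=100)
```

Because the minimum is taken over many prefix pairs, the reported p-value is optimistic unless it is corrected for that selection; the sketch leaves such a correction out.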
Subjects
User-Computer Interface , Alzheimer Disease/diagnosis , Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Biomarkers/metabolism , Genomics/methods , Humans , Internet , MicroRNAs/metabolism
ABSTRACT
Motivation: Although the amount of small non-coding RNA-sequencing data is continuously increasing, it is still unclear to what extent small RNAs are represented in the human genome. Results: In this study we analyzed 303 billion sequencing reads from nearly 25 000 datasets to answer this question. We determined that 0.8% of the human genome is reliably covered by 874 123 regions with an average length of 31 nt. On the basis of these regions, we found that among the known small non-coding RNA classes, microRNAs were the most prevalent. In subsequent steps, we characterized variations of miRNAs and performed a staged validation of 11 877 candidate miRNAs. Of these, many were actually expressed and significantly dysregulated in lung cancer. Selected candidates were finally validated by northern blots. Although isolated miRNAs could still be present in the human genome, our presented set likely contains the largest fraction of human miRNAs. Contact: c.backes@mx.uni-saarland.de or andreas.keller@ccb.uni-saarland.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Human Genome , MicroRNAs , DNA Sequence Analysis , Transcriptome , Genomics , High-Throughput Nucleotide Sequencing , Humans , Lung Neoplasms/genetics , Single Nucleotide Polymorphism , RNA Sequence Analysis
ABSTRACT
The envisioned application of miRNAs as diagnostic or prognostic biomarkers calls for an in-depth understanding of their distribution and variability in different physiological states. While effects with respect to ethnic origin, age, or gender are known, the inter-individual variability of miRNAs across the four seasons has remained largely hidden. We sequentially profiled the complete repertoire of blood-borne miRNAs for 25 physiologically normal individuals in spring, summer, fall, and winter (altogether 95 samples) and validated the results on 292 individuals (919 samples collected with the Mitra home sampling device) by RT-qPCR. Principal variance component analysis suggests that the largest variability observed in miRNA expression is due to individual variability and the individuals' gender. However, the results also highlight a deviation of miRNA activity in samples collected during springtime. Following adjustment for multiple testing, remarkable differences are observed between spring and fall (77 miRNAs). The two most dys-regulated miRNAs were miR-181c-5p and miR-106b-5p (adjusted p-value of 0.007). Other significant miRNAs include miR-140-3p, miR-21-3p, and let-7c-5p. The dys-regulation was validated by RT-qPCR. Systems biology analysis further provides strong evidence for the immunological origin of the signals: dys-regulated miRNAs are enriched in CD56 cells and belong to various signalling and immune-system-related pathways. Our data suggest that, besides known confounding factors such as age and sex, the season in which a test is conducted might also have a considerable influence on the expression of blood-borne miRNAs and subsequently might interfere with diagnosis based on such signatures.