1.
Database (Oxford) ; 2022, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35348648

ABSTRACT

The scientific knowledge about which genes are involved in which diseases grows rapidly, which makes it difficult to keep up with new publications and genetics datasets. The DISEASES database aims to provide a comprehensive overview by systematically integrating and assigning confidence scores to evidence for disease-gene associations from curated databases, genome-wide association studies (GWAS) and automatic text mining of the biomedical literature. Here, we present a major update to this resource, which greatly increases the number of associations from all these sources. This is especially true for the text-mined associations, which have increased by at least 9-fold at all confidence cutoffs. We show that this dramatic increase is primarily due to adding full-text articles to the text corpus, secondarily due to improvements to both the disease and gene dictionaries used for named entity recognition, and only to a very small extent due to the growth in number of PubMed abstracts. DISEASES now also makes use of a new GWAS database, Target Illumination by GWAS Analytics, which considerably increased the number of GWAS-derived disease-gene associations. DISEASES itself is also integrated into several other databases and resources, including GeneCards/MalaCards, Pharos/Target Central Resource Database and the Cytoscape stringApp. All data in DISEASES are updated on a weekly basis and are available via a web interface at https://diseases.jensenlab.org, from where they can also be downloaded under open licenses. Database URL: https://diseases.jensenlab.org.


Subjects
Data Mining , Genome-Wide Association Study , Databases, Factual
2.
Bioinformatics ; 36(1): 264-271, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31199464

ABSTRACT

MOTIVATION: Information extraction by mining the scientific literature is key to uncovering relations between biomedical entities. Most existing approaches based on natural language processing extract relations from single sentence-level co-mentions, ignoring co-occurrence statistics over the whole corpus. Existing approaches counting entity co-occurrences ignore the textual context of each co-occurrence. RESULTS: We propose a novel corpus-wide co-occurrence scoring approach to relation extraction that takes the textual context of each co-mention into account. Our method, called CoCoScore, scores the certainty of stating an association for each sentence that co-mentions two entities. CoCoScore is trained using distant supervision based on a gold-standard set of associations between entities of interest. Instead of requiring a manually annotated training corpus, co-mentions are labeled as positives/negatives according to their presence/absence in the gold standard. We show that CoCoScore outperforms previous approaches in identifying human disease-gene and tissue-gene associations as well as in identifying physical and functional protein-protein associations in different species. CoCoScore is a versatile text mining tool to uncover pairwise associations via co-occurrence mining, within and beyond biomedical applications. AVAILABILITY AND IMPLEMENTATION: CoCoScore is available at: https://github.com/JungeAlexander/cocoscore. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
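The distant-supervision labeling step described in this abstract can be illustrated with a minimal sketch. It is not the CoCoScore implementation: the `CoMention` type, the `label_co_mentions` function and the example data are hypothetical names and values chosen for illustration, and the sketch covers only the labeling of co-mentions against a gold standard, not the sentence scoring model or the corpus-wide aggregation.

```python
from typing import Iterable, List, NamedTuple, Tuple


class CoMention(NamedTuple):
    """One sentence that mentions two entities (hypothetical structure)."""
    entity_a: str
    entity_b: str
    sentence: str


def label_co_mentions(
    co_mentions: Iterable[CoMention],
    gold_standard: Iterable[Tuple[str, str]],
) -> List[Tuple[CoMention, int]]:
    """Label a co-mention 1 if its entity pair is in the gold standard, else 0.

    Pairs are treated as unordered, so (a, b) and (b, a) are the same pair.
    """
    gold = {frozenset(pair) for pair in gold_standard}
    return [
        (cm, 1 if frozenset((cm.entity_a, cm.entity_b)) in gold else 0)
        for cm in co_mentions
    ]


# Illustrative data only; not taken from the paper.
EXAMPLE_GOLD = [("BRCA1", "breast cancer"), ("CFTR", "cystic fibrosis")]
EXAMPLE_CO_MENTIONS = [
    CoMention("BRCA1", "breast cancer",
              "Mutations in BRCA1 increase the risk of breast cancer."),
    CoMention("BRCA1", "cystic fibrosis",
              "BRCA1 was not studied in the cystic fibrosis cohort."),
    CoMention("cystic fibrosis", "CFTR",
              "Cystic fibrosis is caused by variants in CFTR."),
]

if __name__ == "__main__":
    for co_mention, label in label_co_mentions(EXAMPLE_CO_MENTIONS, EXAMPLE_GOLD):
        print(label, co_mention.sentence)
```

Because the labels come from the gold standard rather than from reading each sentence, a sentence can be labeled positive even when it does not actually assert the association. This label noise is the usual trade-off of distant supervision against manual annotation.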


Subjects
Computational Biology , Data Mining , Natural Language Processing , Publications , Computational Biology/methods , Humans , Proteins/genetics
3.
Ecol Evol ; 9(23): 13619-13631, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31871671

ABSTRACT

Coffee leaf rust (CLR), caused by the fungal pathogen Hemileia vastatrix, has plagued coffee production worldwide for over 150 years. Hemileia vastatrix produces urediniospores, teliospores, and the sexual basidiospores. Infection of coffee by basidiospores of H. vastatrix has never been reported and, thus far, no alternate host capable of supporting an aecial stage in the disease cycle has been found. Because of this, some argue that an alternate host of H. vastatrix does not exist. Yet, to date, the plant pathology community has been puzzled by the ability of H. vastatrix to overcome resistance in coffee cultivars despite the apparent lack of sexual reproduction and an aecidial stage. The purpose of this study was to introduce a new method to search for the alternate host(s) of H. vastatrix. To do this, we present the novel hypothetical alternate host ranking (HAHR) method and an automated text mining (ATM) procedure, utilizing comprehensive biogeographical botanical data from the designated sites of interest (Ethiopia, Kenya and Sri Lanka) and plant pathology insights. With the HAHR/ATM methods, we produced prioritized lists of potential alternate host plants of coffee leaf rust. This is a first attempt to seek out an alternate plant host of a pathogenic fungus in this manner. The HAHR method ranked Psychotria mahonii, Rubus apetalus, and Rhamnus prinoides as the most probable alternate hosts. The cross-referenced results of the two methods suggest that the plant genera of interest are Croton, Euphorbia, and Rubus. The HAHR and ATM methods may also be applied to other plant-rust interactions that include an unknown alternate host, or to any other biological system that relies on data mining of published data.

4.
Nucleic Acids Res ; 47(D1): D607-D613, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30476243

ABSTRACT

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.


Subjects
Genomics/methods , Protein Interaction Mapping/methods , Software , Animals , Databases, Genetic , Gene Ontology , Humans
5.
Front Plant Sci ; 8: 1233, 2017.
Article in English | MEDLINE | ID: mdl-28769948

ABSTRACT

Release of bud dormancy in perennial woody plants is a temperature-dependent process and thus flowering in these species is heavily affected by climate change. The lack of cold winters in temperate growing regions often results in reduced flowering and low fruit yields. This is likely to decrease the availability of fruits and nuts of the Prunus spp. in the near future. In order to maintain high yields, it is crucial to gain detailed knowledge on the molecular mechanisms controlling the release of bud dormancy. Here, we studied these mechanisms using sweet cherry (Prunus avium L.), a crop where the agrochemical hydrogen cyanamide (HC) is routinely used to compensate for the lack of cold winter temperatures and to induce flower opening. In this work, dormant flower buds were sprayed with hydrogen cyanamide followed by deep RNA sequencing, identifying three main expression patterns in response to HC. These transcript level results were validated by quantitative real time polymerase chain reaction and supported further by phytohormone profiling (ABA, SA, IAA, CK, ethylene, JA). Using these approaches, we identified the most up-regulated pathways: the cytokinin pathway, as well as the jasmonate and the hydrogen cyanide pathway. Our results strongly suggest an inductive effect of these metabolites in bud dormancy release and provide a stepping stone for the characterization of key genes in bud dormancy release.

6.
Gene ; 615: 35-40, 2017 Jun 05.
Article in English | MEDLINE | ID: mdl-28322996

ABSTRACT

t(8;21) acute myeloid leukemia (AML) is characterized by a translocation between chromosomes 8 and 21 and formation of a distinctive RUNX1-RUNX1T1 fusion transcript. This translocation places RUNX1T1 under control of the RUNX1 promoter, leading to a pronounced upregulation of RUNX1T1 transcripts in t(8;21) AML compared to normal hematopoietic cells. We investigated the role of highly upregulated RUNX1T1 under the hypothesis that it acts as a competing endogenous RNA (ceRNA), titrating microRNAs (miRNAs) away from their target transcripts and thus contributing to AML formation. Using publicly available t(8;21) AML RNA-Seq and miRNA-Seq data from The Cancer Genome Atlas (TCGA) project, we obtained a network consisting of 605 genes that may act as ceRNAs competing for miRNAs with the suggested RUNX1T1 miRNA sponge. Among the 605 ceRNA candidates, 121 have previously been implicated in cancer development. Players in the integrin, cadherin, and Wnt signaling pathways affected by the RUNX1T1 sponge were overrepresented. Finally, among a set of 21 high-interest RUNX1T1 ceRNAs, we found multiple genes that have previously been linked to AML formation. In conclusion, our study offers a novel look at the role of the RUNX1-RUNX1T1 fusion transcript in t(8;21) AML beyond previously investigated genetic and epigenetic aberrations.


Subjects
Chromosomes, Human, Pair 21 , Chromosomes, Human, Pair 8 , Leukemia, Myeloid, Acute/genetics , MicroRNAs , Proto-Oncogene Proteins/genetics , Transcription Factors/genetics , 3' Untranslated Regions , Binding Sites , Core Binding Factor Alpha 2 Subunit/genetics , Gene Expression Regulation, Leukemic , Gene Ontology , Humans , MicroRNAs/metabolism , Oncogene Proteins, Fusion/genetics , Protein Interaction Maps , RUNX1 Translocation Partner 1 Protein , Translocation, Genetic , Wnt Signaling Pathway/genetics
7.
Bioinformatics ; 33(14): 2089-2096, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28334186

ABSTRACT

MOTIVATION: Clustering RNA sequences with common secondary structure is an essential step towards studying RNA function. Whereas structural RNA alignment strategies typically identify common structure for orthologous structured RNAs, clustering seeks to group paralogous RNAs based on structural similarities. However, existing approaches for clustering paralogous RNAs do not take the compensatory base pair changes obtained from structure conservation in orthologous sequences into account. RESULTS: Here, we present RNAscClust, the implementation of a new algorithm to cluster a set of structured RNAs taking their respective structural conservation into account. For a set of multiple structural alignments of RNA sequences, each containing a paralog sequence included in a structural alignment of its orthologs, RNAscClust computes minimum free-energy structures for each sequence using conserved base pairs as prior information for the folding. The paralogs are then clustered using a graph kernel-based strategy, which identifies common structural features. We show that the clustering accuracy clearly benefits from an increasing degree of compensatory base pair changes in the alignments. AVAILABILITY AND IMPLEMENTATION: RNAscClust is available at http://www.bioinf.uni-freiburg.de/Software/RNAscClust. CONTACT: gorodkin@rth.dk or backofen@informatik.uni-freiburg.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA/chemistry , Sequence Analysis, RNA/methods , Software , Algorithms , Cluster Analysis , Humans , Nucleic Acid Conformation
8.
Article in English | MEDLINE | ID: mdl-28077569

ABSTRACT

Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks is challenging due to data heterogeneity. Here, we present RAIN, a database of ncRNA-RNA and ncRNA-protein interactions, and its integration with the STRING database of protein-protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data, interaction predictions and automatic literature mining. RAIN uses an integrative scoring scheme to assign a confidence score to each interaction. We demonstrate that RAIN outperforms the underlying microRNA-target predictions in inferring ncRNA interactions. RAIN can be operated through an easily accessible web interface and all interaction data can be downloaded. Database URL: http://rth.dk/resources/rain.


Subjects
Databases, Genetic , MicroRNAs , RNA-Binding Proteins , User-Computer Interface , MicroRNAs/genetics , MicroRNAs/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
9.
BMC Bioinformatics ; 16 Suppl 2: A1-10, 2015.
Article in English | MEDLINE | ID: mdl-25708534

ABSTRACT

This report summarizes the scientific content and activities of the annual symposium organized by the Student Council of the International Society for Computational Biology (ISCB), held in conjunction with the Intelligent Systems for Molecular Biology (ISMB) conference in Boston, USA, on July 11th, 2014.


Subjects
Computational Biology , Drug Resistance, Multiple , High-Throughput Nucleotide Sequencing , Microsatellite Repeats/genetics , Peer Review, Research , Publishing , RNA, Messenger/metabolism , Sequence Analysis, DNA
10.
BMC Syst Biol ; 8: 99, 2014 Aug 19.
Article in English | MEDLINE | ID: mdl-25134827

ABSTRACT

BACKGROUND: Over the last decade, network enrichment analysis has become popular in computational systems biology to elucidate aberrant network modules. Traditionally, these approaches focus on combining gene expression data with protein-protein interaction (PPI) networks. Nowadays, the so-called omics technologies allow for inclusion of many more data sets, e.g. protein phosphorylation or epigenetic modifications. This creates a need for analysis methods that can combine these various sources of data to obtain a systems-level view on aberrant biological networks. RESULTS: We present a new release of KeyPathwayMiner (version 4.0) that is not limited to analyses of single omics data sets, e.g. gene expression, but is able to directly combine several different omics data types. Version 4.0 can further integrate existing knowledge by adding a search bias towards sub-networks that contain (avoid) genes provided in a positive (negative) list. Finally, the new release also provides a set of novel visualization features and has been implemented as an app for the standard bioinformatics network analysis tool Cytoscape. CONCLUSION: With KeyPathwayMiner 4.0, we publish a Cytoscape app for multi-omics based sub-network extraction. It is available in Cytoscape's app store at http://apps.cytoscape.org/apps/keypathwayminer or via http://keypathwayminer.mpi-inf.mpg.de.


Subjects
Computational Biology/methods , Software , Computer Graphics , Protein Interaction Mapping