Results 1 - 20 of 78
1.
Bioinformatics ; 39(4), 2023 04 03.
Article in English | MEDLINE | ID: mdl-37018156

ABSTRACT

MOTIVATION: Relation extraction (RE) is a crucial process to deal with the amount of text published daily, e.g. to find missing associations in a database. RE is a text mining task for which the state-of-the-art approaches use bidirectional encoders, namely, BERT. However, state-of-the-art performance may be limited by the lack of efficient external knowledge injection approaches, with a larger impact in the biomedical area given the widespread usage and high quality of biomedical ontologies. This knowledge can propel these systems forward by aiding them in predicting more explainable biomedical associations. With this in mind, we developed K-RET, a novel, knowledgeable biomedical RE system that, for the first time, injects knowledge by handling different types of associations, multiple sources and where to apply it, and multi-token entities. RESULTS: We tested K-RET on three independent and open-access corpora (DDI, BC5CDR, and PGR) using four biomedical ontologies handling different entities. K-RET improved state-of-the-art results by 2.68% on average, with the DDI Corpus yielding the most significant boost in performance, from 79.30% to 87.19% in F-measure, representing a P-value of 2.91×10⁻¹². AVAILABILITY AND IMPLEMENTATION: https://github.com/lasigeBioTM/K-RET.


Subjects
Biological Ontologies , Data Mining , Data Mining/methods , Databases, Factual
2.
BMC Bioinformatics ; 24(1): 171, 2023 Apr 26.
Article in English | MEDLINE | ID: mdl-37101154

ABSTRACT

BACKGROUND: Complex diseases such as neurodevelopmental disorders (NDDs) exhibit multiple etiologies. The multi-etiological nature of complex diseases emerges from distinct but functionally similar groups of genes. Different diseases sharing genes of such groups show related clinical outcomes that further restrict our understanding of disease mechanisms, thus limiting the applications of personalized medicine approaches to complex genetic disorders. RESULTS: Here, we present an interactive and user-friendly application, called DGH-GO. DGH-GO allows biologists to dissect the genetic heterogeneity of complex diseases by stratifying the putative disease-causing genes into clusters that may contribute to distinct disease outcome development. It can also be used to study the shared etiology of complex diseases. DGH-GO creates a semantic similarity matrix for the input genes by using Gene Ontology (GO). The resultant matrix can be visualized in 2D plots using different dimension reduction methods (t-SNE, principal component analysis, UMAP and principal coordinate analysis). In the next step, clusters of functionally similar genes are identified from the genes' functional similarities assessed through GO. This is achieved by employing four different clustering methods (K-means, hierarchical, fuzzy and PAM). The user may change the clustering parameters and explore their effect on stratification immediately. DGH-GO was applied to genes disrupted by rare genetic variants in Autism Spectrum Disorder (ASD) patients. The analysis confirmed the multi-etiological nature of ASD by identifying four clusters of genes that were enriched for distinct biological mechanisms and clinical outcomes. In the second case study, the analysis of genes shared by different NDDs showed that genes causing multiple disorders tend to aggregate in similar clusters, indicating a possible shared etiology. CONCLUSION: DGH-GO is a user-friendly application that allows biologists to study the multi-etiological nature of complex diseases by dissecting their genetic heterogeneity. In summary, functional similarities, dimension reduction and clustering methods, coupled with interactive visualization and control over analysis, allow biologists to explore and analyze their datasets without requiring expert knowledge of these methods. The source code of the proposed application is available at https://github.com/Muh-Asif/DGH-GO.


Subjects
Autism Spectrum Disorder , Genetic Heterogeneity , Humans , Gene Ontology , Autism Spectrum Disorder/genetics , Software
3.
J Biomed Inform ; 132: 104137, 2022 08.
Article in English | MEDLINE | ID: mdl-35811025

ABSTRACT

The existence of unlinkable (NIL) entities is a major hurdle affecting the performance of Named Entity Linking approaches, and, consequently, the performance of downstream models that depend on them. Existing approaches to deal with NIL entities focus mainly on clustering and prediction and are limited to general entities. However, other domains, such as the biomedical sciences, are also prone to the existence of NIL entities, given the growing nature of scientific literature. We propose NILINKER, a model that includes a candidate retrieval module for biomedical NIL entities and a neural network that leverages the attention mechanism to find the top-k relevant concepts from target Knowledge Bases (MEDIC, CTD-Chemicals, ChEBI, HP, CTD-Anatomy and Gene Ontology-Biological Process) that may partially represent a given NIL entity. We also make available a new evaluation dataset named EvaNIL, suitable for training and evaluating models focusing on the NIL entity linking task. This dataset contains 846,165 documents (abstracts and full-text biomedical articles), including 1,071,776 annotations, distributed across six partitions: EvaNIL-MEDIC, EvaNIL-CTD-Chemicals, EvaNIL-ChEBI, EvaNIL-HP, EvaNIL-CTD-Anatomy and EvaNIL-Gene Ontology-Biological Process. NILINKER was integrated into a graph-based Named Entity Linking model (REEL), and the results of the experiments show that this approach is able to increase the performance of the Named Entity Linking model.


Subjects
Data Mining , Neural Networks, Computer , Cluster Analysis , Data Mining/methods , Gene Ontology , Knowledge Bases
4.
J Allergy Clin Immunol ; 146(2): 344-355, 2020 08.
Article in English | MEDLINE | ID: mdl-32311390

ABSTRACT

BACKGROUND: Oral food challenge (OFC) is the criterion standard to assess peanut allergy (PA), but it involves a risk of allergic reactions of unpredictable severity. OBJECTIVE: Our aim was to identify biomarkers for risk of severe reactions or low dose threshold during OFC to peanut. METHODS: We assessed Learning Early about Peanut Allergy study, Persistence of Oral Tolerance to Peanut study, and Peanut Allergy Sensitization study participants by administering the basophil activation test (BAT) and the skin prick test (SPT) and measuring the levels of peanut-specific IgE, Arachis hypogaea 2-specific IgE, and peanut-specific IgG4, and we analyzed the utility of the different biomarkers in relation to PA status, severity, and threshold dose of allergic reactions to peanut during OFC. RESULTS: When a previously defined optimal cutoff was used, the BAT diagnosed PA with 98% specificity and 75% sensitivity. The BAT identified severe reactions with 97% specificity and 100% sensitivity. The SPT, level of Arachis hypogaea 2-specific IgE, level of peanut-specific IgE, and IgG4/IgE ratio also had 100% sensitivity but slightly lower specificity (92%, 93%, 90%, and 88%, respectively) to predict severity. Participants with lower thresholds of reactivity had higher basophil activation to peanut in vitro. The SPT and the BAT were the best individual predictors of threshold. Multivariate models were superior to individual biomarkers and were used to generate nomograms to calculate the probability of serious adverse events during OFC for individual patients. CONCLUSIONS: The BAT diagnosed PA with high specificity and identified severe reactors and low threshold with high specificity and high sensitivity. The BAT was the best biomarker for severity, surpassed only by the SPT in predicting threshold. Nomograms can help estimate the likelihood of severe reactions and reactions to a low dose of allergen in individual patients with PA.
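
The sensitivity and specificity values quoted in this abstract are ratios over a two-by-two confusion matrix. The following is a minimal sketch of that arithmetic only; the function name `sensitivity_specificity` and the counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = true positives / all truly positive cases
    specificity = true negatives / all truly negative cases
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration; NOT data from the study.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=80, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A test with 100% sensitivity, as reported for severe reactions, corresponds to `fn == 0` in this notation.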


Subjects
Anaphylaxis/diagnosis , Basophils/immunology , Peanut Hypersensitivity/diagnosis , Administration, Oral , Allergens/immunology , Arachis/immunology , Basophil Degranulation Test , Basophils/chemistry , Biomarkers , Child , Disease Progression , Drug-Related Side Effects and Adverse Reactions , Female , Humans , Immunization , Male , Sensitivity and Specificity , Severity of Illness Index
5.
Int Psychogeriatr ; 32(3): 315-324, 2020 03.
Article in English | MEDLINE | ID: mdl-31635561

ABSTRACT

OBJECTIVE: Describe and validate the CHROME (CHemical Restraints avOidance MEthodology) criteria. DESIGN: Observational prospective longitudinal study. SETTING: Single nursing home in Las Palmas de Gran Canaria, Spain. PARTICIPANTS: 288 residents; mean age: 81.6 (SD 10.6). 77.4% had dementia. INTERVENTION: Multicomponent training and consultancy program to eliminate physical and chemical restraints and promote overall quality care. Clinicians were trained in stringent diagnostic criteria of neuropsychiatric syndromes and adequate psychotropic prescription. MEASUREMENTS: Psychotropic prescription (primary study target), neuropsychiatric syndromes, physical restraints, falls, and emergency room visits were collected semi-annually from December 2015 to December 2017. Results are presented for all residents and for those who had dementia and participated in the five study waves (completer analysis, n=107). RESULTS: For the study completers, atypical neuroleptic prescription dropped from 42.7% to 18.7%, long half-life benzodiazepines dropped from 25.2% to 6.5%, and hypnotic medications from 47.7% to 12.1% (p<0.0005). The rate of falls of any kind went from 67.3 to 32.7 (number of falls per 100 residents per year). Physicians' diagnostic confidence increased, while the frequency of diagnoses of neuropsychiatric syndromes decreased (p<0.0005). CONCLUSIONS: Implementing the CHROME criteria reduced the prescription of the most dangerous medications in institutionalized people with dementia. Two independent audits found no physical or chemical restraint and confirmed prescription quality of psychotropic drugs. Adequate diagnosis and independent audits appear to be the keys to help and motivate professionals to optimize and reduce the use of psychotropic medication. The CHROME criteria unify, in a single compendium, neuropsychiatric diagnostic criteria, prescription guidelines, independent audit methodology, and minimum legal standards. These criteria can be easily adapted to other countries.


Subjects
Dementia/drug therapy , Drug Prescriptions/standards , Homes for the Aged/statistics & numerical data , Nursing Homes/statistics & numerical data , Prescriptions/standards , Aged , Aged, 80 and over , Dementia/psychology , Efficiency, Organizational , Female , Humans , Inappropriate Prescribing/adverse effects , Inappropriate Prescribing/prevention & control , Longitudinal Studies , Male , Medication Reconciliation/methods , Prospective Studies , Psychotropic Drugs/therapeutic use , Restraint, Physical
6.
J Allergy Clin Immunol ; 143(3): 1131-1142.e4, 2019 03.
Article in English | MEDLINE | ID: mdl-30053528

ABSTRACT

BACKGROUND: Grass pollen-specific immunotherapy involves immunomodulation of allergen-specific TH2 responses and induction of IL-10+ and/or TGF-β+CD4+CD25+ regulatory T cells (induced Treg cells). IL-35+CD4+CD25+ forkhead box protein 3-negative T (IL-35-inducible regulatory T [iTR35]) cells have been reported as a novel subset of induced Treg cells with modulatory characteristics. OBJECTIVE: We sought to investigate mechanisms underlying the induction and maintenance of immunologic tolerance induced by IL-35 and iTR35 cells. METHODS: The biological effects of IL-35 were assessed on group 2 innate lymphoid cells (ILC2s); dendritic cells primed with thymic stromal lymphopoietin, IL-25, and IL-33; and B and TH2 cells by using flow cytometry and quantitative RT-PCR. Grass pollen-driven TH2 cell proliferation and cytokine production were measured by using tritiated thymidine and Luminex MagPix, respectively. iTR35 cells were quantified in patients with grass pollen allergy (seasonal allergic rhinitis [SAR] group, n = 16), sublingual immunotherapy (SLIT)-treated patients (SLIT group, n = 16), and nonatopic control subjects (NACs; NAC group, n = 16). RESULTS: The SAR group had increased proportions of ILC2s (P = .002) and IL-5+ cells (P = .042), IL-13+ cells (P = .042), and IL-5+IL-13+ ILC2s (P = .003) compared with NACs. IL-35 inhibited IL-5 and IL-13 production by ILC2s in the presence of IL-25 or IL-33 (P = .031) and allergen-driven TH2 cytokines by effector T cells. IL-35 inhibited CD40 ligand-, IL-4-, and IL-21-mediated IgE production by B cells (P = .015), allergen-driven T-cell proliferation (P = .001), and TH2 cytokine production mediated by primed dendritic cells. iTR35 cells suppressed TH2 cell proliferation and cytokine production. In addition, allergen-driven IL-35 levels and iTR35 cell counts were increased in patients receiving SLIT (all, P < .001) and NACs (all, P < .001) compared with patients with SAR. CONCLUSION: IL-35 and iTR35 cells are potential novel immune regulators induced by SLIT. The clinical relevance of SLIT can be underscored by restoration of protective iTR35 cells.


Subjects
Allergens/immunology , Interleukins/immunology , Lymphocytes/immunology , Poaceae/immunology , Pollen/immunology , Rhinitis, Allergic, Seasonal/therapy , Sublingual Immunotherapy , Adult , Female , Humans , Immune Tolerance , Male , Middle Aged , Rhinitis, Allergic, Seasonal/immunology , Young Adult
7.
J Allergy Clin Immunol ; 143(3): 1067-1076, 2019 03.
Article in English | MEDLINE | ID: mdl-30445057

ABSTRACT

BACKGROUND: Grass pollen subcutaneous immunotherapy (SCIT) is associated with induction of serum IgG4-associated inhibitory antibodies that prevent IgE-facilitated allergen binding to B cells. OBJECTIVE: We sought to determine whether SCIT induces nasal allergen-specific IgG4 antibodies with inhibitory activity that correlates closely with clinical response. METHODS: In a cross-sectional controlled study, nasal fluid and sera were collected during the grass pollen season from 10 SCIT-treated patients, 13 untreated allergic patients (with seasonal allergic rhinitis [SAR]), and 12 nonatopic control subjects. Nasal and serum IgE and IgG4 levels to Phleum pratense components were measured by using the Immuno Solid Allergen Chip microarray. Inhibitory activity was measured by IgE-facilitated allergen binding assay. IL-10+ regulatory B cells were quantified in peripheral blood by using flow cytometry. RESULTS: Nasal and serum Phl p 1- and Phl p 5-specific IgE levels were increased in patients with SAR compared to nonatopic control subjects (all, P < .001) and SCIT-treated patients (nasal, P < .001; serum Phl p 5, P = .073). Nasal IgG4 levels were increased in the SCIT group compared to those in the SAR group (P < .001) during the pollen season compared to out of season. IgG-associated inhibitory activity in nasal fluid and serum was significantly increased in the SCIT group compared to that in the SAR group (both, P < .01). The magnitude of the inhibitory activity was 93% (P < .001) in nasal fluid compared to 66% (P < .001) in serum and was reversed after depletion of IgG. Both nasal fluid (r = -0.69, P = .0005) and serum (r = -0.552, P = .0095) blocking activity correlated with global symptom improvement. IL-10+ regulatory B cells were increased in season compared to out of season in the SCIT group (P < .01). CONCLUSION: For the first time, we show that nasal IgG4-associated inhibitory activity correlates closely with the clinical response to allergen immunotherapy in patients with allergic rhinitis with or without asthma.


Subjects
Allergens/immunology , Antibodies, Neutralizing/immunology , Desensitization, Immunologic , Immunoglobulin E/immunology , Immunoglobulin G/immunology , Nasal Mucosa/immunology , Phleum/immunology , Pollen/immunology , Adult , B-Lymphocytes, Regulatory/immunology , Biomarkers , Female , Humans , Male , Middle Aged , Rhinitis, Allergic, Seasonal/immunology , Rhinitis, Allergic, Seasonal/therapy
8.
BMC Bioinformatics ; 20(Suppl 10): 246, 2019 May 29.
Article in English | MEDLINE | ID: mdl-31138117

ABSTRACT

BACKGROUND: Given the increasing amount of biomedical resources that are being annotated with concepts from more than one ontology and covering multiple domains of knowledge, it is important to devise mechanisms to compare these resources that take into account the various domains of annotation. For example, metabolic pathways are annotated with their enzymes and their metabolites, and thus similarity measures should compare them with respect to both of those domains simultaneously. RESULTS: In this paper, we propose two approaches to lift existing single-ontology semantic similarity measures into multi-domain measures. The aggregative approach compares domains independently and averages the various similarity values into a final score. The integrative approach integrates all the relevant ontologies into a single one, calculating similarity in the resulting multi-domain ontology using the single-ontology measure. CONCLUSIONS: We evaluated the two approaches in a multidisciplinary epidemiology dataset by evaluating the capacity of the similarity measures to predict new annotations based on the existing ones. The results show a promising increase in performance of the multi-domain measures over the single-ontology ones in the vast majority of the cases. These results show that multi-domain measures outperform single-domain ones, and should be considered by the community as a starting point to study more efficient multi-domain semantic similarity measures.
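
The aggregative approach described above (compare each domain independently, then average) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `jaccard` is a stand-in for whatever single-ontology measure is used, averaging over the domains both entities share is an assumed choice, and the pathway annotations are invented.

```python
from statistics import mean


def jaccard(a, b):
    """Stand-in single-domain measure: overlap of two annotation sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def aggregative_similarity(x, y, measure=jaccard):
    """Average the per-domain similarities of two multi-domain entities.

    x and y map a domain name to the set of concepts annotating the
    entity in that domain. Only domains present in both are compared
    (an assumption of this sketch).
    """
    shared = sorted(set(x) & set(y))
    if not shared:
        return 0.0
    return mean(measure(x[d], y[d]) for d in shared)


# Hypothetical annotations for two metabolic pathways.
pathway_a = {"enzymes": {"E1", "E2"}, "metabolites": {"M1", "M2", "M3"}}
pathway_b = {"enzymes": {"E2", "E3"}, "metabolites": {"M1", "M2", "M3"}}
score = aggregative_similarity(pathway_a, pathway_b)
print(round(score, 3))
```

Here the enzyme domain contributes 1/3 and the metabolite domain 1.0, so the aggregated score is their mean. The integrative approach would instead merge the ontologies first and apply the single-ontology measure once.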


Subjects
Biomedical Research , Semantics , Epidemics , Humans
9.
BMC Bioinformatics ; 20(1): 534, 2019 Oct 29.
Article in English | MEDLINE | ID: mdl-31664891

ABSTRACT

BACKGROUND: Biomedical literature concerns a wide range of concepts, requiring controlled vocabularies to maintain a consistent terminology across different research groups. However, as new concepts are introduced, biomedical literature is prone to ambiguity, specifically in fields that are advancing more rapidly, for example, drug design and development. Entity linking is a text mining task that aims at linking entities mentioned in the literature to concepts in a knowledge base. For example, entity linking can help find all documents that mention the same concept and improve relation extraction methods. Existing approaches focus on the local similarity of each entity and the global coherence of all entities in a document, but do not take into account the semantics of the domain. RESULTS: We propose a method, PPR-SSM, to link entities found in documents to concepts from domain-specific ontologies. Our method is based on Personalized PageRank (PPR), using the relations of the ontology to generate a graph of candidate concepts for the mentioned entities. We demonstrate how the knowledge encoded in a domain-specific ontology can be used to calculate the coherence of a set of candidate concepts, improving the accuracy of entity linking. Furthermore, we explore weighting the edges between candidate concepts using semantic similarity measures (SSM). We show how PPR-SSM can be used to effectively link named entities to biomedical ontologies, namely chemical compounds, phenotypes, and gene-product localization and processes. CONCLUSIONS: We demonstrated that PPR-SSM outperforms state-of-the-art entity linking methods on four distinct gold standards, by taking advantage of the semantic information contained in ontologies. Moreover, PPR-SSM is a graph-based method that does not require training data. Our method improved the entity linking accuracy of chemical compounds by 0.1385 when compared to a method that does not use SSMs.
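
Personalized PageRank, on which the method above is based, can be sketched with a plain power iteration. The version below is generic and unweighted; it does not reproduce PPR-SSM, which additionally weights edges with semantic similarity. The graph, node names, and parameter defaults are hypothetical.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=100):
    """Power-iteration Personalized PageRank.

    graph: dict mapping each node to a list of its out-neighbours
           (every neighbour must also be a key of the dict).
    seeds: non-empty set of nodes that receive the restart probability.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every node first receives its share of the restart mass.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seeds.
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank


# Hypothetical candidate-concept graph; c4 is unrelated to the others.
toy_graph = {"c1": ["c2", "c3"], "c2": ["c1"], "c3": ["c1"], "c4": []}
ranks = personalized_pagerank(toy_graph, seeds={"c1"})
print({n: round(r, 3) for n, r in ranks.items()})
```

Candidates that are well connected to the seed concepts accumulate rank, while the isolated candidate `c4` receives none, which is the coherence signal an entity linker can use to choose among candidates.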


Subjects
Semantics , Biological Ontologies , Data Mining/methods , Databases, Factual , Humans , Knowledge Bases , Vocabulary, Controlled
10.
BMC Bioinformatics ; 20(1): 10, 2019 Jan 07.
Article in English | MEDLINE | ID: mdl-30616557

ABSTRACT

BACKGROUND: Recent studies have proposed deep learning techniques, namely recurrent neural networks, to improve biomedical text mining tasks. However, these techniques rarely take advantage of existing domain-specific resources, such as ontologies. In Life and Health Sciences there is a vast and valuable set of such resources publicly available, which are continuously being updated. Biomedical ontologies are nowadays a mainstream approach to formalize existing knowledge about entities, such as genes, chemicals, phenotypes, and disorders. These resources contain supplementary information that may not be yet encoded in training data, particularly in domains with limited labeled data. RESULTS: We propose a new model to detect and classify relations in text, BO-LSTM, that takes advantage of domain-specific ontologies, by representing each entity as the sequence of its ancestors in the ontology. We implemented BO-LSTM as a recurrent neural network with long short-term memory units and using open biomedical ontologies, specifically Chemical Entities of Biological Interest (ChEBI), Human Phenotype, and Gene Ontology. We assessed the performance of BO-LSTM with drug-drug interactions mentioned in a publicly available corpus from an international challenge, composed of 792 drug descriptions and 233 scientific abstracts. By using the domain-specific ontology in addition to word embeddings and WordNet, BO-LSTM improved the F1-score of both the detection and classification of drug-drug interactions, particularly in a document set with a limited number of annotations. We adapted an existing DDI extraction model with our ontology-based method, obtaining a higher F1 score than the original model. Furthermore, we developed and made available a corpus of 228 abstracts annotated with relations between genes and phenotypes, and demonstrated how BO-LSTM can be applied to other types of relations. 
CONCLUSIONS: Our findings demonstrate that besides the high performance of current deep learning techniques, domain-specific ontologies can still be useful to mitigate the lack of labeled data.
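
The idea of representing each entity as the sequence of its ancestors in the ontology can be illustrated with a small sketch. It assumes a single-parent hierarchy stored as a child-to-parent dictionary, whereas real ontologies such as ChEBI are graphs in which a concept may have several parents; the toy hierarchy and the function name `ancestor_sequence` are hypothetical and not taken from BO-LSTM.

```python
def ancestor_sequence(concept, parent_of):
    """Return the path from the ontology root down to `concept`.

    parent_of maps a concept to its single parent (an assumption of
    this sketch); the root is the concept with no entry.
    """
    path = [concept]
    seen = {concept}
    while path[-1] in parent_of:
        parent = parent_of[path[-1]]
        if parent in seen:  # guard against cycles in malformed input
            break
        path.append(parent)
        seen.add(parent)
    return list(reversed(path))


# Toy hierarchy for illustration; not copied from any real ontology.
toy_parent_of = {
    "caffeine": "purine alkaloid",
    "purine alkaloid": "alkaloid",
    "alkaloid": "chemical entity",
}
sequence = ancestor_sequence("caffeine", toy_parent_of)
print(sequence)
```

A sequence like this can then be embedded and fed to a recurrent layer alongside the word embeddings, so that two entities sharing most of their ancestors receive similar representations even when labeled examples are scarce.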


Subjects
Biological Ontologies , Data Mining/methods , Drug Interactions , Gene Ontology , Memory, Short-Term , Neural Networks, Computer , Software , Databases, Factual , Deep Learning , Humans , Natural Language Processing
11.
J Biomed Inform ; 98: 103273, 2019 10.
Article in English | MEDLINE | ID: mdl-31454647

ABSTRACT

In recent years, technological advances in capturing genetic variation in large populations have led to the identification of large numbers of putative or disease-causing variants. However, their mechanistic understanding is lagging far behind and has posed new challenges regarding their relevance for disease phenotypes, particularly for common complex disorders. In this study, we propose a systematic pipeline to infer biological meaning from genetic variants, namely rare Copy Number Variants (CNVs). The pipeline consists of three modules that seek to (1) improve genetic data quality by excluding low-confidence CNVs, (2) identify disrupted biological processes, and (3) aggregate similar enriched biological process terms using semantic similarity. The proposed pipeline was applied to CNVs from individuals diagnosed with Autism Spectrum Disorder (ASD). We found that rare CNVs disrupting brain-expressed genes dysregulated a wide range of biological processes, such as nervous system development and protein polyubiquitination. The disrupted biological processes identified in ASD patients were in accordance with previous findings. This coherence with the literature indicates the feasibility of the proposed pipeline in interpreting the biological role of genetic variants in complex disease development. The suggested pipeline is easily adjustable at each step, and its independence from any specific dataset and software makes it an effective tool for analyzing existing genetic resources. The FunVar pipeline is available at https://github.com/lasigeBioTM/FunVar and includes pre- and post-processing steps to effectively interpret biological mechanisms of putative disease-causing genetic variants.


Subjects
Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Computational Biology/methods , DNA Copy Number Variations , Polymorphism, Single Nucleotide , Algorithms , Databases, Genetic , Gene Dosage , Genetic Predisposition to Disease , Genome, Human , Genomics , Genotype , Humans , Nervous System , Phenotype , Semantics , Software
12.
Adv Exp Med Biol ; 1137: 1-8, 2019.
Article in English | MEDLINE | ID: mdl-31183816

ABSTRACT

Health and Life studies are well known for the huge amount of data they produce, such as high-throughput sequencing projects (Stephens et al., PLoS Biol 13(7):e1002195, 2015; Hey et al., The fourth paradigm: data-intensive scientific discovery, vol 1. Microsoft research Redmond, Redmond, 2009). However, the value of the data should not be measured by its amount, but instead by the possibility and ability of researchers to retrieve and process it (Leonelli, Data-centric biology: a philosophical study. University of Chicago Press, Chicago, 2016). Transparency, openness, and reproducibility are key aspects to boost the discovery of novel insights into how living systems work (Nosek et al., Science 348(6242):1422-1425, 2015).


Subjects
Computational Biology , Data Analysis , High-Throughput Nucleotide Sequencing , Reproducibility of Results
13.
Adv Exp Med Biol ; 1137: 17-43, 2019.
Article in English | MEDLINE | ID: mdl-31183818

ABSTRACT

This chapter starts by introducing an example of how we can retrieve text, where every step is done manually. The chapter then describes step by step how we can automate each part of the example using shell script commands, which are introduced and explained as they are needed. The goal is to equip the reader with a basic set of skills to retrieve data from any online database and follow the links to retrieve more information from other sources, such as literature.


Subjects
Databases, Factual , Information Storage and Retrieval , Programming Languages , Internet
14.
Adv Exp Med Biol ; 1137: 9-15, 2019.
Article in English | MEDLINE | ID: mdl-31183817

ABSTRACT

The previous chapter presented the importance of text and semantic resources for Health and Life studies. This chapter will describe what kind of text and semantic resources are available, where they can be found, and how they can be accessed and retrieved.


Subjects
Information Storage and Retrieval , Semantics , Data Analysis
15.
Adv Exp Med Biol ; 1137: 45-60, 2019.
Article in English | MEDLINE | ID: mdl-31183819

ABSTRACT

In the previous chapter we were able to automatically process structured data to retrieve biomedical text about any chemical compound, such as caffeine. This chapter will provide a step-by-step introduction to how we can process that text using shell script commands, specifically extract information about diseases related to caffeine. The goal is to equip the reader with an essential set of skills to extract meaningful information from any text.


Subjects
Data Mining/methods , Electronic Data Processing , Caffeine , Software
16.
Adv Exp Med Biol ; 1137: 61-91, 2019.
Article in English | MEDLINE | ID: mdl-31183820

ABSTRACT

In the previous chapter we were able to automatically process text by recognizing a limited set of entities. This chapter will introduce the world of semantics, and present step-by-step examples to retrieve and enhance text and data processing by using semantics. The goal is to equip the reader with the basic set of skills to explore semantic resources that are nowadays available using simple shell script commands.


Subjects
Electronic Data Processing , Information Storage and Retrieval , Semantics
17.
BMC Microbiol ; 18(1): 194, 2018 11 23.
Article in English | MEDLINE | ID: mdl-30470193

ABSTRACT

BACKGROUND: Theobroma cacao L. (cacao) is a perennial tropical tree, endemic to rainforests of the Amazon Basin. Large populations of bacteria live on leaf surfaces and these phylloplane microorganisms can have important effects on plant health. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated studies of the phylloplane microbiome. In this study, we characterized the bacterial microbiome of the phylloplane of the Catongo genotype (susceptible to witch's broom) and CCN51 (resistant). The bacterial microbiome was determined by sequencing the V3-V4 region of the bacterial 16S rRNA gene. RESULTS: After pre-processing, a total of 1.7 million reads were considered. In total, 106 genera of bacteria were characterized. Proteobacteria was the predominant phylum in both genotypes. The genera exclusive to Catongo showed activity in protection against UV radiation and in the transport of substrates. CCN51 presented genera that act in biological control and in the inhibition of several taxonomic groups. Genotype CCN51 presented a greater diversity of microorganisms than the Catongo genotype, and the total community differed between the two. Scanning electron microscopy analysis of leaves revealed that on the phylloplane, many bacteria occur in large aggregates in several regions of the surface and in isolation near the stomata. CONCLUSIONS: We describe for the first time the phylloplane bacterial communities of T. cacao. Genotype CCN51, resistant to witch's broom, has a greater diversity of bacterial microbiome than Catongo and a greater number of exclusive phylloplane microorganisms with antagonistic action against phytopathogens.


Subjects
Agaricales/physiology , Bacteria/isolation & purification , Biodiversity , Cacao/microbiology , Plant Diseases/microbiology , Plant Leaves/microbiology , Bacteria/classification , Bacteria/genetics , Bacteria/growth & development , Cacao/genetics , Cacao/immunology , Cacao/physiology , Disease Resistance , Genotype , High-Throughput Nucleotide Sequencing , Microbiota , Plant Diseases/genetics , Plant Diseases/immunology , Plant Leaves/immunology , Symbiosis
18.
J Biomed Inform ; 82: 1-12, 2018 06.
Article in English | MEDLINE | ID: mdl-29660494

ABSTRACT

Sequencing thousands of human genomes has enabled breakthroughs in many areas, among them precision medicine, the study of rare diseases, and forensics. However, mass collection of such sensitive data entails enormous risks if not protected to the highest standards. In this article, we take the position that post-alignment privacy is not enough and that data should be automatically protected as early as possible in the genomics workflow, ideally immediately after the data is produced. We show that a previous approach for filtering short reads cannot extend to long reads and present a novel filtering approach that classifies raw genomic data (i.e., whose location and content is not yet determined) into privacy-sensitive (i.e., more affected by a successful privacy attack) and non-privacy-sensitive information. Such a classification allows the fine-grained and automated adjustment of protective measures to mitigate the possible consequences of exposure, in particular when relying on public clouds. We present the first filter that can be applied to reads of any length without distinction, making it usable with any recent or future sequencing technologies. The filter is accurate, in the sense that it detects all known sensitive nucleotides except those located in highly variable regions (fewer than 10 nucleotides remain undetected per genome instead of 100,000 in previous works). It has far fewer false positives than previously known methods (10% instead of 60%) and can detect sensitive nucleotides despite sequencing errors (86% detected instead of 56% with 2% of mutations). Finally, practical experiments demonstrate high performance, both in terms of throughput and memory consumption.


Subjects
Confidentiality , Genomics/methods , Medical Informatics/methods , Algorithms , Computer Security , False Positive Reactions , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Medical Informatics/trends , Sequence Analysis, DNA , Software
19.
Brief Bioinform ; 16(1): 89-103, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24197933

ABSTRACT

Semantic web technologies offer an approach to data integration and sharing, even for resources developed independently or broadly distributed across the web. This approach is particularly suitable for scientific domains that profit from large amounts of data that reside in the public domain and that have to be exploited in combination. Translational medicine is such a domain, which in addition has to integrate private data from the clinical domain with proprietary data from the pharmaceutical domain. In this survey, we present the results of our analysis of translational medicine solutions that follow a semantic web approach. We assessed these solutions in terms of their target medical use case; the resources covered to achieve their objectives; and their use of existing semantic web resources for the purposes of data sharing, data interoperability and knowledge discovery. Semantic web technologies seem to fulfill their role in facilitating the integration and exploration of data from disparate sources, but it is also clear that simply using them is not enough. It is fundamental to reuse resources, to define mappings between resources, and to share data and knowledge. All these aspects allow the instantiation of translational medicine at semantic web scale, resulting in a network of solutions that can share resources for a faster transfer of new scientific results into clinical practice. The envisioned network of translational medicine solutions is on its way, but it still requires resolving the challenges of sharing protected data and of integrating semantic-driven technologies into clinical practice.


Subjects
Information Dissemination/methods , Internet , Translational Research, Biomedical , Algorithms , Computational Biology/methods , Humans
20.
J Allergy Clin Immunol ; 135(5): 1249-56, 2015 May.
Article in English | MEDLINE | ID: mdl-25670011

ABSTRACT

BACKGROUND: Most children with detectable peanut-specific IgE (P-sIgE) are not allergic to peanut. We addressed 2 non-mutually exclusive hypotheses for the discrepancy between allergy and sensitization: (1) differences in P-sIgE levels between children with peanut allergy (PA) and peanut-sensitized but tolerant (PS) children and (2) the presence of an IgE inhibitor, such as peanut-specific IgG4 (P-sIgG4), in PS patients. METHODS: Two hundred twenty-eight children (108 patients with PA, 77 PS patients, and 43 nonsensitized nonallergic subjects) were studied. Levels of specific IgE and IgG4 to peanut and its components were determined. IgE-stripped basophils or a mast cell line were used in passive sensitization activation and inhibition assays. Plasma of PS subjects and patients submitted to peanut oral immunotherapy (POIT) were depleted of IgG4 and retested in inhibition assays. RESULTS: Basophils and mast cells sensitized with plasma from patients with PA but not PS patients showed dose-dependent activation in response to peanut. Levels of sIgE to peanut and its components could only partially explain differences in clinical reactivity between patients with PA and PS patients. P-sIgG4 levels (P = .023) and P-sIgG4/P-sIgE (P < .001), Ara h 1-sIgG4/Ara h 1-sIgE (P = .050), Ara h 2-sIgG4/Ara h 2-sIgE (P = .004), and Ara h 3-sIgG4/Ara h 3-sIgE (P = .016) ratios were greater in PS children compared with those in children with PA. Peanut-induced activation was inhibited in the presence of plasma from PS children with detectable P-sIgG4 levels and POIT but not from nonsensitized nonallergic children. Depletion of IgG4 from plasma of children with PS (and POIT) sensitized to Ara h 1 to Ara h 3 partially restored peanut-induced mast cell activation (P = .007). CONCLUSIONS: Differences in sIgE levels and allergen specificity could not justify the clinical phenotype in all children with PA and PS children. Blocking IgG4 antibodies provide an additional explanation for the absence of clinical reactivity in PS patients sensitized to major peanut allergens.


Subjects
Allergens/immunology , Arachis/adverse effects , Basophils/immunology , Immunoglobulin G/immunology , Mast Cells/immunology , Peanut Hypersensitivity/immunology , Antibody Specificity , Antigens, Plant , Child , Child, Preschool , Female , Humans , Immune Tolerance , Immunoglobulin E/immunology , Male