Results 1 - 20 of 23
1.
Neuroimage ; 82: 662-70, 2013 Nov 15.
Article in English | MEDLINE | ID: mdl-23684873

ABSTRACT

Neuroimaging data are the raw material of cognitive neuroscience experiments, leading to scientific knowledge about human neurological and psychological disease, language, perception, attention and, ultimately, cognition. The structure of the variables used in the experimental design defines the structure of the data gathered in the experiments; this in turn structures the interpretative assertions that may be presented as experimental conclusions. Representing these assertions, and the experimental data which support them, in a computable way means that they could be used in logical reasoning environments, e.g., for automated meta-analyses or for linking hypotheses and results across different levels of neuroscientific experiments. A crucial first step toward representing neuroimaging results in a clear, computable way is therefore to develop representations for the scientific variables involved in neuroimaging experiments. These representations should be expressive, computable, valid, extensible, and easy to use. They should also leverage existing semantic standards to interoperate easily with other systems. We present an ontology design pattern called the Ontology of Experimental Variables and Values (OoEVV). This is designed to provide a lightweight framework to capture mathematical properties of data, with appropriate 'hooks' to permit linkage to other ontology-driven projects (such as the Ontology for Biomedical Investigations, OBI). We instantiate the OoEVV system with a small number of functional Magnetic Resonance Imaging datasets to demonstrate the system's ability to describe the variables of a neuroimaging experiment. OoEVV is designed to be compatible with the XCEDE neuroimaging data standard for data collection terminology, and with the Cognitive Paradigm Ontology (CogPO) for specific reasoning elements of neuroimaging experimental designs.


Subject(s)
Magnetic Resonance Imaging/methods, Neuroimaging, Software, Humans
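
As a concrete illustration of the OoEVV idea above, here is a minimal sketch encoding an experimental variable and its measurement scale as RDF triples with rdflib. The namespace, class names, and property names are invented placeholders, not the published OoEVV vocabulary.

```python
# Minimal sketch: an fMRI variable with a numeric measurement scale,
# expressed as RDF triples. All OOEVV names here are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

OOEVV = Namespace("http://example.org/ooevv#")  # placeholder namespace

g = Graph()
g.bind("ooevv", OOEVV)

var = OOEVV["boldSignalChange"]
scale = OOEVV["percentSignalChangeScale"]

g.add((var, RDF.type, OOEVV.ExperimentalVariable))
g.add((var, RDFS.label, Literal("BOLD signal change")))
g.add((var, OOEVV.hasMeasurementScale, scale))
g.add((scale, RDF.type, OOEVV.NumericScale))
g.add((scale, OOEVV.hasUnit, Literal("percent")))

print(g.serialize(format="turtle"))
```
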
2.
BMC Bioinformatics ; 12: 351, 2011 Aug 22.
Article in English | MEDLINE | ID: mdl-21859449

ABSTRACT

BACKGROUND: We address the goal of curating observations from published experiments in a generalizable form; reasoning over these observations to generate interpretations and then querying this interpreted knowledge to supply the supporting evidence. We present web-application software as part of the 'BioScholar' project (R01-GM083871) that fully instantiates this process for a well-defined domain: using tract-tracing experiments to study the neural connectivity of the rat brain. RESULTS: The main contribution of this work is to provide the first instantiation of a knowledge representation for experimental observations called 'Knowledge Engineering from Experimental Design' (KEfED), based on experimental variables and their interdependencies. The software has three parts: (a) the KEfED model editor - a design editor for creating KEfED models by drawing a flow diagram of an experimental protocol; (b) the KEfED data interface - a spreadsheet-like tool that permits users to enter experimental data pertaining to a specific model; and (c) a 'neural connection matrix' interface that presents neural connectivity as a table of ordinal connection strengths representing the interpretations of tract-tracing data. This tool also allows the user to view experimental evidence pertaining to a specific connection. BioScholar is built in Flex 3.5. It uses Persevere (a noSQL database) as a flexible data store and PowerLoom® (a mature First Order Logic reasoning system) to execute queries using spatial reasoning over the BAMS neuroanatomical ontology. CONCLUSIONS: We first introduce KEfED as a general approach and describe its possible role as a way of introducing structured reasoning into models of argumentation within new models of scientific publication. We then describe the design and implementation of our example application: the BioScholar software. This is presented as a possible biocuration interface and supplementary reasoning toolkit for a larger, more specialized bioinformatics system: the Brain Architecture Management System (BAMS).


Subject(s)
Brain Mapping/methods, Knowledge Bases, Software, Animals, Computational Biology/methods, Humans, Internet, Rats
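
A rough sketch of the KEfED representation described above: a measurement variable records the upstream parameters its values depend on, and each observation is keyed by the values of those parameters. The class and the toy tract-tracing example are hypothetical, not BioScholar's actual data model.

```python
# Illustrative KEfED-style dependency structure (hypothetical classes).
from dataclasses import dataclass, field

@dataclass
class Variable:
    name: str
    depends_on: list = field(default_factory=list)  # upstream parameters

injection_site = Variable("injection-site")   # where the tracer was placed
atlas_level = Variable("atlas-level")         # which section was examined
label_density = Variable("label-density",
                         depends_on=[injection_site, atlas_level])

# An ordinal connection-strength observation in its parameter context.
observations = {("LHA", "level-28"): "strong",
                ("LHA", "level-30"): "weak"}
index = tuple(v.name for v in label_density.depends_on)
print(f"{label_density.name} indexed by {index}: {observations}")
```
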
3.
IEEE Trans Emerg Top Comput ; 9(1): 316-328, 2021.
Article in English | MEDLINE | ID: mdl-35548703

ABSTRACT

Data science is a field that has developed to enable efficient integration and analysis of increasingly large data sets in many domains. In particular, big data in genetics, neuroimaging, mobile health, and other subfields of biomedical science promises new insights but also poses challenges. To address these challenges, the National Institutes of Health launched the Big Data to Knowledge (BD2K) initiative, including a Training Coordinating Center (TCC) tasked with developing a resource for personalized data science training for biomedical researchers. The BD2K TCC web portal is powered by ERuDIte, the Educational Resource Discovery Index, which collects training resources for data science, including online courses, videos of tutorials and research talks, textbooks, and other web-based materials. While the availability of so many potential learning resources is exciting, they are highly heterogeneous in quality, difficulty, format, and topic, making the field intimidating to enter and difficult to navigate. Moreover, data science is rapidly evolving, so there is a constant influx of new materials and concepts. We leverage data science techniques to build ERuDIte itself, using data extraction, data integration, machine learning, information retrieval, and natural language processing to automatically collect, integrate, describe, and organize existing online resources for learning data science.

4.
Database (Oxford) ; 2021, 2021 07 09.
Article in English | MEDLINE | ID: mdl-34244718

ABSTRACT

The Ontology for Biomedical Investigations (OBI) underwent a focused review of assay term annotations, logic and hierarchy, with the goal of improving and standardizing these terms. As a result, inconsistencies in W3C Web Ontology Language (OWL) expressions were identified and corrected, standardized design patterns were developed, and a formalized template for maintaining them was introduced. We describe here this informative and productive process, detailing the specific benefits and obstacles for OBI and the universal lessons for similar projects.


Subject(s)
Biological Ontologies, Language, Reference Standards
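
A minimal sketch of what a template-driven design pattern can look like in practice: each spreadsheet row is expanded into the same logical definition, so all assay terms stay structurally consistent. The pattern, column names, and term IDs below are invented for illustration and are not OBI's actual template.

```python
# Hypothetical template expansion: every row yields the same Manchester-
# style OWL equivalence axiom. Pattern and IDs are illustrative only.
ASSAY_PATTERN = (
    "Class: {id}\n"
    "  EquivalentTo: 'assay'\n"
    "    and ('has specified output' some {output})\n"
    "    and ('evaluates' some {analyte})"
)

rows = [
    {"id": "ex:GlucoseAssay", "output": "'concentration datum'",
     "analyte": "'glucose'"},
    {"id": "ex:SequencingAssay", "output": "'sequence datum'",
     "analyte": "'DNA'"},
]

for row in rows:
    print(ASSAY_PATTERN.format(**row), end="\n\n")
```
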
6.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30938776

ABSTRACT

We investigate the application of deep learning to biocuration tasks that involve classification of text associated with biomedical evidence in primary research articles. We developed a large-scale corpus of molecular papers derived from PubMed and PubMed Central open access records and used it to train deep learning word embeddings under the GloVe, FastText and ELMo algorithms. We applied those models to a distantly supervised method-classification task based on text from figure captions or fragments surrounding references to figures in the main text, using a variety of models and parameterizations. We then developed document classification (triage) methods for molecular interaction papers, using deep learning mechanisms of attention to aggregate classification-based decisions over selected paragraphs in the document. We obtained triage performance with an accuracy of 0.82 using a combined convolutional neural network and bi-directional long short-term memory architecture, augmented by attention, to produce a single triage decision. With this work, we hope to encourage developers of biocuration systems to apply deep learning methods to their specialized tasks by repurposing large-scale word embeddings for their data.


Subject(s)
Deep Learning, Theoretical Models, Publications, Neural Networks (Computer), Semantics
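
As a sketch of the kind of architecture the abstract above describes, the PyTorch model below runs token embeddings through a convolutional layer and a bi-directional LSTM, then uses a learned attention weighting to pool the sequence into a single triage decision. All layer sizes are placeholder hyperparameters; the paper's exact configuration is not given here.

```python
# Sketch of a CNN + BiLSTM + attention triage classifier (hyperparameters
# are placeholders, not the published configuration).
import torch
import torch.nn as nn

class TriageClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, channels=64, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=5, padding=2)
        self.lstm = nn.LSTM(channels, hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # scores each position
        self.out = nn.Linear(2 * hidden, 2)    # relevant / not relevant

    def forward(self, tokens):                      # (batch, seq)
        x = self.embed(tokens)                      # (batch, seq, emb)
        x = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.lstm(x)                         # (batch, seq, 2*hidden)
        w = torch.softmax(self.attn(h).squeeze(-1), dim=1)  # attention
        pooled = torch.bmm(w.unsqueeze(1), h).squeeze(1)    # weighted sum
        return self.out(pooled)                     # (batch, 2)

model = TriageClassifier(vocab_size=20000)
logits = model(torch.randint(0, 20000, (8, 120)))   # 8 docs, 120 tokens each
print(logits.shape)                                 # torch.Size([8, 2])
```
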
7.
Adv Neurobiol ; 21: 101-193, 2018.
Article in English | MEDLINE | ID: mdl-30334222

ABSTRACT

This article focuses on approaches to linking transcriptomic, proteomic, and peptidomic datasets mined from brain tissue to the original locations within the brain from which they are derived, using digital atlas mapping techniques. We use, as an example, the transcriptomic, proteomic and peptidomic analyses conducted in the mammalian hypothalamus. Following a brief historical overview, we highlight studies that have mined biochemical and molecular information from the hypothalamus and then lay out a strategy for how these data can be linked spatially to the mapped locations in a canonical brain atlas from which the data come, thereby allowing researchers to integrate these data with other datasets across multiple scales. A key methodology that enables atlas-based mapping of extracted datasets, laser-capture microdissection, is discussed in detail, with a view to how this technology serves as a bridge between systems biology and systems neuroscience.


Subject(s)
Hypothalamus, Memory, Proteomics, Refugees, Animals, Brain, Humans, Hypothalamus/metabolism, Memory/physiology, Refugees/psychology, Systems Biology
8.
Pac Symp Biocomput ; 23: 292-303, 2018.
Article in English | MEDLINE | ID: mdl-29218890

ABSTRACT

The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and to open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific lectures and tutorials. ERuDIte now indexes over 9,500 resources. Given the richness of online training materials and the constant evolution of biomedical data science, computational methods applying information retrieval, natural language processing, and machine learning techniques are required - in effect, using data science to inform training in data science. In so doing, the TCC seeks to democratize novel insights and discoveries brought forth via large-scale data science training.


Subject(s)
Computational Biology/education, Computational Biology/standards, Data Mining, Distance Education/methods, Humans, Information Storage and Retrieval, Internet, Machine Learning, Metadata/standards, National Institutes of Health (U.S.), Natural Language Processing, United States
9.
BMC Bioinformatics ; 7: 531, 2006 Dec 13.
Article in English | MEDLINE | ID: mdl-17166289

ABSTRACT

BACKGROUND: Anatomical studies of neural circuitry describing the basic wiring diagram of the brain produce intrinsically spatial, highly complex data of great value to the neuroscience community. Published neuroanatomical atlases provide a spatial framework for these studies. We have built an informatics framework based on these atlases for the representation of neuroanatomical knowledge. This framework not only captures current methods of anatomical data acquisition and analysis, it allows these studies to be collated, compared and synthesized within a single system. RESULTS: We have developed an atlas-viewing application ('NeuARt II') in the Java language with unique functional properties. These include the ability to use copyrighted atlases as templates within which users may view, save and retrieve data-maps and annotate them with volumetric delineations. NeuARt II also permits users to view multiple levels on multiple atlases at once. Each data-map in this system is simply a stack of vector images with one image per atlas level, so any set of accurate drawings made onto a supported atlas (in vector graphics format) could be uploaded into NeuARt II. Presently the database is populated with a corpus of high-quality neuroanatomical data from the laboratory of Dr Larry Swanson (consisting of 64 highly detailed maps of PHAL tract-tracing experiments, made up of 1039 separate drawings that were published in 27 primary research publications over 17 years). Herein we take selective examples from these data to demonstrate the features of NeuARt II. Our informatics tool permits users to browse, query and compare these maps. The NeuARt II tool operates within a bioinformatics knowledge management platform (called 'NeuroScholar') either as a standalone application or as a plug-in. CONCLUSION: Anatomical localization is fundamental to neuroscientific work and atlases provide an easily understood framework that is widely used by neuroanatomists and non-neuroanatomists alike. NeuARt II, the neuroinformatics tool presented here, provides an accurate and powerful way of representing neuroanatomical data in the context of commonly used brain atlases for visualization, comparison and analysis. Furthermore, it provides a framework that supports the delivery and manipulation of mapped data either as a standalone system or as a component in a larger knowledge management system.


Subject(s)
Artistic Anatomy/methods, Three-Dimensional Imaging/methods, Medical Illustration, Anatomic Models, Neuroanatomy/methods, Software, User-Computer Interface, Computer Graphics, Periodicals as Topic
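
The 'stack of vector images' structure described above can be pictured in a few lines of code. The DataMap class and file names below are illustrative placeholders, not NeuARt II's actual Java implementation.

```python
# Sketch of a data-map: one vector drawing per atlas level, keyed by
# level number. Names and files are hypothetical.
from dataclasses import dataclass, field

@dataclass
class DataMap:
    experiment_id: str          # e.g. a PHAL tract-tracing case
    drawings: dict = field(default_factory=dict)  # atlas level -> SVG file

    def add_level(self, level: int, svg_file: str):
        self.drawings[level] = svg_file

    def levels(self):
        return sorted(self.drawings)

dm = DataMap("PHAL-case-42")    # hypothetical case identifier
dm.add_level(28, "phal42_level28.svg")
dm.add_level(29, "phal42_level29.svg")
print(dm.levels())              # -> [28, 29]
```
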
10.
Neuroinformatics ; 4(2): 139-62, 2006.
Article in English | MEDLINE | ID: mdl-16845166

ABSTRACT

Scientists continually relate information from the published literature to their current research. The challenge of this essential and time-consuming activity increases as the body of scientific literature continues to grow. In an attempt to lessen the challenge, we have developed an Electronic Laboratory Notebook (ELN) application. Our ELN functions as a component of another application we have developed, an open-source knowledge management system for the neuroscientific literature called NeuroScholar (http://www.neuroscholar.org/). Scanned notebook pages, images, and data files are entered into the ELN, where they can be annotated, organized, and linked to similarly annotated excerpts from the published literature within NeuroScholar. Associations between these knowledge constructs are created within a dynamic node-and-edge user interface to produce an interactive, adaptable knowledge base. We demonstrate the ELN's utility by using it to organize data and literature related to our studies of the neuroendocrine hypothalamic paraventricular nucleus (PVH). We also discuss how the ELN could be applied to model other neuroendocrine systems; as an example, we look at the role of PVH stressor-responsive neurons in the context of their involvement in the suppression of reproductive function. We present this application to the community as open-source software and invite contributions to its development.


Subject(s)
Electronics/methods, Information Storage and Retrieval/statistics & numerical data, Knowledge Bases, Neuroendocrinology/instrumentation, Neuroendocrinology/methods, Animals, Database Management Systems, Humans, Information Storage and Retrieval/methods, Paraventricular Hypothalamic Nucleus/anatomy & histology, Paraventricular Hypothalamic Nucleus/physiology, Programming Languages, User-Computer Interface
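
A toy sketch of the node-and-edge knowledge model described above, using networkx: notebook records and literature excerpts become nodes, and annotated associations become edges. The node identifiers and relation label are invented for illustration.

```python
# Hypothetical ELN knowledge graph: notebook assets linked to annotated
# literature excerpts via labeled associations.
import networkx as nx

kb = nx.Graph()
kb.add_node("notebook:scan-017", kind="scanned notebook page")
kb.add_node("lit:excerpt-0042", kind="published literature excerpt")
kb.add_edge("notebook:scan-017", "lit:excerpt-0042",
            relation="supports interpretation of PVH projections")

for u, v, data in kb.edges(data=True):
    print(u, "--", data["relation"], "->", v)
```
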
11.
Article in English | MEDLINE | ID: mdl-27580922

ABSTRACT

Automated machine-reading biocuration systems typically use sentence-by-sentence information extraction to construct meaning representations for use by curators. This does not directly reflect the typical discourse structure used by scientists to construct an argument from the experimental data available within an article, and is therefore less likely to correspond to representations typically used in biomedical informatics systems (let alone to the mental models that scientists have). In this study, we develop Natural Language Processing methods to locate, extract, and classify the individual passages of text from articles' Results sections that refer to experimental data. In our domain of interest (molecular biology studies of cancer signal transduction pathways), individual articles may contain as many as 30 small-scale individual experiments describing a variety of findings, upon which authors base their overall research conclusions. Our system automatically classifies discourse segments in these texts into seven categories (fact, hypothesis, problem, goal, method, result, implication) with an F-score of 0.68. These segments describe the essential building blocks of scientific discourse, serving to (i) provide context for each experiment, (ii) report experimental details and (iii) explain the data's meaning in context. We evaluate our system on text passages from articles that were curated in molecular biology databases (the Pathway Logic Datum repository and the MINT and IntAct molecular interaction databases), linking individual experiments in articles to the type of assay used (coprecipitation, phosphorylation, translocation, etc.). We use supervised machine learning techniques on text passages containing unambiguous references to experiments to obtain baseline F1 scores of 0.59 for MINT, 0.71 for IntAct and 0.63 for Pathway Logic. Although preliminary, these results support the notion that targeting information extraction methods to experimental results could provide accurate, automated methods for biocuration. We also suggest the need for finer-grained curation of the experimental methods used when constructing molecular biology databases.


Subject(s)
Data Mining/methods, Factual Databases, Automated Data Processing/methods, Machine Learning, Natural Language Processing, Animals, Humans
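
For the passage-classification task above, a simple supervised baseline can be assembled with scikit-learn: TF-IDF features and a linear SVM over the discourse categories. The training snippets below are toy examples, not the study's corpus.

```python
# Baseline discourse-type classifier: TF-IDF + linear SVM. The three
# training passages and labels are toy stand-ins for a real corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

passages = [
    "Cells were lysed and immunoprecipitated with anti-RAF antibody.",
    "Figure 2B shows increased phosphorylation after EGF stimulation.",
    "These data suggest that RAS activation requires membrane anchoring.",
]
labels = ["method", "result", "implication"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(passages, labels)
print(clf.predict(["Lysates were probed with anti-ERK antibody."]))
```
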
12.
J Comp Neurol ; 493(3): 412-38, 2005 Dec 19.
Article in English | MEDLINE | ID: mdl-16261534

ABSTRACT

The L-shaped anterior zone of the lateral hypothalamic area's subfornical region (LHAsfa) is delineated by a pontine nucleus incertus input. Functional evidence suggests that the subfornical region and nucleus incertus modulate foraging and defensive behaviors, although subfornical region connections are poorly understood. A high-resolution Phaseolus vulgaris-leucoagglutinin (PHAL) structural analysis of the LHAsfa neuron population's overall axonal projection pattern is presented here. The strongest LHAsfa targets are in the interbrain and cerebral hemisphere. The former include inputs to anterior hypothalamic nucleus, dorsomedial part of the ventromedial nucleus, and ventral region of the dorsal premammillary nucleus (defensive behavior control system components), and to lateral habenula and dorsal region of the dorsal premammillary nucleus (foraging behavior control system components). The latter include massive inputs to lateral and medial septal nuclei (septo-hippocampal system components), and inputs to bed nuclei of the stria terminalis posterior division related to the defensive behavior system, intercalated amygdalar nucleus (projecting to central amygdalar nucleus), and posterior part of the basomedial amygdalar nucleus. LHAsfa vertical and horizontal limb basic projection patterns are similar, although each preferentially innervates certain terminal fields. Lateral hypothalamic area regions immediately medial, lateral, and caudal to the LHAsfa each generate quite distinct projection patterns. Combined with previous evidence that major sources of LHAsfa neural inputs include the parabrachial nucleus (nociceptive information), defensive and foraging behavior system components, and the septo-hippocampal system, the present results suggest that the LHAsfa helps match adaptive behavioral responses (either defensive or foraging) to current internal motivational status and external environmental conditions.


Subject(s)
Cerebral Cortex/cytology, Diencephalon/cytology, Lateral Hypothalamic Area/cytology, Neural Pathways/cytology, Efferent Neurons/cytology, Subfornical Organ/cytology, Animals, Male, Pons/cytology, Rats, Wistar Rats
13.
Neuroinformatics ; 1(1): 81-109, 2003.
Article in English | MEDLINE | ID: mdl-15055395

ABSTRACT

Within this paper, we describe a neuroinformatics project (called "NeuroScholar," http://www.neuroscholar.org/) that enables researchers to examine, manage, manipulate, and use the information contained within the published neuroscientific literature. The project is built within a multi-level, multi-component framework constructed with the use of software engineering methods that themselves provide code-building functionality for neuroinformaticians. We describe the different software layers of the system. First, we present a hypothetical usage scenario illustrating how NeuroScholar permits users to address large-scale questions in a way that would otherwise be impossible. We do this by applying NeuroScholar to a "real-world" neuroscience question: How is stress-related information processed in the brain? We then explain how the overall design of NeuroScholar enables the system to work and illustrate different components of the user interface. We then describe the knowledge management strategy we use to store interpretations. Finally, we describe the software engineering framework we have devised (called the "View-Primitive-Data Model framework," [VPDMf]) to provide an open-source, accelerated software development environment for the project. We believe that NeuroScholar will be useful to experimental neuroscientists by helping them interact with the primary neuroscientific literature in a meaningful way, and to neuroinformaticians by providing them with useful, affordable software engineering tools.


Subject(s)
Artificial Intelligence, Neurosciences, Brain/physiopathology, Brain Mapping, Information Systems, Neurons/physiology, Paraventricular Hypothalamic Nucleus/cytology, Paraventricular Hypothalamic Nucleus/physiopathology, Psychological Stress/physiopathology, Terminology as Topic
14.
PeerJ ; 2: e483, 2014.
Article in English | MEDLINE | ID: mdl-25097821

ABSTRACT

Background. Unlike full reading, 'skim-reading' involves the process of looking quickly over information in an attempt to cover more material whilst still being able to retain a superficial view of the underlying content. Within this work, we specifically emulate this natural human activity by providing a dynamic graph-based view of entities automatically extracted from text. For the extraction, we use shallow parsing, co-occurrence analysis and semantic similarity computation techniques. Our main motivation is to assist biomedical researchers and clinicians in coping with the increasingly large number of potentially relevant articles continually being published in the life sciences. Methods. To construct the high-level network overview of articles, we extract weighted binary statements from the text. We consider two types of these statements, co-occurrence and similarity, both organised in the same distributional representation (i.e., in a vector-space model). For the co-occurrence weights, we use point-wise mutual information, which indicates the degree of non-random association between two co-occurring entities. For computing the similarity statement weights, we use cosine distance based on the relevant co-occurrence vectors. These statements are used to build fuzzy indices of terms, statements and provenance article identifiers, which support fuzzy querying and subsequent result ranking. These indexing and querying processes are then used to construct a graph-based interface for searching and browsing entity networks extracted from articles, as well as articles relevant to the networks being browsed. Last but not least, we describe a methodology for automated experimental evaluation of the presented approach. The method uses formal comparison of the graphs generated by our tool to relevant gold standards based on manually curated PubMed, TREC challenge and MeSH data. Results. We provide a web-based prototype (called 'SKIMMR') that generates a network of inter-related entities from a set of documents, which a user may explore through our interface. When a particular area of the entity network looks interesting to a user, the tool displays the documents most relevant to the entities of interest currently shown in the network. We present this as a methodology for browsing a collection of research articles. To illustrate the practical applicability of SKIMMR, we present examples of its use in the domains of Spinal Muscular Atrophy and Parkinson's Disease. Finally, we report on the results of experimental evaluation using the two domains and one additional dataset based on the TREC challenge. The results show how the presented method for machine-aided skim reading outperforms tools like PubMed regarding focused browsing and informativeness of the browsing context.
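
The two statement weights defined above are easy to state in code: point-wise mutual information over co-occurrence counts, and cosine similarity over co-occurrence vectors. The counts below are toy values, not data from the paper.

```python
# Worked sketch of the two SKIMMR-style weights (toy numbers).
import math

def pmi(n_xy, n_x, n_y, n_total):
    """PMI = log( P(x,y) / (P(x) * P(y)) ); > 0 means non-random association."""
    p_xy = n_xy / n_total
    return math.log(p_xy / ((n_x / n_total) * (n_y / n_total)))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Two terms co-occur in 40 of 1000 abstracts; each occurs 120 and 80 times.
print(pmi(n_xy=40, n_x=120, n_y=80, n_total=1000))  # ~1.43, non-random
print(cosine([3, 0, 2, 5], [1, 0, 4, 4]))           # similarity of term vectors
```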

15.
Front Neuroinform ; 7: 38, 2013.
Article in English | MEDLINE | ID: mdl-24399964

ABSTRACT

The frequency and volume of newly published scientific literature are quickly making manual maintenance of publicly available databases of primary data unrealistic and costly. Although machine learning (ML) can be useful for developing automated approaches to identifying scientific publications containing relevant information for a database, developing such tools necessitates manually annotating an unrealistic number of documents. One approach to this problem, active learning (AL), builds classification models by iteratively identifying documents that provide the most information to a classifier. Although this approach has been shown to be effective for related problems, in the context of scientific database curation it falls short. We present Virk, an AL system that, while being trained, simultaneously learns a classification model and identifies documents having information of interest for a knowledge base. Our approach uses a support vector machine (SVM) classifier with input features derived from neuroscience-related publications from the primary literature. Using our approach, we were able to increase the size of the Neuron Registry, a knowledge base of neuron-related information, by 90% in 3 months. Using standard biocuration methods, it would have taken between 1 and 2 years to make the same number of contributions to the Neuron Registry. Here, we describe the system pipeline in detail, and evaluate its performance against other approaches to sampling in AL.
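
A minimal sketch of the uncertainty-sampling loop at the heart of a system like the one described above, assuming documents are already reduced to feature vectors: train an SVM, then queue the pool documents closest to the decision boundary for curator labeling. Feature extraction and the stopping criterion are omitted, and the data below are random placeholders.

```python
# One round of SVM-based active learning via uncertainty sampling.
import numpy as np
from sklearn.svm import SVC

def active_learning_round(X_labeled, y_labeled, X_pool, batch=10):
    clf = SVC(kernel="linear").fit(X_labeled, y_labeled)
    # Small |distance to hyperplane| = most uncertain = most informative.
    margins = np.abs(clf.decision_function(X_pool))
    query_idx = np.argsort(margins)[:batch]   # send these to the curator
    return clf, query_idx

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 5))              # placeholder feature vectors
y_lab = np.array([0, 1] * 10)                 # placeholder relevance labels
X_pool = rng.normal(size=(200, 5))            # unlabeled pool
clf, to_label = active_learning_round(X_lab, y_lab, X_pool)
print(to_label)                               # indices for curator review
```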

16.
Source Code Biol Med ; 7(1): 7, 2012 May 28.
Article in English | MEDLINE | ID: mdl-22640904

ABSTRACT

BACKGROUND: The Portable Document Format (PDF) is the most commonly used file format for online scientific publications. The absence of effective means to extract text from these PDF files in a layout-aware manner presents a significant challenge for developers of biomedical text mining or biocuration informatics systems that use published literature as an information source. In this paper we introduce the 'Layout-Aware PDF Text Extraction' (LA-PDFText) system to facilitate accurate extraction of text from PDF files of research articles for use in text mining applications. RESULTS: Our paper describes the construction and performance of an open source system that extracts text blocks from PDF-formatted full-text research articles and classifies them into logical units based on rules that characterize specific sections. The LA-PDFText system focuses only on the textual content of the research articles and is meant as a baseline for further experiments into more advanced extraction methods that handle multi-modal content, such as images and graphs. The system works in a three-stage process: (1) detecting contiguous text blocks using spatial layout processing, (2) classifying text blocks into rhetorical categories using a rule-based method, and (3) stitching classified text blocks together in the correct order, resulting in the extraction of text from section-wise grouped blocks. We show that our system can identify text blocks and classify them into rhetorical categories with Precision = 0.96, Recall = 0.89 and F1 = 0.91. We also present an evaluation of the accuracy of the block detection algorithm used in step 2. Additionally, we have compared the accuracy of the text extracted by LA-PDFText to the text from the Open Access subset of PubMed Central, and then compared this accuracy with that of the text extracted by the PDF2Text system, commonly used to extract text from PDF. Finally, we discuss preliminary error analysis for our system and identify further areas of improvement. CONCLUSIONS: LA-PDFText is an open-source tool for accurately extracting text from full-text scientific articles. The release of the system is available at http://code.google.com/p/lapdftext/.
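
Stage 2 of the pipeline described above, classifying text blocks into rhetorical categories with rules, might look like the sketch below. These layout and keyword rules are invented stand-ins for the published rule set.

```python
# Toy rule-based block classifier: uses relative font size and leading
# keywords. The rules are illustrative, not LA-PDFText's actual rules.
def classify_block(text, font_size, body_font_size):
    first = text.strip().lower()
    if font_size > body_font_size:
        return "section-heading"
    for name in ("abstract", "introduction", "methods", "results",
                 "discussion", "references"):
        if first.startswith(name):
            return name
    if first.startswith("figure") or first.startswith("fig."):
        return "figure-caption"
    return "body-text"

print(classify_block("Figure 3. PHAL labeling in ...", 9.0, 10.0))  # figure-caption
print(classify_block("RESULTS", 14.0, 10.0))                        # section-heading
```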

17.
J Biomed Semantics ; 3(1): 12, 2012 Dec 18.
Article in English | MEDLINE | ID: mdl-23249650

ABSTRACT

Vaccines and drugs have contributed to dramatic improvements in public health worldwide. Over the last decade, there have been efforts in developing biomedical ontologies that represent various areas associated with vaccines and drugs. These ontologies, combined with existing health and clinical terminology systems (e.g., SNOMED, RxNorm, NDF-RT, MedDRA, VO, OAE, and AERO), could play significant roles in clinical and translational research. The first "Vaccine and Drug Ontology in the Study of Mechanism and Effect" workshop (VDOSME 2012) provided a platform for discussing problems and solutions in the development and application of biomedical ontologies in representing and analyzing vaccines/drugs, vaccine/drug administrations, vaccine/drug-induced immune responses (including positive host responses and adverse events), and similar topics. The workshop covered two main areas: (i) ontologies of vaccines, of drugs, and of studies thereof; and (ii) analysis of administration, mechanism and effect in terms of representations based on such ontologies. Six full-length papers included in this thematic issue focus on ontology representation and time analysis of vaccine/drug administration and host responses (including positive immune responses and adverse events), vaccine and drug adverse event text mining, and ontology-based Semantic Web applications. The workshop, together with the follow-up activities and personal meetings, provided a wonderful platform for the researchers and scientists in the vaccine and drug communities to present research progress, share ideas, address questions, and promote collaborations for better representation and analysis of vaccine- and drug-related terminologies and clinical and research data.

18.
Database (Oxford) ; 2012: bas020, 2012.
Article in English | MEDLINE | ID: mdl-22513129

ABSTRACT

Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.


Subject(s)
Biomedical Research, Data Mining, Natural Language Processing, Workflow, Animals, Factual Databases, Humans
20.
Front Neuroinform ; 5: 24, 2011.
Article in English | MEDLINE | ID: mdl-22053155

ABSTRACT

This paper describes software for neuroanatomical knowledge synthesis based on neural connectivity data. This software supports a mature methodology developed since the early 1990s. Over this time, the Swanson laboratory at USC has generated an account of the neural connectivity of the sub-structures of the hypothalamus, amygdala, septum, hippocampus, and bed nucleus of the stria terminalis. This is based on neuroanatomical data maps drawn into a standard brain atlas by experts. In earlier work, we presented an application for visualizing and comparing anatomical macro connections using the Swanson third edition atlas as a framework for accurate registration. Here we describe major improvements to the NeuARt application based on the incorporation of a knowledge representation of experimental design. We also present improvements in the interface and features of the data mapping components within a unified web-application. As a step toward developing an accurate sub-regional account of neural connectivity, we provide navigational access between the data maps and a semantic representation of area-to-area connections that they support. We do so using the "Knowledge Engineering from Experimental Design" (KEfED) approach, which is based on experimental variables. We have extended the underlying KEfED representation of tract-tracing experiments by incorporating the definition of a neuroanatomical data map as a measurement variable in the study design. This paper describes the software design of a web-application that allows anatomical data sets to be described within a standard experimental context and thus indexed by non-spatial experimental design features.
