1.
Front Hum Neurosci ; 10: 647, 2016.
Article in English | MEDLINE | ID: mdl-28123359

ABSTRACT

There is increasing interest in real-time brain-computer interfaces (BCIs) for the passive monitoring of human cognitive state, including cognitive workload. Too often, however, effective BCIs based on machine learning techniques may function as "black boxes" that are difficult to analyze or interpret. In an effort toward more interpretable BCIs, we studied a family of N-back working memory tasks using a machine learning model, Gaussian Process Regression (GPR), which was both powerful and amenable to analysis. Participants performed the N-back task with three stimulus variants, auditory-verbal, visual-spatial, and visual-numeric, each at three working memory loads. GPR models were trained and tested on EEG data from all three task variants combined, in an effort to identify a model that could be predictive of mental workload demand regardless of stimulus modality. To provide a comparison for GPR performance, a model was additionally trained using multiple linear regression (MLR). The GPR model was effective when trained on individual participant EEG data, resulting in an average standardized mean squared error (sMSE) between true and predicted N-back levels of 0.44. In comparison, the MLR model using the same data resulted in an average sMSE of 0.55. We additionally demonstrate how GPR can be used to identify which EEG features are relevant for prediction of cognitive workload in an individual participant. A fraction of EEG features accounted for the majority of the model's predictive power; using only the top 25% of features performed nearly as well as using 100% of features. Subsets of features identified by linear models (ANOVA) were not as efficient as subsets identified by GPR. This raises the possibility of BCIs that require fewer model features while capturing all of the information needed to achieve high predictive accuracy.
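
The sMSE figures quoted above (0.44 for GPR, 0.55 for MLR) are easier to read with the metric written out. The abstract does not define sMSE, so the sketch below assumes the common convention of dividing the mean squared error by the variance of the true targets; under that assumption a model that always predicts the mean N-back level scores 1.0. The function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import fmean, pvariance


def smse(y_true, y_pred):
    """Standardized MSE: mean squared error divided by the variance of y_true.

    Assumed definition; the abstract does not state the exact normalization.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mse = fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))
    variance = pvariance(y_true)
    if variance == 0:
        raise ValueError("y_true has zero variance; sMSE is undefined")
    return mse / variance


# Hypothetical N-back levels and model predictions (illustration only).
true_levels = [1, 2, 3, 1, 2, 3]
predicted_levels = [1.4, 2.1, 2.5, 1.2, 1.8, 2.9]

# Trivial baseline: always predict the mean level.
mean_baseline = [fmean(true_levels)] * len(true_levels)

model_score = smse(true_levels, predicted_levels)
baseline_score = smse(true_levels, mean_baseline)
```

Lower values are better, so under this convention both reported models do better than the mean-only baseline.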

2.
Pac Symp Biocomput ; : 281-91, 2007.
Article in English | MEDLINE | ID: mdl-17990499

ABSTRACT

We have developed a challenge task for the second BioCreAtIvE (Critical Assessment of Information Extraction in Biology) that requires participating systems to provide lists of the EntrezGene (formerly LocusLink) identifiers for all human genes and proteins mentioned in a MEDLINE abstract. We are distributing 281 annotated abstracts and another 5,000 noisily annotated abstracts along with a gene name lexicon to participants. We have performed a series of baseline experiments to better characterize this dataset and form a foundation for participant exploration.


Subjects
Databases, Genetic , Databases, Protein , MEDLINE , Computational Biology , Genome, Human , Genomics/statistics & numerical data , Humans , Proteomics/statistics & numerical data
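
The task in this record takes an abstract as input and returns a list of EntrezGene identifiers. A minimal sketch of a lexicon-lookup baseline follows. It is not the baseline the authors ran. The lexicon is a toy with three entries, the function name is hypothetical, and the identifiers are given for illustration and should be checked against EntrezGene before any real use.

```python
import re

# Toy lexicon mapping lowercased gene names to EntrezGene identifiers.
# Illustrative only; a real lexicon would be the one distributed with the task.
TOY_LEXICON = {
    "tp53": "7157",
    "p53": "7157",
    "brca1": "672",
}


def normalize_genes(abstract, lexicon):
    """Return the sorted, de-duplicated identifiers of lexicon names found in the text."""
    tokens = re.findall(r"[A-Za-z0-9-]+", abstract.lower())
    found = {lexicon[token] for token in tokens if token in lexicon}
    return sorted(found, key=int)


example = "Mutations in p53 and TP53 co-occur with BRCA1 loss."
gene_ids = normalize_genes(example, TOY_LEXICON)
```

Exact token matching like this misses multi-word names and cannot resolve ambiguous symbols, which is part of what makes the task hard.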
3.
BMC Bioinformatics ; 6 Suppl 1: S12, 2005.
Article in English | MEDLINE | ID: mdl-15960824

ABSTRACT

BACKGROUND: We prepared and evaluated training and test materials for an assessment of text mining methods in molecular biology. The goal of the assessment was to evaluate the ability of automated systems to generate a list of unique gene identifiers from PubMed abstracts for the three model organisms Fly, Mouse, and Yeast. This paper describes the preparation and evaluation of answer keys for training and testing. These consisted of lists of normalized gene names found in the abstracts, generated by adapting the gene list for the full journal articles found in the model organism databases. For the training dataset, the gene list was pruned automatically to remove gene names not found in the abstract; for the testing dataset, it was further refined by manual annotation by annotators provided with guidelines. A critical step in interpreting the results of an assessment is to evaluate the quality of the data preparation. We did this by careful assessment of interannotator agreement and the use of answer pooling of participant results to improve the quality of the final testing dataset. RESULTS: Interannotator analysis on a small dataset showed that our gene lists for Fly and Yeast were good (87% and 91% three-way agreement) but the Mouse gene list had many conflicts (mostly omissions), which resulted in errors (69% interannotator agreement). By comparing and pooling answers from the participant systems, we were able to add an additional check on the test data; this allowed us to find additional errors, especially in Mouse. This led to a 1% change in the Yeast and Fly "gold standard" answer keys, but to an 8% change in the Mouse answer key. CONCLUSION: We found that clear annotation guidelines are important, along with careful interannotator experiments, to validate the generated gene lists. Also, abstracts alone are a poor resource for identifying genes in a paper, containing only a fraction of genes mentioned in the full text (25% for Fly, 36% for Mouse). We found that there are intrinsic differences between the model organism databases related to the number of synonymous terms and also to curation criteria. Finally, we found that answer pooling was much faster and allowed us to identify more conflicting genes than interannotator analysis.


Subjects
Computational Biology/methods , Databases, Factual/classification , Writing , Animals , Computational Biology/standards , Databases, Factual/standards , Information Storage and Retrieval/classification , Information Storage and Retrieval/standards
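
The two quality checks described in this record, three-way interannotator agreement and answer pooling, can be sketched as set operations. The abstract does not give the exact formulas, so both definitions here are assumptions. Agreement is taken as the share of all annotated genes that every annotator listed. Pooling flags genes proposed by at least `min_votes` systems but absent from the answer key. All names and gene labels are hypothetical.

```python
from collections import Counter


def three_way_agreement(a, b, c):
    """Fraction of all annotated genes that all three annotators listed (assumed definition)."""
    union = a | b | c
    if not union:
        return 1.0  # nothing annotated by anyone counts as full agreement
    return len(a & b & c) / len(union)


def pool_candidates(answer_key, system_outputs, min_votes):
    """Genes proposed by at least min_votes systems that are missing from the answer key."""
    votes = Counter(gene for output in system_outputs for gene in set(output))
    return sorted(
        gene for gene, count in votes.items()
        if count >= min_votes and gene not in answer_key
    )


# Hypothetical annotations for one abstract.
annotator_1 = {"g1", "g2", "g3"}
annotator_2 = {"g1", "g2"}
annotator_3 = {"g1", "g2", "g4"}
agreement = three_way_agreement(annotator_1, annotator_2, annotator_3)

# Hypothetical answer key and participant system outputs.
key = {"g1", "g2"}
systems = [{"g1", "g3"}, {"g3", "g2"}, {"g3", "g5"}]
to_review = pool_candidates(key, systems, min_votes=2)
```

In this sketch the pooled candidates are only queued for manual review. They are not added to the key automatically.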
4.
J Comp Neurol ; 471(3): 333-51, 2004 Apr 05.
Article in English | MEDLINE | ID: mdl-14991565

ABSTRACT

Turtle visual cortex has three layers and receives direct input from the dorsolateral geniculate complex of the thalamus. The outer layer 1 contains several populations of interneurons, but their physiological properties have not been characterized. This study used intracellular recording methods followed by filling with Neurobiotin to characterize the morphology and physiology of two populations of layer 1 interneurons. Subpial cells have somata positioned in the outer third of layer 1 and dendrites confined within the band of geniculate afferents that runs from lateral to medial across visual cortex. Their dendrites are composed of a sequence of many beads or varicosities separated by intervaricose segments. They have membrane time constants of tau(o) = 45.5 +/- 5.2 ms and electrotonic lengths of 1.1 +/- 0.2. Subpial cells show spike rate adaptation in response to intracellular current pulses. Stellate cells have somata located in the inner two-thirds of layer 1 and, less frequently, in layers 2 and 3. Their dendrites extend in a stellate configuration across the cortex. They are smooth or sparsely spiny, but never bear distinct varicosities. They have membrane time constants of tau(o) = 155.1 +/- 12 ms and electrotonic lengths of 3.8 +/- 0.5. They show little spike rate adaptation in response to intracellular current pulses. The positions of the two populations of cells in visual cortex and their physiological properties suggest that subpial cells may participate in a feedforward inhibitory pathway to pyramidal cells, whereas stellate cells are involved in feedback inhibition to pyramidal cells.


Subjects
Interneurons/cytology , Interneurons/physiology , Turtles/physiology , Visual Cortex/cytology , Visual Cortex/physiology , Action Potentials/physiology , Animals , Pia Mater/cytology , Pia Mater/physiology , Turtles/anatomy & histology
5.
Bioinformatics ; 18(11): 1515-22, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12424124

ABSTRACT

MOTIVATION: Mining the biomedical literature for references to genes and proteins always involves a tradeoff between high precision with false negatives, and high recall with false positives. Having a reliable method for assessing the relevance of literature mining results is crucial to finding ways to balance precision and recall, and for subsequently building automated systems to analyze these results. We hypothesize that abstracts and titles that discuss the same gene or protein use similar words. To validate this hypothesis, we built a dictionary- and rule-based system to mine Medline for references to genes and proteins, and used a Bayesian metric for scoring the relevance of each reference assignment. RESULTS: We analyzed the entire set of Medline records from 1966 to late 2001, and scored each gene and protein reference using a Bayesian estimated probability (EP) based on word frequency in a training set of 137837 known assignments from 30594 articles to 36197 gene and protein symbols. Two test sets of 148 and 150 randomly chosen assignments, respectively, were hand-validated and categorized as either good or bad. The distributions of EP values, when plotted on a log-scale histogram, are shown to markedly differ between good and bad assignments. Using EP values, recall was 100% at 61% precision (EP = 2 x 10^-5), 63% at 88% precision (EP = 0.008), and 10% at 100% precision (EP = 0.1). These results show that Medline entries discussing the same gene or protein have similar word usage, and that our method of assessing this similarity using EP values is valid, and enables an EP cutoff value to be determined that accurately and reproducibly balances precision and recall, allowing automated analysis of literature mining results.


Subjects
Bayes Theorem , Genes , Information Storage and Retrieval/methods , MEDLINE , Natural Language Processing , Proteins , Abstracting and Indexing/methods , Algorithms , Database Management Systems , Dictionaries as Topic , False Negative Reactions , False Positive Reactions , Models, Statistical , National Library of Medicine (U.S.) , Pattern Recognition, Automated , Subject Headings , United States
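
The precision and recall tradeoff reported for different EP cutoffs can be illustrated with a short sketch. It assumes that an assignment is kept when its EP is at or above the cutoff and that each assignment carries a hand-validated good or bad label. It does not reproduce how EP itself is computed. The function name and the scored assignments are hypothetical and are not the study's test sets.

```python
def precision_recall_at_cutoff(scored, cutoff):
    """Compute (precision, recall) when keeping assignments with EP >= cutoff.

    scored: list of (ep_value, is_good) pairs, where is_good is the manual label.
    """
    kept = [is_good for ep, is_good in scored if ep >= cutoff]
    total_good = sum(1 for _, is_good in scored if is_good)
    true_positives = sum(kept)
    # Convention chosen here: precision is 1.0 when nothing is kept.
    precision = true_positives / len(kept) if kept else 1.0
    recall = true_positives / total_good if total_good else 0.0
    return precision, recall


# Hypothetical scored assignments (illustration only).
assignments = [
    (0.2, True),
    (0.05, True),
    (0.01, False),
    (0.009, True),
    (0.0001, False),
    (0.00003, True),
]

loose = precision_recall_at_cutoff(assignments, 2e-5)
middle = precision_recall_at_cutoff(assignments, 0.008)
strict = precision_recall_at_cutoff(assignments, 0.1)
```

Raising the cutoff trades recall for precision, the same pattern the abstract reports across its three EP thresholds.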