ABSTRACT
BACKGROUND: Semantic approaches such as concept-based information retrieval rely on a corpus in which resources are indexed by concepts belonging to a domain ontology. In order to keep such applications up-to-date, new entities need to be frequently annotated to enrich the corpus. However, this task is time-consuming and requires a high-level of expertise in both the domain and the related ontology. Different strategies have thus been proposed to ease this indexing process, each one taking advantage from the features of the document. RESULTS: In this paper we present USI (User-oriented Semantic Indexer), a fast and intuitive method for indexing tasks. We introduce a solution to suggest a conceptual annotation for new entities based on related already indexed documents. Our results, compared to those obtained by previous authors using the MeSH thesaurus and a dataset of biomedical papers, show that the method surpasses text-specific methods in terms of both quality and speed. Evaluations are done via usual metrics and semantic similarity. CONCLUSIONS: By only relying on neighbor documents, the User-oriented Semantic Indexer does not need a representative learning set. Yet, it provides better results than the other approaches by giving a consistent annotation scored with a global criterion - instead of one score per concept.
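The neighbor-based suggestion described above can be sketched as follows. This is a minimal illustration, not the actual USI algorithm: the concept names, the toy pairwise similarities, and the exhaustive search over fixed-size candidate sets are hypothetical stand-ins for an ontology-based measure over MeSH.

```python
from itertools import combinations

# Toy concept-concept similarities (symmetric); in practice these would come
# from an ontology-based measure (e.g. over the MeSH hierarchy).
SIM = {
    frozenset({"neoplasms", "carcinoma"}): 0.8,
    frozenset({"neoplasms", "therapy"}): 0.3,
    frozenset({"carcinoma", "therapy"}): 0.35,
}

def sim(a, b):
    return 1.0 if a == b else SIM.get(frozenset({a, b}), 0.0)

def group_sim(concepts_a, concepts_b):
    """Average best-match similarity between two annotation sets."""
    if not concepts_a or not concepts_b:
        return 0.0
    best = [max(sim(a, b) for b in concepts_b) for a in concepts_a]
    return sum(best) / len(best)

def suggest(neighbor_annotations, size=2):
    """Pick the candidate annotation set maximizing a single global score:
    the mean group similarity to every neighbor's annotation."""
    pool = sorted({c for ann in neighbor_annotations for c in ann})
    def score(cand):
        return (sum(group_sim(cand, ann) for ann in neighbor_annotations)
                / len(neighbor_annotations))
    return max(combinations(pool, size), key=score)

neighbors = [{"neoplasms", "therapy"}, {"carcinoma", "therapy"}]
suggestion = suggest(neighbors)
```

Note the single global criterion: candidate sets are ranked as a whole, rather than scoring each concept independently.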
Subjects
Abstracting and Indexing, Algorithms, Information Storage and Retrieval, Natural Language Processing, Semantics, User-Computer Interface, Humans, Medical Subject Headings, Automated Pattern Recognition, Controlled Vocabulary

ABSTRACT
UNLABELLED: The semantic measures library and toolkit are robust, open-source and easy-to-use software solutions dedicated to semantic measures. They can be used for large-scale computation and analysis of semantic similarities between terms/concepts defined in terminologies and ontologies. The comparison of entities (e.g. genes) annotated by concepts is also supported. A large collection of measures is available. Not limited to a specific application context, the library and the toolkit can be used with various controlled vocabularies and ontology specifications (e.g. Open Biomedical Ontology, Resource Description Framework). The project targets both designers and practitioners of semantic measures, providing a Java library as well as a command-line tool that can be used on personal computers or computer clusters. AVAILABILITY AND IMPLEMENTATION: Downloads, documentation, tutorials, evaluation and support are available at http://www.semantic-measures-library.org.
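To illustrate the kind of measure such a library computes, here is a minimal sketch of Lin's information-content similarity over a toy is-a taxonomy. The concepts and usage counts are invented; the actual library supports many more measures and real ontology formats.

```python
import math

# Toy is-a taxonomy: child -> parents
PARENTS = {
    "disease": [],
    "cancer": ["disease"],
    "infection": ["disease"],
    "leukemia": ["cancer"],
}

# Toy usage counts, from which information content (IC) is derived
COUNTS = {"disease": 8, "cancer": 4, "infection": 3, "leukemia": 1}
TOTAL = sum(COUNTS.values())

def ancestors(c):
    """Reflexive transitive closure of the is-a relation."""
    seen, stack = {c}, [c]
    while stack:
        for p in PARENTS[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def ic(c):
    # A concept's probability mass includes its descendants' counts.
    mass = sum(n for d, n in COUNTS.items() if c in ancestors(d))
    return -math.log(mass / TOTAL)

def lin(a, b):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(a) + IC(b))."""
    common = ancestors(a) & ancestors(b)
    return 2 * max(ic(c) for c in common) / (ic(a) + ic(b))
```

As expected, siblings under an informative ancestor score higher than concepts whose only shared ancestor is the root.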
Subjects
Biological Ontologies, Software, Semantics, Controlled Vocabulary

ABSTRACT
Ontologies are widely adopted in the biomedical domain to characterize various resources (e.g. diseases, drugs, scientific publications) with non-ambiguous meanings. By exploiting the structured knowledge that ontologies provide, a plethora of ad hoc and domain-specific semantic similarity measures have been defined over recent years. Nevertheless, some critical questions remain: which measure should be defined or chosen for a concrete application? Are some of the a priori different measures in fact equivalent? In order to shed some light on these questions, we perform an in-depth analysis of existing ontology-based measures to identify the core elements of semantic similarity assessment. As a result, this paper presents a unifying framework that aims to improve the understanding of semantic measures, to highlight their equivalences and to propose bridges between their theoretical bases. By demonstrating that groups of measures are just particular instantiations of parameterized functions, we unify a large number of state-of-the-art semantic similarity measures through common expressions. The application of the proposed framework and its practical usefulness are underlined by an empirical analysis of hundreds of semantic measures in a biomedical context.
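The idea that classical measures are instantiations of a parameterized function can be illustrated with Tversky's ratio model, whose parameter choices recover the Jaccard and Dice set similarities. This is a textbook example, not the paper's full framework:

```python
def tversky(A, B, alpha, beta):
    """Parameterized set-based similarity (Tversky's ratio model).
    Particular (alpha, beta) choices yield classical measures."""
    common = len(A & B)
    return common / (common + alpha * len(A - B) + beta * len(B - A))

def jaccard(A, B):
    return tversky(A, B, 1.0, 1.0)   # alpha = beta = 1

def dice(A, B):
    return tversky(A, B, 0.5, 0.5)   # alpha = beta = 1/2
```

Both classical measures thus reduce to one expression with two parameters, which is the unification pattern the paper generalizes to ontology-based measures.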
Subjects
Medical Informatics/methods, Semantics, Algorithms, Humans, Theoretical Models, Natural Language Processing, Reproducibility of Results, Software, Systematized Nomenclature of Medicine, Terminology as Topic, Controlled Vocabulary

ABSTRACT
Introduction: Dementia is a neurological disorder associated with aging that can cause a loss of cognitive functions, impacting daily life. Alzheimer's disease (AD) is the most common cause of dementia, accounting for 50-70% of cases, while frontotemporal dementia (FTD) affects social skills and personality. Electroencephalography (EEG) provides an effective tool to study the effects of AD on the brain. Methods: In this study, we propose to use shallow neural networks applied to two sets of features: spectral-temporal features and functional connectivity, the latter estimated with four methods. We compare three supervised machine learning techniques to the CNN models for classifying EEG signals from AD/FTD and control cases. We also evaluate different measures of functional connectivity in common EEG frequency bands, considering multiple thresholds. Results and discussion: The shallow CNN-based models achieved the highest accuracy, 94.54%, with AEC on the test dataset when all connections were considered, outperforming conventional methods and potentially providing an additional tool for early dementia diagnosis.
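As a rough illustration of a thresholded functional connectivity feature, the sketch below builds a channel-by-channel matrix from Pearson correlation. Note this is a simpler stand-in for measures such as the AEC used in the study, and the threshold value is arbitrary:

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length signals."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def connectivity(channels, threshold=None):
    """Symmetric channel-by-channel connectivity matrix; optionally
    zero out edges whose |r| falls below the threshold."""
    n = len(channels)
    mat = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(channels[i], channels[j])
            if threshold is not None and abs(r) < threshold:
                r = 0.0
            mat[i][j] = mat[j][i] = r
    return mat
```

The flattened matrix (or its thresholded edges) is the kind of feature vector a shallow classifier can then consume.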
ABSTRACT
BACKGROUND: Because of the increasing number of electronic resources, designing efficient tools to retrieve and exploit them is a major challenge. Some improvements have been offered by semantic Web technologies and applications based on domain ontologies. In life science, for instance, the Gene Ontology is widely exploited in genomic applications and the Medical Subject Headings thesaurus is the basis of the biomedical publication indexing and information retrieval process offered by PubMed. However, current search engines suffer from two main drawbacks: there is limited user interaction with the list of retrieved resources, and no explanation of their adequacy to the query is provided. Users may thus be confused by the selection and have no idea how to adapt their queries so that the results match their expectations. RESULTS: This paper describes an information retrieval system that relies on a domain ontology to widen the set of relevant documents retrieved and that uses a graphical rendering of query results to favor user interaction. Semantic proximities between ontology concepts and aggregating models are used to assess document adequacy with respect to a query. The selection of documents is displayed in a semantic map that provides graphical indications of the extent to which they match the user's query; this man/machine interface favors a more interactive and iterative exploration of the data corpus by facilitating query concept weighting and visual explanation. We illustrate the benefit of using this information retrieval system on two case studies, one of which aims at collecting human genes related to transcription factors involved in the hemopoiesis pathway. CONCLUSIONS: The ontology-based information retrieval system described in this paper (OBIRS) is freely available at: http://www.ontotoolkit.mines-ales.fr/ObirsClient/. This environment is a first step towards a user-centred application in which the system highlights relevant information to support decision making.
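The adequacy scoring described above could be sketched as a weighted aggregation of each query concept's best match within a document's annotation. This is a hypothetical simplification of the aggregating models in OBIRS; `sim` stands in for any ontology-based semantic proximity:

```python
def document_score(query, doc_concepts, sim):
    """Score a document against a weighted query.

    query        -- list of (concept, weight) pairs; weights let users
                    emphasize some query concepts over others
    doc_concepts -- set of concepts annotating the document
    sim          -- concept-concept similarity function in [0, 1]
    """
    total_weight = sum(w for _, w in query)
    matched = sum(w * max((sim(q, d) for d in doc_concepts), default=0.0)
                  for q, w in query)
    return matched / total_weight

# Toy usage with exact-match similarity (a real system would use an
# ontology-based proximity so near-miss concepts still contribute).
exact = lambda a, b: 1.0 if a == b else 0.0
score = document_score([("hemopoiesis", 2.0), ("transcription factor", 1.0)],
                       {"hemopoiesis"}, exact)
```

Because each query concept reports its own best match, such a score can also be decomposed per concept, which is what makes a visual explanation of the ranking possible.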
Subjects
Biological Ontologies, Computational Biology/methods, Information Storage and Retrieval/methods, Internet, User-Computer Interface, Hematopoiesis/genetics, Humans, Medical Subject Headings, PubMed, Semantics, Software, Transcription Factors/metabolism

ABSTRACT
The object of this study is to put forward uncertainty modeling associated with the imputation of missing time-series data in a predictive context. We propose three imputation methods, each associated with an uncertainty model. These methods are evaluated on a COVID-19 dataset from which some values have been randomly removed. The dataset contains the numbers of daily COVID-19 confirmed diagnoses ("new cases") and daily deaths ("new deaths") recorded from the start of the pandemic up to July 2021. The considered task is to predict the number of new deaths 7 days in advance. The more values are missing, the greater the impact of imputation on predictive performance. The Evidential K-Nearest Neighbors (EKNN) algorithm is used for its ability to take label uncertainty into account. Experiments are provided to measure the benefits of the label uncertainty models. Results show the positive impact of uncertainty models on imputation performance, especially in a noisy context where the number of missing values is high.
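One of the simplest imputation schemes of this kind, with a crude uncertainty score attached to each filled value, might look as follows. This is a sketch under assumed conventions (`None` marks a missing value, and the neighbors' spread serves as the uncertainty); the paper's actual methods and evidential models are richer:

```python
def knn_impute(series, k=3):
    """Fill each missing value (None) with the mean of its k temporally
    nearest observed neighbors, and attach a simple uncertainty score
    (the neighbors' spread) to each imputed index."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    filled, uncertainty = list(series), {}
    for i, v in enumerate(series):
        if v is None:
            nearest = sorted(observed, key=lambda p: abs(p[0] - i))[:k]
            values = [nv for _, nv in nearest]
            filled[i] = sum(values) / len(values)
            uncertainty[i] = max(values) - min(values)
    return filled, uncertainty

# Toy daily series with two gaps
filled, unc = knn_impute([10, None, 14, 16, None, 20], k=2)
```

Downstream, an evidential classifier such as EKNN can discount imputed labels in proportion to their uncertainty instead of trusting them fully.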
ABSTRACT
In recent years, neuroscientists have been interested in the development of brain-computer interface (BCI) devices. Patients with motor disorders may benefit from BCIs as a means of communication and for the restoration of motor functions. Electroencephalography (EEG) is one of the most widely used techniques for evaluating neuronal activity. In many computer vision applications, deep neural networks (DNNs) show significant advantages. Toward the practical use of DNNs, we present here a shallow neural network that uses mainly two convolutional neural network (CNN) layers, has relatively few parameters, and quickly learns spectral-temporal features from EEG. We compared this model to three other neural network models of different depths on a mental arithmetic task performed in the eyes-closed state, adapted for patients suffering from motor disorders and a decline in visual functions. Experimental results showed that the shallow CNN model outperformed all the other models and achieved the highest classification accuracy of 90.68%. It is also more robust to cross-subject classification issues: the standard deviation of its accuracy is only 3%, compared with 15.6% for the conventional method.
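The spectral side of such spectral-temporal features can be illustrated by computing band power directly from the discrete Fourier transform. This is a minimal stand-in (a naive O(n^2) transform with an assumed sampling rate `fs`), not the preprocessing used in the study:

```python
import cmath
import math

def band_power(signal, fs, low, high):
    """Power of the signal in the [low, high] Hz band, computed with a
    direct DFT; fine for short illustrative windows."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if low <= freq <= high:
            coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            power += abs(coeff) ** 2 / n
    return power

# A pure 10 Hz sine sampled at 100 Hz concentrates its power in the
# alpha band (8-12 Hz) and leaves the beta band (20-30 Hz) near zero.
fs = 100
sine = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
```

Band powers computed this way per channel and per time window form a simple spectral-temporal feature map a CNN could be trained on.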
Subjects
Brain-Computer Interfaces, Machine Learning, Algorithms, Electroencephalography/methods, Humans, Neural Networks (Computer)

ABSTRACT
The emergence of the first Fitness-Fatigue impulse-response models (FFMs) has allowed the sport science community to investigate the relationships between training effects and performance. In these models, athletic performance is described by first-order transfer functions that represent the antagonistic Fitness and Fatigue responses to training. On this basis, the mathematical structure allows a precise determination of the optimal sequence of training doses that would yield the greatest athletic performance at a given time point. Despite several improvements, and although FFMs are still widely used nowadays, their efficiency in describing as well as predicting sport performance remains limited. The main causes may be attributed to the simplification of the physiological processes involved in exercise on which the models rely, as well as to a univariate consideration of the factors responsible for athletic performance. In this context, machine-learning perspectives appear valuable for sport performance modelling. The weaknesses of FFMs may be overcome by embedding physiological representations of training effects into non-linear and multivariate learning algorithms. Thus, ensemble learning methods may benefit from combining individual responses based on physiological knowledge within supervised machine-learning algorithms for a better prediction of athletic performance. In conclusion, the machine-learning approach is not an alternative to FFMs, but rather a way to take advantage of models based on physiological assumptions within powerful machine-learning models.
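The classical two-component FFM is simple enough to sketch directly. The parameter values below (gains k1, k2 and time constants tau1, tau2) are illustrative defaults, not fitted values:

```python
import math

def banister(loads, p_star=100.0, k1=1.0, tau1=42.0, k2=2.0, tau2=7.0):
    """Classical two-component Fitness-Fatigue model: performance on day t is
    baseline plus fitness minus fatigue, each an exponentially decaying
    sum of past training loads:
        p(t) = p* + k1 * sum_{s<t} w(s) exp(-(t-s)/tau1)
                  - k2 * sum_{s<t} w(s) exp(-(t-s)/tau2)
    """
    performance = []
    for t in range(1, len(loads) + 1):
        fitness = sum(w * math.exp(-(t - s) / tau1)
                      for s, w in enumerate(loads, start=1) if s < t)
        fatigue = sum(w * math.exp(-(t - s) / tau2)
                      for s, w in enumerate(loads, start=1) if s < t)
        performance.append(p_star + k1 * fitness - k2 * fatigue)
    return performance

# A single training dose followed by rest: performance first dips below
# baseline (fatigue dominates), then rises above it as fatigue decays
# faster than fitness (tau2 < tau1).
curve = banister([1] + [0] * 20)
```

This dip-then-supercompensation shape is exactly what the antagonistic first-order transfer functions encode, and what makes taper optimization tractable in the FFM framework.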
ABSTRACT
This paper presents a model-based diagnostic method designed in the context of process supervision. It has been inspired by both artificial intelligence and control theory. AI contributes tools for qualitative modeling, including causal modeling, whose aim is to split a complex process into elementary submodels. Control theory, within the framework of fault detection and isolation (FDI), provides numerical models for generating and testing residuals, and for taking into account inaccuracies in the model, unknown disturbances and noise. Consistency-based reasoning provides a logical foundation for diagnostic reasoning and clarifies fundamental assumptions, such as single fault and exoneration. The diagnostic method presented in the paper benefits from the advantages of all these approaches. Causal modeling enables the method to focus on sufficient relations for fault isolation, which avoids combinatorial explosion. Moreover, it allows the model to be modified easily without changing any aspect of the diagnostic algorithm. The numerical submodels that are used to detect inconsistency benefit from the precise quantitative analysis of the FDI approach. The FDI models are studied in order to link this method with DX component-oriented reasoning. The recursive on-line use of this algorithm is explained and the concept of local exoneration is introduced.
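The residual generation and testing step can be sketched in its simplest form: a residual is the difference between a measurement and the model's prediction, and a fault is flagged when it exceeds a threshold chosen to absorb model inaccuracy and noise. The model and data here are toy examples, not taken from the paper:

```python
def residuals(model, inputs, measurements):
    """Analytical-redundancy residuals: measured output minus the
    output predicted by the (sub)model."""
    return [m - model(u) for u, m in zip(inputs, measurements)]

def detect(res, threshold):
    """Indices of samples whose residual exceeds the detection threshold,
    i.e. where the measurement is inconsistent with the model."""
    return [i for i, r in enumerate(res) if abs(r) > threshold]

# Toy static model y = 2u, with one faulty measurement at index 2.
model = lambda u: 2 * u
res = residuals(model, [1, 2, 3, 4], [2.1, 3.9, 9.0, 8.05])
faulty = detect(res, threshold=0.5)
```

With causal submodels, each residual involves only a few components, so the pattern of triggered residuals can isolate the fault without enumerating every combination of component states.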
Subjects
Artificial Intelligence, Decision Support Techniques, Computer-Assisted Diagnosis/methods, Equipment Failure Analysis/methods, Theoretical Models, Research Design, Systems Integration, Algorithms, Computer Simulation, Interdisciplinary Communication

ABSTRACT
Two distinct and parallel research communities have been working along the lines of the model-based diagnosis approach: the fault detection and isolation (FDI) community and the diagnosis (DX) community, which have evolved in the fields of automatic control and artificial intelligence, respectively. This paper clarifies and links the concepts and assumptions that underlie the FDI analytical redundancy approach and the DX consistency-based logical approach. A formal framework is proposed in order to compare the two approaches, and a theoretical proof of their equivalence, together with the necessary and sufficient conditions, is provided.