1.
Acta Biotheor ; 62(3): 405-15, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25107274

ABSTRACT

Rift Valley fever (RVF), which first appeared in Kenya in 1912, is an anthropozoonosis widespread in tropical areas. In Senegal, its impact is particularly felt in the Ferlo area, where many ponds are shared by humans, cattle and vectors. As part of the studies on the environmental factors that favour its emergence and propagation, this paper focuses on the decision-making process needed to evaluate impacts and interactions and to make RVF monitoring easier. It proposes a model based on data mining techniques and intended for domain experts. The model integrates all the relevant data and the results of analyses of the characteristics of the surrounding ponds. This approach helps reveal the relationship between environmental factors and RVF transmission vectors for the purpose of spatiotemporal epidemiological monitoring.


Subject(s)
Decision Making , Rift Valley Fever/epidemiology , Humans , Theoretical Models , Rift Valley Fever/transmission , Senegal/epidemiology
2.
Sci Data ; 10(1): 818, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37993460

ABSTRACT

Land artificialization is a serious societal problem. Urban planning and natural risk management aim to mitigate it. In France, these practices rely on Local Land Plans (PLU - Plan Local d'Urbanisme) and Natural Risk Prevention Plans (PPRn - Plan de Prévention des Risques naturels), which contain land use rules. To facilitate automatic extraction of the rules, we manually annotated a number of these documents concerning Montpellier, a rapidly evolving agglomeration exposed to natural risks. We defined a format for labeled examples in which each entry includes a title and subtitle. In addition, we proposed a hierarchical representation of class labels to generalize the use of our corpus. Our corpus, consisting of 1934 textual segments, each labeled with one of four classes (Verifiable, Non-verifiable, Informative and Not pertinent), is the first French-language corpus in the fields of urban planning and natural risk management. Along with presenting the corpus, we tested a state-of-the-art approach for text classification to demonstrate its usability for automatic rule extraction.

3.
Data Brief ; 46: 108870, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36687146

ABSTRACT

This paper presents an annotated dataset used in the MOOD Antimicrobial Resistance (AMR) hackathon, hosted in Montpellier in June 2022. The dataset consists of unstructured text from news items, scientific publications and national or international reports, collected from four event-based surveillance (EBS) systems: ProMED, PADI-web, HealthMap and MedISys. The data were annotated by relevance for epidemic intelligence (EI) purposes with the help of AMR experts and an annotation guideline. The extracted data were intended to include relevant events on the emergence and spread of AMR, such as reports on AMR trends, discovery of new drug-bug resistances, or new AMR genes in human, animal or environmental reservoirs. This dataset can be used to train or evaluate classification approaches that automatically identify written text on AMR events across the different reservoirs and sectors of One Health (i.e. human, animal, food and environmental sources such as soil and waste water) in unstructured data (e.g. news, tweets) and classify these events by relevance for EI purposes.

4.
Data Brief ; 43: 108317, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35692611

ABSTRACT

This dataset is composed of spatial (e.g. location) and thematic (e.g. diseases, symptoms, virus) entities concerning avian influenza in English-language social media text. It was created from three corpora. The first includes 10 transcriptions of YouTube videos and 70 tweets, all manually annotated. The second is composed of the same textual data, automatically annotated with Named Entity Recognition (NER) tools. These two corpora were built to evaluate NER tools before applying them to a bigger corpus. The third corpus is composed of 100 YouTube transcriptions automatically annotated with NER tools. The aim of the annotation task is to recognize spatial information, such as the names of cities, and epidemiological information, such as the names of diseases. An annotation guideline is provided to ensure unified annotation and to help the annotators. This dataset can be used to train or evaluate Natural Language Processing (NLP) approaches such as specialized entity recognition.
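
Comparing the automatically annotated corpus against the manually annotated one, as described above, amounts to scoring predicted entities against gold entities. The following is a minimal sketch of that comparison, assuming exact-match scoring on character offsets and entity type; the `Entity` structure, the `evaluate_ner` function and the sample annotations are hypothetical and are not taken from the dataset or its guideline.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    """One annotated span: character offsets plus an entity type."""
    start: int
    end: int
    label: str


def evaluate_ner(gold: set, predicted: set) -> dict:
    """Exact-match precision, recall and F1 between two sets of entities."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical annotations for one short text (illustrative only).
gold = {
    Entity(0, 15, "DISEASE"),
    Entity(28, 34, "LOCATION"),
    Entity(40, 44, "VIRUS"),
}
predicted = {
    Entity(0, 15, "DISEASE"),
    Entity(28, 34, "LOCATION"),
    Entity(50, 55, "LOCATION"),  # spurious span produced by the tool
}

scores = evaluate_ner(gold, predicted)
print(scores)
```

Exact matching is the strictest choice; a relaxed variant that accepts overlapping spans may be more appropriate for noisy social media text.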

5.
J Biomed Inform ; 44 Suppl 1: S12-S16, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21397039

ABSTRACT

BACKGROUND: The aim of this study was to develop an original method to extract sets of relevant molecular biomarkers (gene sequences) that can be used for class prediction and included in prognostic and predictive tools. MATERIALS AND METHODS: The method is based on sequential patterns used as features for class prediction. We applied it to classify breast cancer tumors according to their histological grade. RESULTS: We obtained very good recall and precision for grade 1 and grade 3 tumors, but, like other authors, our results were less satisfactory for grade 2 tumors. CONCLUSIONS: We demonstrated the value of sequential patterns for class prediction from microarray data, and we now have the material to use them for prognostic and predictive applications.
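
The idea of using sequential patterns as features for class prediction can be illustrated with a small sketch. It assumes that each sample is represented as an ordered list of gene identifiers and that a pattern matches when its items appear in the same order, not necessarily contiguously. The abstract does not say how sequences are built or which classifier is used, so the gene names, patterns and function names below are hypothetical, and patterns are simplified to single items rather than itemsets.

```python
def contains_pattern(sequence, pattern):
    """True if `pattern` occurs in `sequence` as an ordered subsequence."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until it finds `item`,
    # so later items can only match later positions.
    return all(item in remaining for item in pattern)


def pattern_features(sequence, patterns):
    """Binary feature vector: one entry per pattern, 1 if the pattern matches."""
    return [int(contains_pattern(sequence, pattern)) for pattern in patterns]


# Hypothetical mined patterns and per-sample gene orderings (illustrative only).
patterns = [("G1", "G3"), ("G2", "G1"), ("G4", "G2", "G3")]
samples = {
    "tumor_a": ["G1", "G2", "G3", "G4"],
    "tumor_b": ["G4", "G2", "G1", "G3"],
}

features = {name: pattern_features(seq, patterns) for name, seq in samples.items()}
print(features)
```

The resulting binary vectors can then be passed to any standard classifier to predict the class label, here the histological grade.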


Subject(s)
Breast Neoplasms/pathology , Data Mining/methods , Oligonucleotide Array Sequence Analysis/methods , Breast Neoplasms/genetics , Female , Gene Expression Profiling , Humans , Neoplasm Staging
6.
Stud Health Technol Inform ; 169: 629-33, 2011.
Article in English | MEDLINE | ID: mdl-21893824

ABSTRACT

The epidemiology of dengue fever in French Guiana is marked by a combination of permanent transmission of the virus throughout the country and the occurrence of regular epidemics. Since 2006, a multi-source surveillance system has been in place to monitor dengue fever patterns, to improve early detection of outbreaks and to provide better information to health authorities, in order to guide and evaluate prevention activities and control measures. This report illustrates the validity and the performance of the system. We describe the experience gained with this surveillance system and outline the remaining challenges. Future work will involve using other data sources, such as environmental factors, to improve knowledge of virus transmission mechanisms and to determine how to use them for outbreak prediction.


Subject(s)
Emerging Communicable Diseases/epidemiology , Dengue/epidemiology , Dengue/therapy , Disease Outbreaks/prevention & control , Medical Informatics/methods , Public Health Informatics/methods , Algorithms , Communicable Disease Control , Data Mining , Disease Notification , French Guiana , Hospitalization , Humans , Statistical Models , Population Surveillance/methods , Software
7.
Health Inf Sci Syst ; 9(1): 29, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34276970

ABSTRACT

Here, we introduce ITEXT-BIO, an intelligent process for biomedical domain terminology extraction from textual documents and subsequent analysis. The proposed methodology consists of two complementary approaches: free and driven term extraction. The first is based on term extraction with statistical measures, while the second considers morphosyntactic variation rules to extract term variants from the corpus. The combination of these two term extraction and analysis strategies is the keystone of ITEXT-BIO. The strategies enable term extraction and analysis either from a single corpus (intra-corpus) or across several corpora (inter-corpus). We assessed the two approaches, the corpus or corpora to be analysed and the type of statistical measures used. Our experimental findings revealed that the proposed methodology could be used: (1) to efficiently extract representative, discriminant and new terms from a given corpus or corpora, and (2) to provide quantitative and qualitative analyses of these terms with regard to the study domain.
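
To make "term extraction with statistical measures" concrete, here is a minimal sketch that ranks single-word terms by TF-IDF. TF-IDF is used only as a familiar stand-in; the abstract does not name the measures ITEXT-BIO relies on, and the function name `tf_idf_scores` and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(documents):
    """Return one dict per document mapping each term to its TF-IDF weight."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus of pre-tokenized documents (illustrative only).
corpus = [
    "dengue virus transmission in french guiana".split(),
    "rift valley fever virus transmission in senegal".split(),
    "avian influenza virus outbreak".split(),
]

scores = tf_idf_scores(corpus)
top_terms = sorted(scores[0], key=scores[0].get, reverse=True)[:3]
print(top_terms)
```

Terms that occur in every document, such as "virus" here, receive a weight of zero, which is the sense in which such a measure favours discriminant terms.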

8.
Stud Health Technol Inform ; 160(Pt 2): 1314-8, 2010.
Article in English | MEDLINE | ID: mdl-20841897

ABSTRACT

Analyzing microarray data is still a great challenge since existing methods produce huge amounts of useless results. We propose a new method called NoDisco for discovering novelties in gene sequences obtained by applying data-mining techniques to microarray data. METHOD: We identify popular genes, which are often cited in the literature, and innovative genes, which are linked to the popular genes in the sequences but are not mentioned in the literature. We also identify popular and innovative sequences containing these genes. Biologists can thus select interesting sequences from the two sets and obtain the k-best documents. RESULTS: We show the efficiency of this method by applying it to real data used to decipher the mechanisms underlying Alzheimer disease. CONCLUSION: The first selection of sequences based on popularity and innovation helps experts focus on relevant sequences, while the top-k documents help them understand the sequences.
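
The distinction between popular and innovative genes can be sketched as follows. This is an illustration of the definitions given in the abstract, not the NoDisco implementation: the citation threshold, the function name `classify_genes`, the gene identifiers and the literature counts are all hypothetical.

```python
def classify_genes(sequences, literature_counts, popularity_threshold=10):
    """Split genes into 'popular' and 'innovative' sets.

    popular:    cited at least `popularity_threshold` times (assumed cutoff).
    innovative: never cited, but found in a sequence with a popular gene.
    """
    popular = {
        gene
        for sequence in sequences
        for gene in sequence
        if literature_counts.get(gene, 0) >= popularity_threshold
    }
    innovative = set()
    for sequence in sequences:
        if popular.intersection(sequence):
            innovative.update(
                gene for gene in sequence if literature_counts.get(gene, 0) == 0
            )
    return popular, innovative


# Hypothetical mined sequences and citation counts (illustrative only).
sequences = [
    ["gene_a", "gene_x"],
    ["gene_b", "gene_c"],
    ["gene_y", "gene_c"],
]
literature_counts = {"gene_a": 120, "gene_b": 45, "gene_c": 3}

popular, innovative = classify_genes(sequences, literature_counts)
print(popular, innovative)
```

In this toy input, `gene_y` is never cited but is not linked to any popular gene, so it is not reported as innovative.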


Subject(s)
Alzheimer Disease/genetics , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Data Mining/methods , Humans
9.
Stud Health Technol Inform ; 150: 767-71, 2009.
Article in English | MEDLINE | ID: mdl-19745414

ABSTRACT

Transcriptomic technologies are promising tools for identifying new genes involved in cerebral ageing or in neurodegenerative diseases such as Alzheimer's disease. These technologies produce massive amounts of biological data, which so far have been extremely difficult to exploit. In this context, we propose GeneMining, a multidisciplinary methodology that aims to develop new strategies for analysing such data and to design interactive tools that help biologists identify, visualize and interpret brain ageing signatures. To address the specific problem of discovering brain ageing signatures, we combine and apply existing tools, with emphasis on a new, efficient data mining method based on sequential patterns.


Subject(s)
Aging/genetics , Brain/physiology , Gene Expression Profiling , Base Sequence , Computational Biology , Genomics , Humans
10.
J Am Med Inform Assoc ; 21(e2): e232-40, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24549761

ABSTRACT

OBJECTIVE: To identify local meteorological drivers of dengue fever in French Guiana, we applied an original data mining method to the available epidemiological and climatic data. Through this work, we also assessed the contribution of the data mining method to the understanding of factors associated with the dissemination of infectious diseases and their spatiotemporal spread. METHODS: We applied contextual sequential pattern extraction techniques to epidemiological and meteorological data to identify the most significant climatic factors for dengue fever, and we investigated the relevance of the extracted patterns for the early warning of dengue outbreaks in French Guiana. RESULTS: The maximum temperature, minimum relative humidity, global brilliance, and cumulative rainfall were identified as determinants of dengue outbreaks, and the precise intervals of their values and variations were quantified according to the epidemiologic context. The strongest significant correlations were observed between dengue incidence and meteorological drivers after a 4-6-week lag. DISCUSSION: We demonstrated the use of contextual sequential patterns to better understand the determinants of the spatiotemporal spread of dengue fever in French Guiana. Future work should integrate additional variables and explore the notion of neighborhood for extracting sequential patterns. CONCLUSIONS: Dengue fever remains a major public health issue in French Guiana. The development of new methods to identify such specific characteristics becomes crucial in order to better understand and control spatiotemporal transmission.
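
The reported 4-6-week lag between meteorological drivers and dengue incidence suggests a simple check that can be sketched with a lag scan. The code below computes a plain Pearson correlation at several lags on synthetic weekly series; it is not the contextual sequential pattern method used in the study, and the series, the 5-week delay built into them and the function names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(driver, incidence, lag):
    """Correlate driver[t] with incidence[t + lag] (lag in weeks, lag >= 0)."""
    if lag == 0:
        return pearson(driver, incidence)
    return pearson(driver[:-lag], incidence[lag:])


# Synthetic weekly rainfall; incidence is built to follow it 5 weeks later.
rainfall = [10, 12, 30, 45, 20, 8, 5, 40, 18, 25,
            60, 11, 35, 7, 22, 13, 48, 12, 9, 30]
true_lag = 5
incidence = [0] * true_lag + [2 * r + 3 for r in rainfall[:-true_lag]]

best_lag = max(range(0, 9), key=lambda lag: lagged_correlation(rainfall, incidence, lag))
print(best_lag, lagged_correlation(rainfall, incidence, best_lag))
```

On real surveillance data the correlation at each lag would also need a significance test, and autocorrelation in both series should be accounted for before interpreting the peak.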


Subject(s)
Climate , Dengue/epidemiology , Epidemics/statistics & numerical data , Data Mining , French Guiana , Humans , Incidence