Results 1 - 20 of 465
1.
PLoS One ; 16(9): e0257091, 2021.
Article in English | MEDLINE | ID: mdl-34525115

ABSTRACT

What makes written text appealing? In this registered report protocol, we propose to study the linguistic characteristics of news headline success using a large-scale dataset of field experiments (A/B tests) conducted on the popular website Upworthy comparing multiple headline variants for the same news articles. This unique setup allows us to control for factors that can have crucial confounding effects on headline success. Based on prior literature and a pilot partition of the data, we formulate hypotheses about the linguistic features that are associated with statistically superior headlines. We will test our hypotheses on a much larger partition of the data that will become available after the publication of this registered report protocol. Our results will contribute to resolving competing hypotheses about the linguistic features that affect the success of text and will provide avenues for research into the psychological mechanisms that are activated by those features.


Subjects
Newspapers as Topic; Databases as Topic; Dictionaries as Topic; Internet; Linguistics; Regression Analysis
3.
JAMA Dermatol ; 157(4): 449-455, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33688910

ABSTRACT

Importance: Standard morphological terminology and definitions are vital for identification of lesion types in the clinical trial setting and communication about the condition. For hidradenitis suppurativa (HS), morphological definitions have been proposed by different groups, representing various regions of the world, but no international consensus has been reached regarding such definitions. A lack of globally harmonized terminology and definitions may contribute to poor-quality data collection in clinical trials as well as poor communication among clinicians, investigators, and patients. Objective: To establish and validate consensus definitions for typical morphological findings for HS lesions. Methods: This study was conducted from August 2019 to August 2020. A Delphi study technique was used to assess agreement and then resolve disagreement on HS terminology among international experts. After an initial preparation phase, the process consisted of 3 rounds. In each round, participants reviewed preliminary definitions and rated them as "keep, with no changes," "keep, with changes," or "remove." Eight HS primary lesions, including papule, pustule, nodule, plaque, ulcer, abscess, comedo, and tunnel, were selected because they are most frequently used in HS clinician-reported outcome measures. The initial definitions were based on extant literature, and modifications were made between rounds based on qualitative thematic analysis of open-ended responses or discussion. Consensus was defined as greater than 70% to accept a definition. Consensus stability across rounds was defined as less than 15% change in agreement. Reliability was evaluated using the Gwet agreement coefficient. Validation was based on assessment of face validity and stability across rounds. Results: A total of 31 experts participated. All 8 HS primary lesion definitions achieved greater than 70% consensus by Delphi round 3. Stability was achieved for papule, pustule, and abscess. 
The Gwet agreement coefficient increased from 0.49 (95% CI, 0.26-0.71) in round 1 to 0.78 (95% CI, 0.64-0.92) in round 3. Face validity was supported by expert endorsement to keep terms in survey responses. Previously unmeasured variation among clinicians' definition of tunnels was identified, and consensus was achieved. Conclusions and Relevance: An international group of experts agreed on definitions for morphological features of HS lesions frequently included in HS clinical trials. These international consensus terms and definitions are needed to support consistency of lesion identification and quantification in clinical trials.
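
The two decision rules stated in the abstract (consensus at greater than 70% acceptance, stability at less than 15% change between rounds) can be illustrated with a minimal sketch. The function names and the vote counts are hypothetical, and reading "15% change" as an absolute difference in the proportion agreeing is an assumption; the abstract does not say whether the change is absolute or relative.

```python
def consensus_reached(accept_votes, total_votes, threshold=0.70):
    """Return True if the share of experts accepting a definition exceeds the threshold."""
    return accept_votes / total_votes > threshold


def is_stable(prev_agreement, curr_agreement, max_change=0.15):
    """Return True if agreement moved by less than max_change between two rounds.

    Assumption: the change is measured as an absolute difference in proportions.
    """
    return abs(curr_agreement - prev_agreement) < max_change


# Hypothetical vote counts for a panel of 31 experts (not data from the study).
accepted = consensus_reached(25, 31)   # 25/31 is roughly 0.81
stable = is_stable(0.74, 0.81)         # agreement moved by 0.07
```

Keeping the two checks separate mirrors the abstract, where all eight definitions reached consensus but only three were also stable.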


Subjects
Hidradenitis Suppurativa/diagnosis; Consensus; Delphi Technique; Dictionaries as Topic; Humans; Internationality; Reproducibility of Results; Surveys and Questionnaires; Terminology as Topic
4.
J Psycholinguist Res ; 50(1): 223-230, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33543380

ABSTRACT

A computerized linguistic measure, the Weighted Referential Activity Dictionary (WRAD), was applied to locate nodal turns of speech in psychotherapy, defined here as significant moments of patient emotional communication that are likely to reveal important themes. Two published demonstration sessions conducted by a senior clinician, who made extensive comments on this material, were utilized to illustrate the method. The WRAD, defined in the context of referential process theory, was developed and has been validated as assessing the vividness and immediacy of language. Segments of patient speech (turns of speech) were classified based on WRAD level and sufficient length. The themes of the therapist's clinical comments concerning high WRAD segments were coded using a category system developed for this study, and were compared to themes of comments for the remaining segments. Results showed a significant difference in the therapist's comments between the two groups of segments using Fisher's exact test. In particular, the therapist's comments on the nodal turns showed more focus on the emotional effects of the patient's utterances on him, as well as identification of unexpected disclosures in these utterances. The implications and limitations of this method are discussed.


Subjects
Communication; Computer Simulation; Linguistics; Psychotherapy; Dictionaries as Topic; Female; Humans; Language
5.
Malar J ; 20(1): 53, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33478519

ABSTRACT

Stakeholder engagement is an essential pillar for the development of innovative public health interventions, including genetic approaches for malaria vector control. Scientific terminology is largely lacking in local languages, yet when research activities involve international partnerships, the question of technical jargon and its translation is crucial for effective and meaningful communication with stakeholders. Target Malaria, a not-for-profit research consortium developing innovative genetic approaches to malaria vector control, carried out a linguistic exercise in Mali, Burkina Faso and Uganda to establish the appropriate translation of its key terminology into the local languages of the sites where its teams operate. A review of the literature found no commonly agreed approach to establishing such a glossary of technical terms in the local languages of the field sites where Target Malaria operates. Because of its commitment to the value of co-development, Target Malaria decided to apply this principle to the linguistic work and to use the process as an opportunity to empower communities to take part in the dialogue on innovative vector control. The project worked with linguists from other institutions (public research institutions or private language centres), who developed a first potential glossary in the local language after gaining a better understanding of the project's scientific approach. This initial glossary was then tested during focus groups with community members, which significantly improved the proposed translations by making them more appropriate to the local context and cultural understanding. The stepwise process revealed the complexity and importance of elaborating a common language with communities, as well as the imbrication of language with cultural aspects. 
This exercise demonstrated the strength of a co-development approach with communities and language experts as a way to develop knowledge together and to tailor communication to the audience even in the language used.


Subjects
Anopheles/genetics; Dictionaries as Topic; Genetic Techniques; Malaria/prevention & control; Mosquito Vectors/genetics; Public Health/methods; Stakeholder Participation; Animals; Burkina Faso; Female; Humans; Linguistics; Malaria/parasitology; Male; Mali; Mosquito Control; Mosquito Vectors/parasitology; Uganda
6.
J Psycholinguist Res ; 50(1): 51-64, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33511546

ABSTRACT

Reflecting/Reorganizing (R/R) is one of the three functions described by Bucci (Overview of the referential process: the operation of language within and between people, 2021a) as part of the referential process. The Weighted Referential Activity Dictionary (WRAD) was previously developed to model the Symbolizing function of the referential process. This paper presents the development of the Weighted Reflecting Reorganizing List (WRRL) as a model of the R/R function. The basic premise of this approach is that by rating segments of text rather than individual words, and using a word by word weighting procedure designed for this purpose, it is possible to identify the nature of the language style that is connected with particular degrees of involvement in the psychological process being modeled. Starting with a brief description of the R/R function, an iterative process was applied that resulted in a clear scoring manual for the R/R function. The method of developing the dictionary is described, a study providing validation for the measure is presented, and the nature of the language style used to express the R/R function is discussed. As was described for the WRAD, the language style of the WRRL was found to involve use of particular function words, applicable across a wide range of contents.


Subjects
Dictionaries as Topic; Language; Psychotherapy; Verbal Learning; Humans
7.
J Psycholinguist Res ; 50(1): 143-153, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33484369

ABSTRACT

Over the past two decades, therapeutic alliance research has increasingly focused on understanding the process by which the alliance is ruptured and repaired. This paper is the first to explore how alliance rupture segments from psychotherapy sessions differ from non-rupture segments on key dimensions of the referential process. A sample of 27 psychotherapy sessions was scored using a measure designed to distinguish rupture from non-rupture segments. These segments were then scored for key linguistic dimensions of the referential process. During ruptures, patients manifested a referential process marked by a decrease in emotional engagement, an increase in a measure of distancing, and an increase in negation as compared to non-rupture segments. Therapists showed similar patterns but, in addition, manifested a language pattern suggesting that during ruptures, therapists are attempting to make sense of, and self-disclose, aspects of their inner experience. Implications for research and clinical work are explored.


Subjects
Linguistics; Psychotherapy; Therapeutic Alliance; Adult; Dictionaries as Topic; Emotions; Female; Humans; Male
8.
PLoS One ; 15(9): e0239050, 2020.
Article in English | MEDLINE | ID: mdl-32915905

ABSTRACT

Recent years have seen a growing amount of research effort directed toward what positive media psychologists refer to as self-transcendent emotions, such as awe, admiration, elevation, gratitude, inspiration, and hope. While these emotions are invaluable in promoting greater human connectedness, prosociality, and human flourishing, researchers are constrained in analyzing self-transcendent emotions as expressed in spoken and written language. Drawing upon the word-counting approach of the text analysis paradigm, this project aimed at constructing a dictionary tool, the Self-Transcendent Emotion Dictionary (STED), which can be uploaded into mainstream text analytic software (e.g., LIWC) to identify and analyze self-transcendent emotions in large corpora. This dictionary tool was then refined and validated via three studies, where individual words were first rated with regard to their fit with the proposed construct (Step 1), and then used to analyze essays written to reflect the corresponding construct (Step 2). Finally, the refined dictionary was applied to examine words used in nearly 4,000 human-coded New York Times articles (Step 3). Results indicated that the final dictionary, consisting of 351 lexicons and phrases, exhibits acceptable face and construct validity, and possesses a reasonable level of external validity and applicability. Despite its shortcoming in accounting for the rhetorical techniques ingrained in natural human language, the STED could be instrumental for social scientific inquiry of positive emotions in textual narratives.
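
The word-counting approach mentioned in the abstract can be sketched as follows. This is only an illustration of the general technique: the function name, the tokenizer, and the four-word list are hypothetical and are not taken from the STED, and multi-word phrases, which the STED includes, are not handled here.

```python
import re


def dictionary_rate(text, dictionary):
    """Return the fraction of tokens in `text` that appear in `dictionary`."""
    # Assumption: a simple lowercase tokenizer on letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


# Toy word list for illustration only; these are not actual STED entries.
toy_dictionary = {"awe", "gratitude", "hope", "inspired"}
rate = dictionary_rate("She felt awe and gratitude.", toy_dictionary)
```

Reporting a rate rather than a raw count keeps scores comparable across texts of different lengths, which is the usual convention in word-counting tools.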


Subjects
Dictionaries as Topic; Emotions; Language; Data Mining; Hope; Humans; Narration; Newspapers as Topic; Psycholinguistics; Semantics; Software; Writing
9.
Int J Neural Syst ; 30(8): 2050040, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32727317

ABSTRACT

Machine learning (ML) systems are affected by a pervasive lack of transparency. The eXplainable Artificial Intelligence (XAI) research area addresses this problem and the related issue of explaining the behavior of ML systems in terms that are understandable to human beings. In many XAI approaches, the outputs of ML systems are explained in terms of low-level features of their inputs. However, these approaches leave a substantive explanatory burden with human users, insofar as the latter are required to map low-level properties into more salient and readily understandable parts of the input. To alleviate this cognitive burden, an alternative model-agnostic framework is proposed here. This framework is instantiated to address explanation problems in the context of ML image classification systems, without relying on pixel relevance maps and other low-level features of the input. More specifically, one obtains sets of middle-level properties of classification inputs that are perceptually salient by applying sparse dictionary learning techniques. These middle-level properties are used as building blocks for explanations of image classifications. The resulting explanations are parsimonious, because they rely on a limited set of middle-level image properties. They can also be contrastive, because the set of middle-level image properties can be used to explain why the system advanced the proposed classification over other antagonist classifications. In view of its model-agnostic character, the proposed framework is adaptable to a variety of other ML systems and explanation problems.


Subjects
Image Processing, Computer-Assisted; Machine Learning; Models, Theoretical; Dictionaries as Topic; Humans
10.
PLoS One ; 15(7): e0236798, 2020.
Article in English | MEDLINE | ID: mdl-32730307

ABSTRACT

This study compares how lexical inferencing and dictionary consultation affect L2 vocabulary acquisition. Sixty-one L1 Arabic undergraduates majoring in English language read target words in semi-authentic English reading materials and were either asked to guess their meaning or look it up in a dictionary. A pre- and delayed post-test measured participants' knowledge of target words and overall vocabulary size. The results show a significant and comparable learning effect for both vocabulary learning strategies (VLS), with a higher pre-test vocabulary size related to a larger learning effect for both VLS. In addition, the better participants were at guessing correctly, the better they learned words through inferencing. The results suggest that both VLS are equally effective for our learner group and that learners' overall vocabulary size influences the amount of learning that occurs when using these VLS.


Subjects
Language; Learning/physiology; Multilingualism; Reading; Students/psychology; Vocabulary; Adult; Dictionaries as Topic; Female; Humans; Language Tests; Male; Young Adult
11.
NMR Biomed ; 33(12): e4344, 2020 12.
Article in English | MEDLINE | ID: mdl-32618082

ABSTRACT

PURPOSE: Compressive sensing (CS)-based image reconstruction methods have proposed random undersampling schemes that produce incoherent, noise-like aliasing artifacts, which are easier to remove. The denoising process is critically assisted by imposing sparsity-enforcing priors. Sparsity is known to be induced if the prior is in the form of the Lp (0 ≤ p ≤ 1) norm. CS methods generally use a convex relaxation of these priors such as the L1 norm, which may not exploit the full power of CS. An efficient, discrete optimization formulation is proposed, which works not only on arbitrary Lp-norm priors as some non-convex CS methods do, but also on highly non-convex truncated penalty functions, resulting in a specific type of edge-preserving prior. These advanced features make the minimization problem highly non-convex, and thus call for more sophisticated minimization routines. THEORY AND METHODS: The work combines edge-preserving priors with random undersampling, and solves the resulting optimization using a set of discrete optimization methods called graph cuts. The resulting optimization problem is solved by applying graph cuts iteratively within a dictionary, defined here as an appropriately constructed set of vectors relevant to the brain MRI data used here. RESULTS: Experimental results with in vivo data are presented. CONCLUSION: The proposed algorithm produces better results than regularized SENSE or standard CS for reconstruction of in vivo data.
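
To make the penalty terms in this abstract concrete, here is a minimal sketch of an Lp-type sparsity penalty and a truncated variant. The abstract gives no formulas, so both forms are assumptions: the penalty is taken as the sum of |v|^p (with p = 0 counting nonzero entries), and truncation is taken as capping each term at a constant, which is one common form of truncated penalty. The function names are hypothetical, and the graph-cut solver itself is not shown.

```python
def _term(value, p):
    """Single-coefficient penalty |value|^p, with p = 0 meaning 'is nonzero'."""
    if p == 0:
        return 0.0 if value == 0 else 1.0
    return abs(value) ** p


def lp_penalty(values, p):
    """Sum of |v|^p over all coefficients, for 0 <= p <= 1 (assumed form)."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    return float(sum(_term(v, p) for v in values))


def truncated_penalty(values, p, cap):
    """Like lp_penalty, but each term is capped at `cap` (assumed form).

    Capping stops large differences, such as true edges, from being
    penalized without bound, which is the edge-preserving idea.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    return float(sum(min(_term(v, p), cap) for v in values))


coefficients = [3, 0, -4]
l1_value = lp_penalty(coefficients, 1)                     # 3 + 0 + 4
l0_value = lp_penalty(coefficients, 0)                     # two nonzero entries
capped_value = truncated_penalty(coefficients, 1, 3.5)     # 3 + 0 + 3.5
```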


Subjects
Algorithms; Dictionaries as Topic; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Humans
12.
Muscle Nerve ; 62(1): 10-12, 2020 07.
Article in English | MEDLINE | ID: mdl-32337730

ABSTRACT

Modern neuromuscular electrodiagnosis (EDX) and neuromuscular ultrasound (NMUS) require a universal language for effective communication in clinical practice and research and, in particular, for teaching young colleagues. Therefore, the AANEM and the IFCN have decided to publish a joint glossary, as they feel the need for an updated terminology to support educational activities in neuromuscular EDX and NMUS in all parts of the world. In addition, NMUS has been progressing rapidly in recent years and is now widely used in the diagnosis of disorders of nerve and muscle in conjunction with EDX. This glossary has been developed by experts in the field of neuromuscular EDX and NMUS on behalf of the AANEM and the IFCN and has been agreed upon by electronic communication between January and November 2019. It is based on the glossaries of the AANEM from 2015 and of the IFCN from 1999. The EDX and NMUS terms and the explanatory illustrations have been updated and supplemented where necessary. The result is a comprehensive glossary of terms covering all fields of neuromuscular EDX and NMUS. It serves as a standard reference for clinical practice, education and research worldwide. HIGHLIGHTS: Optimal terminology in neuromuscular electrodiagnosis and ultrasound has been revisited. A team of international experts has revised and expanded a standardized glossary. This list of terms serves as a standard reference for clinical practice, education and research.


Subjects
Dictionaries as Topic; Electrodiagnosis/classification; Neuromuscular Diseases/classification; Neuromuscular Diseases/diagnostic imaging; Societies, Medical/classification; Ultrasonography/classification; Humans; United States
13.
AMA J Ethics ; 22(3): E255-259, 2020 03 01.
Article in English | MEDLINE | ID: mdl-32220274

ABSTRACT

In September 2019, a prominent dictionary recognized "they" as a proper pronoun for nonbinary individuals. This change can be seen as a source of newfound legitimacy for students and trainees self-advocating for nonbinary pronoun recognition in health care practice and training. This article considers one student's experience after coming out as nonbinary and voicing that their pronouns are they/them.


Subjects
Cultural Competency; Delivery of Health Care; Education, Medical; Gender Identity; Language; Social Justice; Transgender Persons; Dictionaries as Topic; Female; Humans; Male; Physician-Patient Relations; Students; Transsexualism
14.
Matern Child Nutr ; 16(3): e12969, 2020 07.
Article in English | MEDLINE | ID: mdl-32032481

ABSTRACT

During the last decade, there have been several publications highlighting the need for consistent terminology in breastfeeding research. Standard terms and definitions are essential for the comparison and interpretation of scientific studies that, in turn, support evidence-based education, consistency of health care, and breastfeeding policy. Mothers commonly report that inconsistent advice contributes to early weaning. A standard language is the fundamental starting point required for the provision of consistent advice. LactaPedia (www.lactapedia.com) is a comprehensive lactation glossary of over 500 terms and definitions created during the development of LactaMap (www.lactamap.com), an online lactation care support system. This paper describes the development of LactaPedia, a website that is accessible free of charge to anyone with access to the Internet. Multiple methodological frameworks were incorporated in LactaPedia's development in order to meet the needs of a glossary to support both consistent health care and scientific research. The resulting LactaPedia methodology is a six-stage process that was developed inductively and includes a framework to guide vetting and extension of its content using public feedback via discussion forums. The discussion forums support ongoing usability and refinement of the glossary. The development of LactaPedia provides a fundamental first step towards improving breastfeeding outcomes that are currently well below World Health Organisation recommendations globally.


Subjects
Breast Feeding; Dictionaries as Topic; Health Communication/methods; Lactation; Terminology as Topic; Female; Humans; Internet
15.
BMC Med Res Methodol ; 19(1): 155, 2019 07 18.
Article in English | MEDLINE | ID: mdl-31319802

ABSTRACT

BACKGROUND: The identification of sections in narrative content of Electronic Health Records (EHR) has been shown to improve the performance of clinical extraction tasks; however, there is not yet a shared understanding of the concept and its existing methods. The objective is to report the results of a systematic review concerning approaches aimed at identifying sections in narrative content of EHR, using either automatic or semi-automatic methods. METHODS: This review includes articles from the databases SCOPUS, Web of Science and PubMed (from January 1994 to September 2018). The selection of studies was done using predefined eligibility criteria and applying the PRISMA recommendations. Search criteria were elaborated by using an iterative and collaborative keyword enrichment. RESULTS: Following the eligibility criteria, 39 studies were selected for analysis. The section identification approaches proposed by these studies vary greatly depending on the kind of narrative, the type of section, and the application. We observed that 57% of them proposed formal methods for identifying sections and 43% adapted a previously created method. Seventy-eight percent were intended for English texts and 41% for discharge summaries. Studies that are able to identify explicit (with headings) and implicit sections correspond to 46%. Regarding the level of granularity, 54% of the studies are able to identify sections, but not subsections. From the technical point of view, the methods can be classified into rule-based methods (59%), machine learning methods (22%) and a combination of both (19%). Hybrid methods showed better results than those relying on pure machine learning approaches, but lower than rule-based methods; however, their scope was more ambitious than the latter ones. Despite all the promising performance results, very few studies reported tests under a formal setup. 
Almost all the studies relied on custom dictionaries; however, they used them in conjunction with a controlled terminology, most commonly the UMLSⓇ metathesaurus. CONCLUSIONS: Identification of sections in EHR narratives is gaining popularity for improving clinical extraction projects. This study enabled the community working on clinical NLP to gain a formal analysis of this task, including the most successful ways to perform it.
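
As a rough illustration of the rule-based family described in this review, the sketch below splits a note into sections wherever a line starts with an explicit heading followed by a colon. It is not drawn from any of the reviewed studies: the heading pattern, the function name, and the sample note are all hypothetical, and implicit sections (those without headings) are not detected.

```python
import re

# Assumption: a heading is a capitalized phrase of letters and spaces ending in a colon.
HEADING = re.compile(r"^([A-Z][A-Za-z ]{2,40}):\s*(.*)$")


def split_sections(note):
    """Return (heading, text) pairs in document order for explicit headings."""
    sections = []
    heading, body = "UNLABELED", []
    for line in note.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # Close the section collected so far before starting a new one.
            if body or heading != "UNLABELED":
                sections.append((heading, " ".join(body)))
            heading, body = match.group(1).strip(), []
            if match.group(2):
                body.append(match.group(2))
        else:
            body.append(line)
    if body or heading != "UNLABELED":
        sections.append((heading, " ".join(body)))
    return sections


# Made-up note text for illustration only.
sample_note = (
    "Patient seen today.\n"
    "History of Present Illness: Cough for 3 days.\n"
    "No fever.\n"
    "Medications:\n"
    "None."
)
parsed = split_sections(sample_note)
```

In practice, the reviewed systems pair rules like this with custom dictionaries of heading variants and a controlled terminology, which a fixed pattern cannot replace.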


Subjects
Electronic Health Records; Narration; Dictionaries as Topic; Humans; Machine Learning; Terminology as Topic
16.
PLoS One ; 14(3): e0213433, 2019.
Article in English | MEDLINE | ID: mdl-30921343

ABSTRACT

Low-rank representation-based frameworks are becoming popular for saliency and object detection because of their ease of use and simplicity. These frameworks need only global features to extract the salient objects, while the local features are compromised. To deal with this issue, we regularize the low-rank representation through local graph-regularization and maximum mean-discrepancy regularization terms. Firstly, we introduce a novel feature space that is extracted by combining four feature spaces: CIELab, RGB, HOG and LBP. Secondly, we combine a boundary metric, a candidate objectness metric and a candidate distance metric to compute the low-level saliency map. Thirdly, we extract salient and non-salient dictionaries from the low-level saliency. Finally, we regularize the low-rank representation through the Laplacian regularization term, which preserves the structural and geometrical features, and through the mean discrepancy term, which reduces the distribution divergence and connections among similar regions. The proposed model is tested against seven recent salient region detection methods using the precision-recall curve, receiver operating characteristics curve, F-measure and mean absolute error. The proposed model performed consistently across all the tests and outperformed the selected models with a higher precision value.


Subjects
Image Processing, Computer-Assisted/methods; Algorithms; Databases, Factual; Dictionaries as Topic; Humans; Image Processing, Computer-Assisted/statistics & numerical data; Machine Learning; Neural Networks, Computer; Photography; Visual Perception
17.
PLoS One ; 14(3): e0213343, 2019.
Article in English | MEDLINE | ID: mdl-30908489

ABSTRACT

The Moral Foundations Dictionary (MFD) is a useful tool for applying the conceptual framework developed in Moral Foundations Theory and quantifying the moral meanings implicated in the linguistic information people convey. However, the applicability of the MFD is limited because it is available only in English. Translated versions of the MFD are therefore needed to study morality across various cultures, including non-Western cultures. The contribution of this paper is two-fold. We developed the first Japanese version of the MFD (referred to as the J-MFD) using a semi-automated method; this serves as a reference when translating the MFD into other languages. We next tested the validity of the J-MFD by analyzing open-ended written texts about the situations that Japanese participants thought followed and violated the five moral foundations. We found that the J-MFD correctly categorized the Japanese participants' descriptions into the corresponding moral foundations, and that the Moral Foundations Questionnaire (MFQ) scores correlated with the frequency of situations, of total words, and of J-MFD words in the participants' descriptions for the Harm and Fairness foundations. The J-MFD can be used to study morality unique to the Japanese and also to make cross-cultural comparisons of moral behavior.


Subjects
Ethical Theory; Morals; Cross-Cultural Comparison; Dictionaries as Topic; Humans; Japan; Language; Translations
18.
BMC Bioinformatics ; 20(1): 62, 2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30709336

ABSTRACT

BACKGROUND: Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks. The advent of electronic medical records (EMRs) has not only changed the format of medical records but also helped users to obtain information faster. However, there are many challenges regarding researching directly using Chinese EMRs, such as low quality, huge quantity, imbalance, semi-structure and non-structure, particularly the high density of the Chinese language compared with English. Therefore, effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs. RESULTS: In this paper, we propose a deep learning framework to study intelligent diagnosis using Chinese EMR data, which incorporates a convolutional neural network (CNN) into an EMR classification application. The novelty of this paper is reflected in the following: (1) We construct a pediatric medical dictionary based on Chinese EMRs. (2) Word2vec adopted in word embedding is used to achieve the semantic description of the content of Chinese EMRs. (3) A fine-tuning CNN model is constructed to feed the pediatric diagnosis with Chinese EMR data. Our results on real-world pediatric Chinese EMRs demonstrate that the average accuracy and F1-score of the CNN models are up to 81%, which indicates the effectiveness of the CNN model for the classification of EMRs. Particularly, a fine-tuning one-layer CNN performs best among all CNNs, recurrent neural network (RNN) (long short-term memory, gated recurrent unit) and CNN-RNN models, and the average accuracy and F1-score are both up to 83%. CONCLUSION: The CNN framework that includes word segmentation, word embedding and model training can serve as an intelligent auxiliary diagnosis tool for pediatricians. 
Particularly, a fine-tuning one-layer CNN performs well, which indicates that word order does not appear to have a useful effect on our Chinese EMRs.


Subjects
Electronic Health Records; Language; Neural Networks, Computer; Dictionaries as Topic; Humans; Natural Language Processing; Semantics; Vocabulary
19.
BMC Res Notes ; 12(1): 42, 2019 Jan 18.
Article in English | MEDLINE | ID: mdl-30658682

ABSTRACT

OBJECTIVE: Misspellings in clinical free text present challenges to natural language processing. With the objective of identifying misspellings and their corrections, we developed a prototype spelling analysis method that implements Word2Vec, Levenshtein edit distance constraints, a lexical resource, and corpus term frequencies. We used the prototype method to process two different corpora, surgical pathology reports, and emergency department progress and visit notes, extracted from Veterans Health Administration resources. We evaluated performance by measuring positive predictive value and performing an error analysis of false positive output, using four classifications. We also performed an analysis of spelling errors in each corpus, using common error classifications. RESULTS: In this small-scale study utilizing a total of 76,786 clinical notes, the prototype method achieved positive predictive values of 0.9057 and 0.8979, respectively, for the surgical pathology reports, and emergency department progress and visit notes, in identifying and correcting misspelled words. False positives varied by corpus. Spelling error types were similar between the two corpora; however, the authors of emergency department progress and visit notes made over four times as many errors. Overall, the results of this study suggest that this method could also perform sufficiently in identifying misspellings in other clinical document types.
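
Two of the ingredients named in this abstract, an edit distance constraint and corpus term frequencies, can be sketched in a few lines. This is a simplified illustration rather than the authors' method: the Word2Vec component is omitted, the function names and the distance limit of 2 are assumptions, and the lexicon and frequency counts are made up.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(word, lexicon, frequencies, max_distance=2):
    """Return the closest lexicon entry within max_distance, or None.

    Ties on distance are broken by higher corpus frequency. A word already
    in the lexicon is treated as correctly spelled and yields None.
    """
    if word in lexicon:
        return None
    candidates = [
        (levenshtein(word, entry), -frequencies.get(entry, 0), entry)
        for entry in lexicon
    ]
    candidates = [c for c in candidates if c[0] <= max_distance]
    return min(candidates)[2] if candidates else None


# Hypothetical lexicon and counts, for illustration only.
toy_lexicon = {"biopsy", "carcinoma", "benign"}
toy_frequencies = {"biopsy": 120, "carcinoma": 45, "benign": 80}
correction = suggest("biospy", toy_lexicon, toy_frequencies)
```

Plain Levenshtein distance counts a transposition such as "sp" for "ps" as two edits, which is why the limit here is set to 2 rather than 1.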


Subjects
Dictionaries as Topic; Medical Informatics/methods; Natural Language Processing; Vocabulary, Controlled; Algorithms; Humans; Language; Medical Informatics/standards; Medical Informatics/statistics & numerical data; Medical Records Systems, Computerized/standards; Medical Records Systems, Computerized/statistics & numerical data; Pathology, Surgical/methods; Reproducibility of Results; Research Report/standards; Unified Medical Language System/standards; Unified Medical Language System/statistics & numerical data
20.
Methods Inf Med ; 58(4-05): 151-159, 2019 Nov.
Article in English | MEDLINE | ID: mdl-32170719

ABSTRACT

BACKGROUND: Evaluating potential data losses from mapping proprietary medical data formats to standards is essential for decision making. The article implements a method to evaluate the preliminary content overlap of proprietary medical formats, including national terminologies, and Fast Healthcare Interoperability Resources (FHIR), an international medical standard. METHODS: Three types of mappings were evaluated in the article: proprietary format matched to FHIR, national terminologies matched to the FHIR mappings, and concepts from national terminologies matched to Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT). We matched attributes of the formats with FHIR definitions and calculated content overlap. RESULTS: The article reports the results of a manual mapping between a proprietary medical format and the FHIR standard. The following results were obtained: 81% content overlap for the proprietary format to FHIR mapping, 88% content overlap for the national terminologies to FHIR mapping, and 98.6% concept matching for the national terminologies to SNOMED CT mapping. Twenty tables from the proprietary format and 20 dictionaries were matched with FHIR resources; nine dictionaries were matched with SNOMED CT concepts. CONCLUSION: Mapping medical formats is a challenge. The obtained overlaps are promising in comparison with the investigated results. The study showed that standardization of data exchange between proprietary formats and FHIR is possible in Russia, and national terminologies can be used in FHIR-based information systems.


Subjects
Health Information Interoperability; Systematized Nomenclature of Medicine; Dictionaries as Topic; Russia