Results 1 - 12 of 12
1.
Sci Rep ; 14(1): 10385, 2024 05 06.
Article in English | MEDLINE | ID: mdl-38710786

ABSTRACT

Verified text data on wheat varieties is an important component of wheat germplasm information. To automatically obtain a structured description of the phenotypic and genetic characteristics of wheat varieties, and to address the fuzzy entity boundaries and overlapping relationships found in unstructured wheat variety approval data, WGIE-DCWF (a joint entity-relationship extraction model for wheat germplasm information based on deep character and word fusion) was proposed. The encoding layer of the model deeply fused word semantic information and character information using the Transformer encoder of BERT. This allowed for the cascading fusion of contextual semantic feature information to achieve a rich character vector representation and improve the recognition of entity features. The triple extraction layer used a cascading pointer network that first extracted the head entity, then extracted the tail entity according to the relationship category, and finally decoded the output triple. This approach improved the model's capability to extract overlapping relationships. The experimental results demonstrated that the WGIE-DCWF model performed exceptionally well on both the WGD (wheat germplasm dataset) and the public dataset DuIE. The model not only achieved high performance on the evaluation datasets but also demonstrated good generalization. This provides valuable technical support for the construction of a wheat germplasm information knowledge base and is of great significance for wheat breeding, genetic research, cultivation management, and agricultural production.
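
The cascading decoding order described above (head entity first, then a tail entity for each relationship category) can be illustrated with a minimal Python sketch. It assumes the model has already produced binary start/end pointer tags; the function names, the tag layout, the relation labels, and the toy sentence are hypothetical and are not taken from the paper.

```python
def decode_spans(starts, ends):
    """Pair each start tag with the nearest end tag at or after it."""
    spans = []
    for i, s in enumerate(starts):
        if s == 1:
            for j in range(i, len(ends)):
                if ends[j] == 1:
                    spans.append((i, j))
                    break
    return spans


def decode_triples(tokens, head_tags, tail_tags_by_head):
    """Decode (head, relation, tail) triples in cascade order.

    head_tags: (starts, ends) pointer tags for head entities.
    tail_tags_by_head: {head_span: {relation: (starts, ends)}} (assumed layout).
    """
    triples = []
    for head in decode_spans(*head_tags):
        # Tail pointers are conditioned on the head and on the relation.
        for relation, tags in tail_tags_by_head.get(head, {}).items():
            for tail in decode_spans(*tags):
                triples.append((
                    " ".join(tokens[head[0]:head[1] + 1]),
                    relation,
                    " ".join(tokens[tail[0]:tail[1] + 1]),
                ))
    return triples


# Toy input: one head entity that takes part in two relations.
tokens = ["Variety", "A", "has", "white", "grain", "and", "strong", "stems"]
head_tags = ([1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0])
tail_tags_by_head = {
    (0, 1): {
        "grain_color": ([0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0]),
        "stem_strength": ([0, 0, 0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0, 1, 0]),
    }
}
triples = decode_triples(tokens, head_tags, tail_tags_by_head)
```

Because tails are decoded separately per relation, the same head entity can appear in several triples, which is how a cascade of this kind can represent overlapping relationships.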


Subject(s)
Triticum , Triticum/genetics , Semantics , Algorithms
2.
Heliyon ; 9(7): e17806, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37456013

ABSTRACT

In light of the significance of regulatory authorities and the rising demand for information disclosure, a vast amount of information on food safety news reports is readily accessible on the Internet. Extracting such information for precise classification, and issuing appropriate safety alerts based on the resulting categories, has emerged as a challenging problem for academic research. Most food safety-related events in news reports comprise lengthy text, yet the pre-trained language models currently employed for text analysis are generally limited in their capability to handle long documents. This paper proposes a long-text classification model utilising hierarchical Transformers. We categorise information in long documents into two distinct types: (1) multiple text chunks meeting the length constraint and (2) essential sentences within long documents, such as headings and paragraph start and end sentences. Initially, our proposed model utilises the text chunks as input to the BERT model. Then, it concatenates the output of the BERT model with the important sentences from the document and uses them as input to the Transformer model for feature transformation. Finally, we utilise a classifier for food safety news classification. We conducted several comparative experiments with various commonly used text classification models on a dataset constructed from publicly available information on food regulatory websites. Our proposed method outperforms the existing methods it was compared with.
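
The two kinds of input described above (length-limited chunks and essential sentences) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-based chunk size, the regular expression used for sentence splitting, and the function names are hypothetical, and the paper's actual preprocessing may differ.

```python
import re


def split_chunks(words, max_len):
    """Split a word list into consecutive chunks of at most max_len words."""
    return [words[i:i + max_len] for i in range(0, len(words), max_len)]


def key_sentences(heading, paragraphs):
    """Collect the heading plus the first and last sentence of each paragraph."""
    selected = [heading]
    for paragraph in paragraphs:
        # Naive split on sentence-final punctuation followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", paragraph)
        sentences = [s.strip() for s in parts if s.strip()]
        if not sentences:
            continue
        selected.append(sentences[0])
        if len(sentences) > 1:
            selected.append(sentences[-1])
    return selected


heading = "Recall notice"
paragraphs = [
    "A was found. B was tested. C was recalled.",
    "Only one sentence here.",
]
chunks = split_chunks(" ".join(paragraphs).split(), max_len=5)
keys = key_sentences(heading, paragraphs)
```

In a pipeline like the one in the abstract, each chunk would be encoded separately and the encoded chunks would then be combined with the key sentences before the second-stage Transformer.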

3.
iScience ; 26(6): 106874, 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-37260749

ABSTRACT

Chromosome instability (CIN) is one of the hallmarks of cancer and is closely related to tumor metastasis. However, the sheer size and resolution of histopathology whole-slide images (WSIs) challenge the capabilities of computational pathology. In this study, we propose a correlation graph attention network (MLP-GAT) that can construct graphs for classifying multi-type CINs from the WSIs of breast cancer. We construct a WSI dataset of breast cancer from the Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA). Extensive experiments show that MLP-GAT far outperforms accepted state-of-the-art methods and demonstrate the advantages of the constructed graph networks for analyzing WSI data. The visualization shows the differences among the tiles in a WSI. Furthermore, the generalization performance of the proposed method was verified on stomach cancer. This study provides guidance for studying the relationship between CIN and cancer from the perspective of image phenotype.

4.
Entropy (Basel) ; 25(5)2023 May 13.
Article in English | MEDLINE | ID: mdl-37238549

ABSTRACT

Affective understanding of language is an important research focus in artificial intelligence. Large-scale annotated datasets of Chinese textual affective structure (CTAS) are the foundation for subsequent higher-level analysis of documents. However, there are very few published datasets for CTAS. This paper introduces a new benchmark dataset for the task of CTAS to promote development in this research direction. Specifically, our benchmark is a CTAS dataset with the following advantages: (a) it is based on Weibo, the most popular Chinese social media platform used by the public to express their opinions; (b) it includes the most comprehensive affective structure labels at present; and (c) we propose a maximum entropy Markov model that incorporates neural network features and experimentally demonstrate that it outperforms the two baseline models.

5.
IEEE J Biomed Health Inform ; 27(7): 3384-3395, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37023156

ABSTRACT

Identifying the subtypes of low-grade glioma (LGG) can help prevent brain tumor progression and patient death. However, the complicated non-linear relationships and high dimensionality of 3D brain MRI limit the performance of machine learning methods. Therefore, it is important to develop a classification method that can overcome these limitations. This study proposes a self-attention similarity-guided graph convolutional network (SASG-GCN) that uses the constructed graphs to complete multi-classification (tumor-free (TF), WG, and TMG). In the pipeline of SASG-GCN, we use a convolutional deep belief network and a self-attention similarity-based method to construct the vertices and edges, respectively, of the graphs at the 3D MRI level. The multi-classification experiment is performed with a two-layer GCN model. SASG-GCN is trained and evaluated on 402 3D MRI images produced from the TCGA-LGG dataset. Empirical tests demonstrate that SASG-GCN accurately classifies the subtypes of LGG. SASG-GCN achieves an accuracy of 93.62%, outperforming several other state-of-the-art classification methods. In-depth discussion and analysis reveal that the self-attention similarity-guided strategy improves the performance of SASG-GCN. The visualization revealed differences between different gliomas.


Subject(s)
Brain Neoplasms , Glioma , Humans , Glioma/diagnostic imaging , Brain Neoplasms/diagnostic imaging , Brain , Head , Machine Learning
6.
IEEE J Biomed Health Inform ; 27(7): 3372-3383, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37104101

ABSTRACT

Segmenting stroke lesions and assessing the thrombolysis in cerebral infarction (TICI) grade are two important but challenging prerequisites for an auxiliary diagnosis of stroke. However, most previous studies have focused on only one of the two tasks, without considering the relation between them. In our study, we propose a simulated quantum mechanics-based joint learning network (SQMLP-net) that simultaneously segments a stroke lesion and assesses the TICI grade. The correlation and heterogeneity between the two tasks are tackled with a single-input double-output hybrid network. SQMLP-net has a segmentation branch and a classification branch. These two branches share an encoder, which extracts and shares the spatial and global semantic information for the segmentation and classification tasks. Both tasks are optimized by a novel joint loss function that learns the intra- and inter-task weights between these two tasks. Finally, we evaluate SQMLP-net with a public stroke dataset (ATLAS R2.0). SQMLP-net obtains state-of-the-art metrics (Dice: 70.98% and accuracy: 86.78%) and outperforms single-task and existing advanced methods. An analysis found a negative correlation between the severity of TICI grading and the accuracy of stroke lesion segmentation.
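
For readers unfamiliar with the Dice metric reported above, the following minimal Python sketch computes the standard Dice coefficient, 2|P ∩ T| / (|P| + |T|), for two binary masks. The flat-list mask representation and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]    # predicted lesion voxels
target = [1, 0, 0, 0, 1, 1]  # ground-truth lesion voxels
score = dice_coefficient(pred, target)
```

Here two voxels overlap and each mask contains three foreground voxels, so the score is 4/6, or roughly 0.67.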


Subject(s)
Cerebral Infarction , Stroke , Humans , Cerebral Infarction/diagnostic imaging , Stroke/diagnostic imaging , Benchmarking , Semantics , Image Processing, Computer-Assisted
7.
Comput Intell Neurosci ; 2022: 5467262, 2022.
Article in English | MEDLINE | ID: mdl-35983151

ABSTRACT

Personal medication intake detection aims to automatically detect tweets that show clear evidence of personal medication consumption. It is a research topic that has attracted considerable attention in drug safety surveillance. This task inevitably depends on medical domain information, yet the current main model for this task does not explicitly consider domain information. To tackle this problem, we propose a domain attention mechanism for recurrent neural networks (LSTMs) with a multi-level feature representation of Twitter data. Specifically, we utilize a character-level CNN to capture morphological features at the word level. Subsequently, we feed them together with word embeddings into a BiLSTM to obtain the hidden representation of a tweet. An attention mechanism is introduced over the hidden states of the BiLSTM to attend to specific medical information. Finally, classification is performed on the weighted hidden representation of tweets. Experiments on a publicly available benchmark dataset show that our model can exploit the domain attention mechanism to take medical information into account and improve performance. For example, our approach achieves a precision score of 0.708, a recall score of 0.694, and an F1 score of 0.697, significantly outperforming multiple strong and relevant baselines.
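
The step in which attention is applied over the BiLSTM hidden states can be illustrated with a generic attention-pooling sketch in Python. It uses plain dot-product scoring against a single query vector; the query vector standing in for "domain information", the function name, and the toy numbers are all hypothetical, and the paper's domain attention mechanism may be defined differently.

```python
import math


def attention_pool(hidden_states, query):
    """Softmax over dot-product scores, then a weighted sum of hidden states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Three time steps with two-dimensional hidden states (toy values).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
domain_query = [1.0, 1.0]  # hypothetical vector encoding domain information
weights, pooled = attention_pool(hidden_states, domain_query)
```

The second time step scores highest against the query, so it dominates the pooled representation that a classifier would then consume.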


Subject(s)
Social Media , Data Collection , Humans , Neural Networks, Computer
8.
Med Image Anal ; 81: 102550, 2022 10.
Article in English | MEDLINE | ID: mdl-35872360

ABSTRACT

It has been shown that neuropsychiatric disorders (NDs) can be associated with both the structures and the functions of brain regions. Thus, data about structures and functions could be usefully combined in a comprehensive analysis. While brain structural MRI (sMRI) images contain anatomic and morphological information about NDs, functional MRI (fMRI) images carry complementary information. However, efficient extraction and fusion of sMRI and fMRI data remains challenging. In this study, we develop an enhanced multi-modal graph convolutional network (MME-GCN) for binary classification between patients with NDs and healthy controls, based on the fusion of the structural and functional graphs of the brain regions. First, based on the same brain atlas, we construct structural and functional graphs from sMRI and fMRI data, respectively. Second, we use machine learning to extract important features from the structural graph network. Third, we use these extracted features to adjust the corresponding edge weights in the functional graph network. Finally, we train a multi-layer GCN and use it for the binary classification task. MME-GCN achieved 93.71% classification accuracy on the open data set provided by the Consortium for Neuropsychiatric Phenomics. In addition, we analyzed the important features selected from the structural graph and verified them in the functional graph. Using MME-GCN, we found several specific brain connections important to NDs.
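
The third and fourth steps above (adjusting functional edge weights with structural features, then applying graph convolution) can be sketched in plain Python. This is a rough illustration under explicit assumptions: the element-wise multiplication used for reweighting, the importance scores, and the toy matrices are hypothetical, and the propagation rule shown is the widely used symmetric normalization from Kipf and Welling's GCN, which the abstract does not confirm as the variant used.

```python
import math


def reweight_edges(functional, importance):
    """Scale functional edge (i, j) by a structural importance score (assumed rule)."""
    n = len(functional)
    return [[functional[i][j] * importance[i][j] for j in range(n)] for i in range(n)]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two dense matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    z = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in z]


# Toy graph with three brain regions (all values are made up).
functional = [[0.0, 0.8, 0.2], [0.8, 0.0, 0.5], [0.2, 0.5, 0.0]]
importance = [[1.0, 1.5, 1.0], [1.5, 1.0, 0.5], [1.0, 0.5, 1.0]]
adjusted = reweight_edges(functional, importance)

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[0.5, -0.5], [0.5, 0.5]]
output = gcn_layer(normalize_adjacency(adjusted), features, weight)
```

Stacking several such layers and adding a readout and classifier on top would give a multi-layer GCN of the general kind the abstract describes.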


Subject(s)
Brain , Magnetic Resonance Imaging , Humans , Brain/anatomy & histology , Endrin/analogs & derivatives , Magnetic Resonance Imaging/methods , Neuroimaging
9.
BMC Med Inform Decis Mak ; 21(1): 372, 2021 12 31.
Article in English | MEDLINE | ID: mdl-34972505

ABSTRACT

BACKGROUND: Named entity recognition (NER) on Chinese electronic medical/healthcare records has attracted significant attention, as it can be applied to building applications that understand these records. Most previous methods have been purely data-driven, requiring high-quality and large-scale labeled medical data. However, labeled data is expensive to obtain, and these data-driven methods have difficulty handling rare and unseen entities. METHODS: To tackle these problems, this study presents a novel multi-task deep neural network model for Chinese NER in the medical domain. We incorporate dictionary features into neural networks, and a general secondary named entity segmentation task is used as an auxiliary task to improve the performance of the primary task of named entity recognition. RESULTS: In order to evaluate the proposed method, we compare it with other currently popular methods on three benchmark datasets. Two of the datasets are publicly available, and the other one was constructed by us. Experimental results show that the proposed model achieves a 91.07% average F-measure on the two public datasets and an 87.05% F-measure on the private dataset. CONCLUSIONS: The comparison results of different models demonstrated the effectiveness of our model. The proposed model outperformed traditional statistical models.
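
One simple way to derive dictionary features for Chinese text is to tag each character according to whether it starts or continues a dictionary entry. The Python sketch below uses forward maximum matching for this; the matching strategy, the B/I/O tag scheme, the function name, and the two-term toy dictionary are assumptions for illustration and are not taken from the paper.

```python
def dictionary_features(text, dictionary, max_len=4):
    """Tag each character B (begins a dictionary entry), I (inside), or O."""
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        match_len = 0
        # Forward maximum matching: try the longest candidate first.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in dictionary:
                match_len = length
                break
        if match_len:
            tags[i] = "B"
            for j in range(i + 1, i + match_len):
                tags[j] = "I"
            i += match_len
        else:
            i += 1
    return tags


# Toy dictionary: "diabetes" and "hypertension".
dictionary = {"糖尿病", "高血压"}
text = "患者有糖尿病和高血压"  # "The patient has diabetes and hypertension"
tags = dictionary_features(text, dictionary)
```

Tags like these could be embedded and concatenated with character embeddings, giving the network a signal about entities that are rare or absent in the labeled training data.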


Subject(s)
Names , Neural Networks, Computer , Attention , China , Electronic Health Records , Humans
10.
Front Neuroinform ; 15: 782262, 2021.
Article in English | MEDLINE | ID: mdl-34975444

ABSTRACT

Convolutional neural networks (CNNs) have brought hope for auxiliary diagnosis from medical images. However, the shortage of labeled medical image data is the bottleneck that limits the performance of supervised CNN methods. In addition, annotating large amounts of medical image data is often expensive and time-consuming. In this study, we propose a co-optimization learning network (COL-Net) for Magnetic Resonance Imaging (MRI) segmentation of ischemic penumbra tissues. COL-Net is based on limited labeled samples and consists of an unsupervised reconstruction network (R), a supervised segmentation network (S), and a transfer block (T). The reconstruction network, which is the auxiliary branch of the segmentation network, extracts robust features by reconstructing pseudo unlabeled samples. The segmentation network is used to segment the target lesions with the limited labeled samples and the assistance of the reconstruction network. The transfer block is used to co-optimize the feature maps between the bottlenecks of the reconstruction network and the segmentation network. We propose a mixed loss function to optimize COL-Net. COL-Net is verified on the public ischemic penumbra segmentation challenge (SPES) with two dozen labeled samples. Results demonstrate that COL-Net has high predictive accuracy and generalization, with a Dice coefficient of 0.79. The extended experiment also shows that COL-Net outperforms most supervised segmentation methods. COL-Net is a meaningful attempt to alleviate the limited labeled sample problem in medical image segmentation.

11.
BMC Med Inform Decis Mak ; 20(Suppl 3): 121, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32646430

ABSTRACT

BACKGROUND: Blood cultures are often performed to identify patients who have a serious illness without infection and patients with bloodstream infections. Early prediction of positive blood cultures is important, as bloodstream infections may cause inflammation of the body and even organ failure or death. However, existing work mainly adopts statistical models with laboratory indicators and fails to make full use of the textual description information in EHRs. METHODS: We study the problem of positive blood culture prediction using a neural network model. Specifically, we first construct a dataset from raw EHRs. Then we propose a hybrid neural network that incorporates attention-based Bi-directional Long Short-Term Memory and Autoencoder networks to fully capture the information in EHRs. RESULTS: In order to evaluate the proposed method, we construct a dataset consisting of 5963 patients in total who had one or more blood culture tests during hospitalization. Experimental results show that the proposed neural model achieves a 91.23% F-measure for this task. CONCLUSIONS: The comparison results of different models demonstrated the effectiveness of our model. The proposed model outperformed traditional statistical models.


Subject(s)
Blood Culture , Electronic Health Records , Humans , Models, Statistical , Neural Networks, Computer
12.
Bioinformatics ; 33(15): 2363-2371, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28369171

ABSTRACT

MOTIVATION: Disease named entities play a central role in many areas of biomedical research, and automatic recognition and normalization of such entities have received increasing attention in biomedical research communities. Existing methods typically use pipeline models with two independent phases: (i) a disease named entity recognition (DER) system is used to find the boundaries of mentions in text and (ii) a disease named entity normalization (DEN) system is used to connect the recognized mentions to concepts in a controlled vocabulary. The main problems of such models are: (i) there is error propagation from DER to DEN and (ii) DEN is useful for DER, but pipeline models cannot utilize this. METHODS: We propose a transition-based model to jointly perform disease named entity recognition and normalization, casting the output construction process into an incremental state transition process and globally learning sequences of transition actions, which correspond to joint structural outputs. Beam search and online structured learning are used, with learning being designed to guide search. Compared with the only existing method for joint DEN and DER, our method allows non-local features to be used, which significantly improves accuracy. RESULTS: We evaluate our model on two corpora: the BioCreative V Chemical Disease Relation (CDR) corpus and the NCBI disease corpus. Experiments show that our joint framework achieves significantly higher performance than competitive pipeline baselines. Our method compares favourably to other state-of-the-art approaches. AVAILABILITY AND IMPLEMENTATION: Data and code are available at https://github.com/louyinxia/jointRN. CONTACT: dhji@whu.edu.cn.
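
The idea of building joint recognition and normalization output through a sequence of transition actions can be illustrated with a small Python sketch that replays a given action sequence. The action inventory (OUT, SHIFT, REDUCE), the placeholder concept identifier, and the example sentence are hypothetical; the paper's actual transition system, features, and beam search are not reproduced here.

```python
def apply_actions(tokens, actions):
    """Replay a sequence of transition actions over the tokens.

    OUT      : the next token is outside any mention.
    SHIFT    : the next token joins the mention being built.
    REDUCE c : close the current mention and link it to concept c.
    """
    buffer_pos = 0
    current = []
    output = []
    for action in actions:
        name = action[0]
        if name == "OUT":
            buffer_pos += 1
        elif name == "SHIFT":
            current.append(tokens[buffer_pos])
            buffer_pos += 1
        elif name == "REDUCE":
            if not current:
                raise ValueError("REDUCE with no open mention")
            # Recognition and normalization are emitted together.
            output.append((" ".join(current), action[1]))
            current = []
        else:
            raise ValueError(f"unknown action: {name}")
    return output


tokens = ["patients", "with", "breast", "cancer", "were", "enrolled"]
actions = [
    ("OUT",), ("OUT",),
    ("SHIFT",), ("SHIFT",), ("REDUCE", "CONCEPT_1"),  # placeholder concept ID
    ("OUT",), ("OUT",),
]
result = apply_actions(tokens, actions)
```

In a learned system, a scoring model would choose among the candidate actions at each step, and beam search would keep several partial action sequences alive instead of replaying a single fixed one.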


Subject(s)
Data Mining/methods , Disease/classification , Vocabulary, Controlled , Humans