1.
Biomed Res Int ; 2022: 3524090, 2022.
Article in English | MEDLINE | ID: mdl-35342762

ABSTRACT

Biomedical named entity recognition (BioNER) from clinical texts is a fundamental task for clinical data analysis, given the large volume of electronic medical record data, mostly in free-text format, available in real-world clinical settings. Clinical text incorporates significant phenotypic medical entities (e.g., symptoms, diseases, and laboratory indexes), which can be used to profile the clinical characteristics of patients with specific disease conditions (e.g., Coronavirus Disease 2019 (COVID-19)). However, general BioNER approaches mostly rely on coarse-grained annotations of phenotypic entities in benchmark text datasets. Because clinical texts contain numerous negated expressions of phenotypic entities (e.g., "no fever," "no cough," and "no hypertension"), such coarse-grained annotation cannot supply the subsequent data analysis with well-prepared structured clinical data. In this paper, we developed the Human-machine Cooperative Phenotypic Spectrum Annotation System (HCPSAS, http://www.tcmai.org/login) and constructed a fine-grained Chinese clinical corpus. We then proposed a phenotypic named entity recognizer, Phenonizer, which uses BERT to capture character-level global contextual representations, extracts local contextual features with a bidirectional long short-term memory network, and finally obtains the optimal label sequence through a conditional random field. Results on the COVID-19 dataset show that Phenonizer outperforms methods based on Word2Vec, with an F1-score of 0.896. A comparison of character embeddings trained on different data shows that embeddings trained on clinical corpora improve the F-score by 0.0103. In addition, we evaluated Phenonizer on datasets of two granularities and found that the fine-grained dataset improves the methods' F1-scores slightly, by about 0.005. Furthermore, the fine-grained dataset enables methods to distinguish negated symptoms from present symptoms. Finally, we tested the generalization performance of Phenonizer, achieving an F1-score of 0.8389. In summary, together with the fine-grained annotated benchmark dataset, Phenonizer offers a feasible approach to extracting symptom information from Chinese clinical texts with acceptable performance.
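The final step described above, choosing the optimal label sequence with a conditional random field, is typically done with Viterbi decoding. The following is a minimal sketch of that decoding step only, not Phenonizer's implementation: the label names (`O`, `B-SYM`, `I-SYM`), the emission scores (which in the paper's setup would come from the BERT and BiLSTM layers), and the transition scores are all hypothetical toy values.

```python
from typing import Dict, List

# Hypothetical BIO label set for symptom mentions (not taken from the paper).
LABELS = ["O", "B-SYM", "I-SYM"]

# Toy transition scores: every move is neutral except O -> I-SYM,
# which is penalised because an inside tag cannot follow an outside tag.
TRANSITIONS = {p: {y: 0.0 for y in LABELS} for p in LABELS}
TRANSITIONS["O"]["I-SYM"] = -10.0


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for one sentence."""
    if not emissions:
        return []
    # best[t][y] is the best score of any path that ends in label y at step t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t - 1][y] is the best previous label for y at step t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy emissions for a two-character span. Picking the best label per
# character independently would give the invalid sequence O, I-SYM.
TOY_EMISSIONS = [
    {"O": 1.0, "B-SYM": 0.9, "I-SYM": 0.0},
    {"O": 0.0, "B-SYM": 0.0, "I-SYM": 1.0},
]
print(viterbi_decode(TOY_EMISSIONS, TRANSITIONS, LABELS))
```

On the toy input, per-character argmax would start with `O`, but the transition penalty makes the decoder prefer `B-SYM` followed by `I-SYM`, which is the kind of sequence-level consistency a CRF layer adds on top of per-character scores.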


Subjects
COVID-19 , China , Electronic Health Records , Humans
2.
Article in English | MEDLINE | ID: mdl-32750864

ABSTRACT

Knowledge of phenotype-genotype associations is crucial for understanding disease mechanisms. Numerous studies have focused on developing efficient and accurate computational approaches to predict disease genes. However, owing to the sparseness and complexity of medical data, developing an efficient deep neural network model to identify disease genes remains a major challenge. We therefore developed a novel deep neural network model, termed PDGNet, that fuses multi-view features of phenotypes and genotypes to identify disease genes. The model integrates the multi-view features of diseases and genes and leverages feedback information from training samples to optimize the network parameters and obtain deep vector features of diseases and genes. Evaluation experiments on a large dataset indicated that PDGNet outperformed the state-of-the-art method (precision and recall improved by 9.55 and 9.63 percent, respectively). Analysis of the candidate genes indicated that the predicted genes have strong functional homogeneity and dense interactions with known genes. We validated the top predicted genes for Parkinson's disease against externally curated data and the published medical literature; the results indicate that the candidate genes have considerable potential to guide the selection of causal genes in wet-lab experiments. The source code and data of PDGNet are available at https://github.com/yangkuoone/PDGNet.
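The abstract reports precision and recall but does not say how they were computed. One common convention for ranked gene predictions is to score only the top k candidates per disease. The sketch below assumes that convention; the function name, the gene identifiers, and the cut-off are hypothetical and are not taken from PDGNet.

```python
from typing import Iterable, List, Set, Tuple


def precision_recall_at_k(
    ranked_genes: List[str], known_genes: Iterable[str], k: int
) -> Tuple[float, float]:
    """Precision and recall of the top-k predicted genes for one disease.

    ranked_genes: candidate genes ordered from most to least likely.
    known_genes:  held-out genes known to be associated with the disease.
    """
    known: Set[str] = set(known_genes)
    top_k = ranked_genes[:k]
    hits = sum(1 for gene in top_k if gene in known)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(known) if known else 0.0
    return precision, recall


# Toy data with placeholder gene names (not real predictions).
TOY_RANKING = ["G1", "G2", "G9", "G4", "G8"]
TOY_KNOWN = {"G1", "G2", "G3", "G4"}
print(precision_recall_at_k(TOY_RANKING, TOY_KNOWN, k=3))
```

With k = 3, two of the three top-ranked placeholders are known genes, so precision is 2/3 and recall is 2/4.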


Subjects
Computational Biology , Neural Networks, Computer , Feedback , Phenotype , Software
3.
Ann Palliat Med ; 10(9): 9940-9952, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34628918

ABSTRACT

BACKGROUND: In the traditional Chinese medicine (TCM) diagnosis and treatment of coronary heart disease (CHD), it is increasingly important to judge whether patients have phlegm and blood stasis syndromes. The syndrome differentiation strategy for phlegm and blood stasis syndromes in CHD is still not standardized, and syndrome differentiation needs to be made simpler and more accurate. METHODS: Twenty-eight medical cases that met the criteria, comprising 10 ancient and 18 modern cases, were selected from the TCM literature and analyzed by 57 experts via questionnaire. Statistical analysis of the data was based mainly on frequency analysis. RESULTS: (I) The 57 experts came from 20 provinces and had an average age of 48.9±8.5 years; 89.5% were associate professors or above, and 75.4% worked at a tertiary hospital. (II) Consistency of expert consultation over medical cases: for the ancient medical cases, the diagnostic consistency rate for phlegm syndrome was 27/34 (79.4%) and the additional diagnosis rate for blood stasis syndrome was 27/57 (47.4%); for the modern medical cases, the consistency rate with the original diagnosis was 54/80 (67.5%) for phlegm syndrome and 73/90 (81.1%) for blood stasis syndrome. (III) The experts' top five diagnostic bases for phlegm syndrome were oppression in the chest, slippery pulse, greasy fur, coughing of phlegm, and chest pain; the top five for blood stasis syndrome were chest pain, dark tongue, oppression in the chest, red tongue, and ecchymosis on the tongue. (IV) In the questionnaire consultation on CHD phlegm-blood stasis syndrome cases, "symptom or (and) tongue manifestation" accounted for 12/27 (44.4%) of the diagnostic bases for phlegm syndrome and 28/38 (73.7%) of those for blood stasis syndrome. CONCLUSIONS: Modern Chinese medicine experts pay close attention to the diagnosis and treatment of CHD based on the TCM pathology theories of phlegm and blood stasis. Collecting and assessing patients' symptoms and tongue manifestations is an important strategy used by the experts for differentiating phlegm and blood stasis syndromes in CHD.


Subjects
Coronary Disease , Adult , Coronary Disease/diagnosis , Humans , Medicine, Chinese Traditional , Middle Aged , Referral and Consultation , Syndrome , Tongue
4.
Hum Genet ; 140(6): 897-913, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33409574

ABSTRACT

Disease gene identification is a critical step towards uncovering the molecular mechanisms of diseases and systematically investigating complex disease phenotypes. Despite considerable efforts to develop powerful computational methods, candidate gene identification remains a severe challenge owing to the connectivity of the incomplete interactome network, which hampers the discovery of truly novel candidate genes. We developed a network-based machine-learning framework to identify both functional modules and disease candidate genes. In this framework, we designed a semi-supervised non-negative matrix factorization model to obtain the functional modules related to diseases and genes. Of note, we proposed a disease gene prioritization method called MapGene that integrates the correlations from both functional modules and network closeness. Our framework identified a set of functional modules with high functional homogeneity and close gene interactions. Experiments on a large-scale benchmark dataset showed that MapGene performs significantly better than state-of-the-art algorithms. Further analysis demonstrated that MapGene can effectively reduce the impact of the incompleteness of interactome networks and obtain highly reliable rankings of candidate genes. In addition, case studies on Parkinson's disease and diabetes mellitus confirmed that MapGene generalizes to novel candidate gene identification. This work proposed, for the first time, an integrated computational framework to predict both functional modules and disease candidate genes. The methodology and results indicate that our framework has the potential to help discover underlying functional modules and reliable candidate genes in human disease.
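The abstract names network closeness as one of the two signals MapGene combines but does not give the formula. As a rough illustration of the general idea only, the sketch below ranks candidate genes by their mean inverse shortest-path distance to known disease genes in an unweighted interaction network. The scoring rule, the function names, and the toy graph are assumptions, not MapGene's method.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Set[str]]


def shortest_path_lengths(graph: Graph, source: str) -> Dict[str, int]:
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def rank_by_closeness(
    graph: Graph, seeds: Set[str], candidates: List[str]
) -> List[Tuple[str, float]]:
    """Rank candidates by mean inverse distance to the seed (known) genes.

    Seeds that cannot be reached from a candidate contribute zero.
    """
    scored = []
    for cand in candidates:
        dist = shortest_path_lengths(graph, cand)
        total = sum(1.0 / dist[s] for s in seeds if s in dist and s != cand)
        scored.append((cand, total / len(seeds) if seeds else 0.0))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy undirected interaction network with placeholder gene names.
TOY_GRAPH: Graph = {
    "A": {"B", "C", "E"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"A"},
}
for gene, score in rank_by_closeness(TOY_GRAPH, {"A", "B"}, ["C", "D", "E"]):
    print(gene, round(score, 3))
```

A purely distance-based score like this one depends entirely on which interactions are present in the network, which illustrates why incompleteness of the interactome is a concern for network-based ranking.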


Subjects
Gene Regulatory Networks , Metabolic Networks and Pathways/genetics , Predictive Value of Tests , Supervised Machine Learning , Amino Acid Sequence , Computational Biology/methods , Gastrointestinal Diseases/diagnosis , Gastrointestinal Diseases/genetics , Gastrointestinal Diseases/pathology , Humans , Immune System Diseases/diagnosis , Immune System Diseases/genetics , Immune System Diseases/pathology , Mental Disorders/diagnosis , Mental Disorders/genetics , Mental Disorders/pathology , Metabolic Diseases/diagnosis , Metabolic Diseases/genetics , Metabolic Diseases/pathology , Musculoskeletal Diseases/diagnosis , Musculoskeletal Diseases/genetics , Musculoskeletal Diseases/pathology , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/pathology , Neurodegenerative Diseases/diagnosis , Neurodegenerative Diseases/genetics , Neurodegenerative Diseases/pathology , Protein Interaction Mapping , Terminology as Topic
5.
Article in English | MEDLINE | ID: mdl-33381207

ABSTRACT

This study aims to explore the topological regularities of the character networks of ancient traditional Chinese medicine (TCM) books. We applied the 2-gram model to construct language networks from ancient TCM books. The text of each book was separated into sentences, and each TCM book was represented as a directed network in which nodes represent Chinese characters and links represent the sequential associations between Chinese characters in the sentences (the number of occurrences of an identical sequential association is taken as the weight of the link). We first calculated the node degrees, average path lengths, and clustering coefficients of the book networks and explored the basic topological correlations between them. Then, we compared the similarity of network nodes to assess the specificity of TCM concepts in the network. To explore the relationships between TCM concepts, we screened TCM concepts and clustered them. Finally, we selected the binary groups (character pairs) whose weights are greater than 10 in Inner Canon of Huangdi (ICH) and Treatise on Cold Pathogenic Diseases (TCPD), hoping to find the core differences between these two ancient TCM books. We found that the degree distributions of ancient TCM book networks are consistent with a power-law distribution. Moreover, the average path lengths of the book networks are much smaller than those of random networks of the same scale, and the clustering coefficients are higher, which means that ancient book networks have small-world patterns. In addition, similar TCM concepts are displayed together and closely linked, according to the results of cosine similarity comparison and clustering. Furthermore, the core words of Inner Canon of Huangdi and Treatise on Cold Pathogenic Diseases differ in essential ways, which might indicate significant differences in language and conceptual patterns between theoretical and clinical books. This study adopts a language network approach to investigate the basic conceptual characteristics of ancient TCM book networks and proposes a useful method for identifying the underlying meanings of particular concepts in TCM theories and clinical operations.
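The network construction described above (characters as nodes, adjacent character pairs as weighted directed links) can be sketched in a few lines. This is an illustrative sketch under simple assumptions: sentences are already split, whitespace is ignored, and Latin letters stand in for Chinese characters in the toy input. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def build_bigram_network(sentences: Iterable[str]) -> Counter:
    """Count each ordered pair of adjacent characters within a sentence.

    The returned Counter maps (first_char, next_char) to the link weight.
    Pairs never span a sentence boundary.
    """
    edges: Counter = Counter()
    for sentence in sentences:
        chars = [ch for ch in sentence if not ch.isspace()]
        for a, b in zip(chars, chars[1:]):
            edges[(a, b)] += 1
    return edges


def node_degrees(edges: Dict[Edge, int]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Unweighted out-degree and in-degree of every node."""
    out_deg: Dict[str, int] = defaultdict(int)
    in_deg: Dict[str, int] = defaultdict(int)
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
    return dict(out_deg), dict(in_deg)


def heavy_edges(edges: Dict[Edge, int], threshold: int) -> Dict[Edge, int]:
    """Keep only the links whose weight is strictly greater than threshold."""
    return {pair: w for pair, w in edges.items() if w > threshold}


# Toy input: two "sentences" of placeholder characters.
TOY_EDGES = build_bigram_network(["abab", "abc"])
print(dict(TOY_EDGES))
print(node_degrees(TOY_EDGES))
print(heavy_edges(TOY_EDGES, threshold=2))
```

The `heavy_edges` filter corresponds to the selection of pairs with weight greater than 10 mentioned in the abstract; the toy call uses a threshold of 2 only because the input is tiny.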

6.
J Biomed Inform ; 107: 103482, 2020 07.
Article in English | MEDLINE | ID: mdl-32535270

ABSTRACT

Identifying symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms is a vital analysis task for advancing symptom science and precision health. Related studies have applied clustering algorithms (e.g., k-means, latent class models) to detect symptom clusters, mostly from various kinds of clinical data. In addition, they focused on identifying symptom clusters (SCs) for a specific disease and were mainly concerned with clinical regularities for symptom management. Here, we used a network-based clustering algorithm (BigCLAM) to obtain 208 typical SCs across disease conditions on a large-scale symptom network derived from integrated high-quality disease-symptom associations. Furthermore, we evaluated the underlying shared molecular mechanisms of the SCs, i.e., shared genes, protein-protein interactions (PPIs), and gene functional annotations, using integrated networks and similarity measures. We found that symptoms in the same SC tend to share more genes and PPIs and to have higher functional homogeneity. In addition, we found that most SCs contain related symptoms with shared underlying molecular mechanisms (e.g., enriched pathways) across different disease conditions. Our work demonstrates that the integrated network analysis method can be used to identify robust SCs and investigate their molecular mechanisms, which would be valuable for symptom science and precision health.
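The abstract says shared genes within a cluster were evaluated with similarity measures but does not name them. As one plausible illustration, the sketch below scores a symptom cluster by the mean pairwise Jaccard similarity of the gene sets associated with its symptoms. The choice of Jaccard similarity, the function names, and the toy symptom-gene mapping are assumptions.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_gene_overlap(
    cluster: Iterable[str], symptom_genes: Dict[str, Set[str]]
) -> float:
    """Average Jaccard similarity of gene sets over all symptom pairs in a cluster."""
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:
        return 0.0
    total = sum(jaccard(symptom_genes[s], symptom_genes[t]) for s, t in pairs)
    return total / len(pairs)


# Toy symptom-to-gene associations with placeholder identifiers.
TOY_SYMPTOM_GENES = {
    "s1": {"g1", "g2", "g3"},
    "s2": {"g2", "g3"},
    "s3": {"g3", "g4"},
    "s4": {"g9"},
}
print(mean_pairwise_gene_overlap({"s1", "s2", "s3"}, TOY_SYMPTOM_GENES))
print(mean_pairwise_gene_overlap({"s1", "s4"}, TOY_SYMPTOM_GENES))
```

Comparing such a score for detected clusters against the score for randomly grouped symptoms is one way to check whether symptoms in the same cluster share more genes than expected by chance.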


Subjects
Algorithms , Palliative Care , Cluster Analysis , Humans , Syndrome