Results 1 - 12 of 12
1.
BMC Med Inform Decis Mak ; 18(Suppl 5): 114, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30526592

ABSTRACT

BACKGROUND: Disease named entity recognition (NER) is a fundamental step in information processing of medical texts. However, in practice disease NER involves complex issues such as descriptive modifiers. Accurate disease NER is still an open and essential research problem in medical information extraction and text mining tasks. METHODS: A hybrid model named Semantics Bidirectional LSTM and CRF (SBLC) is proposed for the disease named entity recognition task. The model leverages word embeddings, Bidirectional Long Short Term Memory networks and Conditional Random Fields. A publicly available NCBI disease dataset is used to evaluate the model by comparing it with nine state-of-the-art baseline methods including cTAKES, MetaMap, DNorm, C-Bi-LSTM-CRF, TaggerOne and DNER. RESULTS: The results show that the SBLC model achieves an F1 score of 0.862 and outperforms the other methods. In addition, the model does not rely on external domain dictionaries, so it can be more conveniently applied in many aspects of medical text processing. CONCLUSIONS: According to the performance comparison, the proposed SBLC model achieved the best performance, demonstrating its effectiveness in disease named entity recognition.
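
As a rough illustration of the CRF component named in METHODS, the sketch below shows Viterbi decoding, the standard way a linear-chain CRF layer selects the best tag sequence from per-token scores. The B/I/O tag set, the score values, and the function name `viterbi_decode` are assumptions made for illustration and are not taken from the SBLC paper; in a BiLSTM-CRF the emission scores would come from the BiLSTM.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model."""
    # best[t][tag] holds the best score of any path ending in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] is the previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Illustrative scores only (assumed): two tokens, tags B/I/O.
TAGS = ["B", "I", "O"]
transitions = {p: {c: 0.0 for c in TAGS} for p in TAGS}
transitions["O"]["I"] = -10.0  # an entity cannot continue directly after "O"
emissions = [
    {"B": 1.0, "I": 0.0, "O": 2.0},
    {"B": 1.5, "I": 2.0, "O": 0.0},
]
print(viterbi_decode(emissions, transitions, TAGS))
```

In this toy input the emission scores alone would pick O followed by I, but the transition penalty makes O followed by B the best-scoring path, which is the kind of sequence-level consistency a CRF layer adds on top of per-token predictions.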


Subjects
Data Mining , Machine Learning , Medical Informatics Applications , Neural Networks, Computer , Humans , Semantics
2.
BMC Evol Biol ; 13: 76, 2013 Apr 01.
Article in English | MEDLINE | ID: mdl-23547742

ABSTRACT

BACKGROUND: The Hippo pathway controls growth by mediating cell proliferation and apoptosis. Dysregulation of Hippo signaling causes abnormal proliferation in both healthy and cancerous cells. The Hippo pathway receives inputs from multiple developmental pathways and interacts with many tissue-specific transcription factors, but how genes in the pathway have evolved remains poorly understood. RESULTS: To explore the origin and evolution of the Hippo pathway, we extensively examined 16 Hippo pathway genes, including upstream regulators and downstream targets, in 24 organisms covering major metazoan phyla. From simple to complex organisms, these genes vary in the length and number of exons but encode conserved domains with similar higher-order organization. The core of the pathway is more conserved than its upstream regulators and downstream targets. Several components, despite existing in the most basal metazoan sponges, cannot be convincingly identified in other species. Potential recombination breakpoints were identified in some genes. Coevolutionary analysis reveals that most functional domains in Hippo genes have coevolved with interacting functional domains in other genes. CONCLUSIONS: The two essential upstream regulators, the cadherins fat and dachsous, may have originated in the unicellular organism Monosiga brevicollis and evolved more significantly than the core of the pathway. Genes having varied numbers of exons in different species, recombination events, and the gain and loss of some genes indicate alternative splicing and species-specific evolution. Coevolution signals explain some species-specific loss of functional domains. These results shed considerable light on the structure and evolution of the Hippo pathway in distant phyla and provide valuable clues for further examination of Hippo signaling.


Subjects
Apoptosis , Cell Proliferation , Evolution, Molecular , Signal Transduction , Animals , Cadherins/genetics , Cadherins/metabolism , Humans , Phylogeny , Protein Serine-Threonine Kinases/genetics , Protein Serine-Threonine Kinases/metabolism
3.
Sci Rep ; 12(1): 8842, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35614133

ABSTRACT

The growing number of phishing websites poses a significant threat because such sites are extremely hard to detect. They rely on internet users mistaking them for genuine sites and unknowingly revealing private information such as login IDs, passwords, and credit card numbers. This paper proposes a new approach to solve the anti-phishing problem. The new features of this approach are derived from the URL character sequence (without prior knowledge of phishing), various hyperlink information, and the textual content of the webpage; these are combined and used to train an XGBoost classifier. One of the major contributions of this paper is the selection of different new features, which are capable of detecting zero-hour attacks and do not depend on any third-party services. In particular, we extract character-level Term Frequency-Inverse Document Frequency (TF-IDF) features from noisy parts of the HTML and the plaintext of the given webpage. Moreover, our proposed hyperlink features determine the relationship between the content and the URL of a webpage. Due to the absence of publicly available large phishing data sets, we needed to create our own data set with 60,252 webpages to validate the proposed solution. This data set contains 32,972 benign webpages and 27,280 phishing webpages. For evaluation, the performance of each category of the proposed feature set is assessed, and various classification algorithms are employed. The empirical results show that the proposed individual features are valuable for phishing detection. However, the integration of all the features improves the detection of phishing sites with significant accuracy. The proposed approach achieved an accuracy of 96.76% with only a 1.39% false-positive rate on our dataset, and an accuracy of 98.48% with a 2.09% false-positive rate on a benchmark dataset, which outperforms the existing baseline approaches.
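
To make the character-level TF-IDF idea concrete, here is a minimal pure-Python sketch. The trigram size, the plain `log(N / df)` weighting, the function names, and the two example URLs are all assumptions chosen for illustration; the abstract does not specify the paper's exact n-gram range or TF-IDF variant.

```python
import math
from collections import Counter
from typing import Dict, List


def char_ngrams(text: str, n: int = 3) -> List[str]:
    """All overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def char_tfidf(docs: List[str], n: int = 3) -> List[Dict[str, float]]:
    """One sparse {ngram: weight} vector per document (tf * log(N / df))."""
    counts = [Counter(char_ngrams(doc, n)) for doc in docs]
    # Document frequency: in how many documents each n-gram occurs.
    doc_freq = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            gram: (k / total) * math.log(n_docs / doc_freq[gram])
            for gram, k in c.items()
        })
    return vectors


# Hypothetical inputs: a benign-looking URL and a look-alike.
docs = ["http://paypal.com/login", "http://paypa1-secure.example/login"]
vectors = char_tfidf(docs)
print(vectors[0]["htt"], vectors[1]["pa1"] > 0)
```

N-grams shared by every document (such as `htt`) receive weight zero, while n-grams unique to one document (such as `pa1` in the look-alike) receive positive weight. Vectors of this kind could then be passed to a classifier such as the XGBoost model the abstract mentions.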


Subjects
Algorithms , Computer Security , Benchmarking , Data Collection , Privacy
4.
Front Neurorobot ; 16: 1041702, 2022.
Article in English | MEDLINE | ID: mdl-36425928

ABSTRACT

Obtaining accurate depth information is key to robot grasping tasks. However, RGB-D cameras have difficulty perceiving transparent objects owing to the objects' refraction and reflection properties, which makes it difficult for humanoid robots to perceive and grasp everyday transparent objects. To remedy this, existing studies usually remove transparent object areas using a model that learns patterns from the remaining opaque areas so that depth estimations can be completed. Notably, this frequently leads to deviations from the ground truth. In this study, we propose a new depth completion method, ClueDepth Grasp (CDGrasp), that works more effectively with transparent objects in RGB-D images. Specifically, we propose a ClueDepth module, which leverages a geometric method to filter out refractive and reflective points while preserving the correct depths, consequently providing crucial positional clues for object location. To acquire sufficient features to complete the depth map, we design a DenseFormer network that integrates DenseNet to extract local features and swin-transformer blocks to obtain the required global information. Furthermore, to fully utilize the information obtained from multi-modal visual maps, we devise a Multi-Modal U-Net Module to capture multiscale features. Extensive experiments conducted on the ClearGrasp dataset show that our method achieves state-of-the-art performance in terms of accuracy and generalization of depth completion for transparent objects, and successful grasping by a humanoid robot verifies the efficacy of our proposed method.

5.
IEEE Trans Pattern Anal Mach Intell ; 42(5): 1243-1256, 2020 May.
Article in English | MEDLINE | ID: mdl-30668464

ABSTRACT

Internet platforms provide new ways for people to share experiences, generating massive amounts of data related to various real-world concepts. In this paper, we present an event detection framework to discover real-world events from multiple data domains, including online news media and social media. As multi-domain data possess multiple data views that are heterogeneous, initial dictionaries consisting of labeled data samples are exploited to align the multi-view data. Furthermore, a shared multi-view data representation (SMDR) model is devised, which learns underlying and intrinsic structures shared among the data views by considering the structures underlying the data, data variations, and informativeness of dictionaries. SMDR incorporates various constraints in the objective function, including shared representation, low-rank, local invariance, reconstruction error, and dictionary independence constraints. Given the data representations achieved by SMDR, class-wise residual models are designed to discover the events underlying the data based on the reconstruction residuals. Extensive experiments conducted on two real-world event detection datasets, i.e., the Multi-domain and Multi-modality Event Detection dataset and the MediaEval Social Event Detection 2014 dataset, indicate the effectiveness of the proposed approaches.

6.
Neural Netw ; 130: 1-10, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32589586

ABSTRACT

Activated hidden units in convolutional neural networks (CNNs), known as feature maps, dominate image representation, which is compact and discriminative. For ultra-large datasets, high-dimensional feature maps in float format not only result in high computational complexity, but also occupy massive memory space. To address this, a new image representation by aggregating convolution kernels (ACK) is proposed, where some convolution kernels capturing certain patterns are activated. The top-n index numbers of the convolution kernels are extracted directly as the image representation in discrete integer values, which rebuilds the relationship between convolution kernels and the image. Furthermore, a distance measurement is defined from the perspective of ordered sets to calculate position-sensitive similarities between image representations. Extensive experiments conducted on Oxford Buildings, Paris, Holidays, and other datasets show that the proposed ACK achieves competitive performance on image retrieval with much lower computational cost, outperforming approaches that use feature maps for image representation.
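
The idea of representing an image by the indices of its most activated kernels can be sketched as follows. This assumes one scalar activation score per kernel (the abstract does not say how activations are aggregated), and `rank_weighted_overlap` is a hypothetical position-sensitive similarity invented for illustration; it is not the ordered-set distance defined in the paper.

```python
from typing import List, Sequence


def top_n_kernel_indices(activations: Sequence[float], n: int) -> List[int]:
    """Indices of the n most strongly activated kernels, strongest first."""
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    return order[:n]


def rank_weighted_overlap(a: Sequence[int], b: Sequence[int]) -> float:
    """Hypothetical similarity in [0, 1] between two equal-length index lists.

    A shared index contributes more when it sits at the same rank in both
    lists, so the measure is sensitive to position as well as membership.
    """
    n = len(a)
    if n == 0:
        return 0.0
    rank_in_b = {kernel: rank for rank, kernel in enumerate(b)}
    score = sum(1.0 - abs(rank - rank_in_b[kernel]) / n
                for rank, kernel in enumerate(a) if kernel in rank_in_b)
    return score / n


# Assumed per-kernel activation scores for two images.
image_a = top_n_kernel_indices([0.1, 0.9, 0.3, 0.7], n=2)
image_b = top_n_kernel_indices([0.2, 0.6, 0.1, 0.8], n=2)
print(image_a, image_b, rank_weighted_overlap(image_a, image_b))
```

Each image is stored as a short list of integers rather than a float feature map, which is where the memory and computation savings described in the abstract would come from.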


Subjects
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Pattern Recognition, Automated/methods
7.
Comput Biol Med ; 108: 122-132, 2019 05.
Article in English | MEDLINE | ID: mdl-31003175

ABSTRACT

BACKGROUND: Disease named entity recognition (NER) plays an important role in biomedical research. There are a significant number of challenging issues to be addressed; among these, the identification of rare diseases and complex disease names and the problem of tagging inconsistency (i.e., when an entity is tagged differently within a document) are attracting substantial research attention. METHODS: We propose a new neural network method named Dic-Att-BiLSTM-CRF (DABLC) for disease NER. DABLC applies an efficient exact string matching method to match disease entities with a disease dictionary; here, the dictionary is constructed based on the Disease Ontology. Furthermore, DABLC constructs a dictionary attention layer by incorporating a disease dictionary matching method and a document-level attention mechanism. Finally, a bidirectional long short-term memory network and conditional random field (BiLSTM-CRF) with a dictionary attention layer is proposed to incorporate the disease dictionary into disease NER. RESULTS: Extensive experiments are conducted on two widely used corpora: the NCBI disease corpus and the BioCreative V CDR corpus. Each test is run 10 times per model, and results are reported with a 95% confidence interval. DABLC achieves the highest F1 scores (NCBI: Precision = 0.883, Recall = 0.89, F1 = 0.886; BioCreative V CDR: Precision = 0.891, Recall = 0.875, F1 = 0.883), outperforming the state-of-the-art methods. CONCLUSION: DABLC combines the advantages of both external dictionary resources and deep attention neural networks. This aids the identification of rare diseases and complex disease names; moreover, it reduces the impact of tagging inconsistency. Special disease NER and deep learning models addressing long sentences are noteworthy areas for future examination.
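
The reported F1 scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below uses only the figures quoted in RESULTS; the function name is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


ncbi_f1 = f1_score(0.883, 0.89)   # NCBI disease corpus figures
cdr_f1 = f1_score(0.891, 0.875)   # BioCreative V CDR corpus figures
print(round(ncbi_f1, 3), round(cdr_f1, 3))  # 0.886 0.883
```

Both values are consistent with the F1 scores stated in the abstract.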


Subjects
Data Mining , Deep Learning , Disease , Language , Humans , Terminology as Topic
8.
J Mass Spectrom ; 41(12): 1615-22, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17103492

ABSTRACT

BAPTA-AM is the acetoxymethyl ester of the calcium chelator BAPTA and has demonstrated efficacy in several animal models of cerebral ischemia. This paper describes the development of a method for the determination of BAPTA-AM in rat plasma by liquid chromatography/tandem mass spectrometry. Owing to multiple ester groups in the structure of BAPTA-AM, [M + Na](+) was chosen as the analytical ion for quantification of BAPTA-AM. During the analytical method development, a high percentage of organic solvent and the addition of sodium acetate and formic acid to the mobile phase were found to favor the sensitivity and reproducibility of [M + Na](+). Poor fragmentation is usually observed in the MS/MS spectra of sodium adduct ions. However, abundant and reproducible fragment ions were observed for the BAPTA-AM sodium adduct ion, and therefore the traditional selective reaction-monitoring mode was used to further improve the sensitivity of MS detection. Because of the lability of the ester bond, a combination of fluoride and hydrochloric acid was applied to minimize enzymatic hydrolysis, and acetonitrile was chosen to avoid chemical hydrolysis or solvolysis during the sample collection and preparation procedure. On the basis of these studies, a rapid, sensitive and reproducible method for the determination of BAPTA-AM in rat plasma, using LC/ESI-MS/MS and a simple protein precipitation procedure, was developed and validated. The method was also successfully applied to the determination of BAPTA-AM plasma concentrations for pharmacokinetic studies in rats.


Subjects
Chelating Agents/pharmacokinetics , Chromatography, Liquid/methods , Egtazic Acid/analogs & derivatives , Tandem Mass Spectrometry/methods , Animals , Calibration , Chelating Agents/chemistry , Chromatography, Liquid/standards , Egtazic Acid/blood , Egtazic Acid/chemistry , Egtazic Acid/pharmacokinetics , Nimodipine/blood , Nimodipine/chemistry , Nimodipine/pharmacokinetics , Rats , Rats, Sprague-Dawley , Reproducibility of Results , Tandem Mass Spectrometry/standards
9.
IEEE Trans Pattern Anal Mach Intell ; 37(8): 1723-9, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26353007

ABSTRACT

Sketch matching is a fundamental problem in sketch-based interfaces. After years of study, it remains challenging when hand-drawn sketch shapes exhibit large irregularities and variations. While most existing works exploit topology relations and graph representations for this problem, they are usually limited by coarse topology exploration and heuristic (thus suboptimal) similarity metrics between graphs. We present a new sketch matching method with two novel contributions. We introduce a comprehensive definition of topology relations, which results in a rich and informative graph representation of sketches. For graph matching, we propose a topology product graph that retains the full correspondence for matching two graphs. Based on it, we derive an intuitive sketch similarity metric whose exact solution is easy to compute. In addition, the graph representation and new metric naturally support partial matching, an important practical problem that has received less attention in the literature. Extensive experiments on a challenging real-world dataset show that our method outperforms the state of the art.

10.
Nan Fang Yi Ke Da Xue Xue Bao ; 33(6): 870-3, 2013 Jun.
Article in Chinese | MEDLINE | ID: mdl-23803200

ABSTRACT

OBJECTIVE: To explore the core mechanism of cell cycle compensation using a mathematical model. METHODS: A set of ordinary differential equations was used to describe the interactions between the core cell cycle molecules. Continuous and cyclic changes in the concentrations of these molecules were computed to capture the discrete events of molecular interactions. RESULTS: The calculated molecule concentrations and captured signaling events agreed with the experimental results. CONCLUSION: E2F transcription factor 1 is the pivotal element linking the positive and negative feedbacks and regulating G1/S and G2/M phase compensation.


Subjects
Cell Cycle , E2F1 Transcription Factor , Animals , Drosophila/cytology , Feedback, Physiological , Models, Theoretical
11.
IEEE Trans Pattern Anal Mach Intell ; 33(5): 1009-21, 2011 May.
Article in English | MEDLINE | ID: mdl-20733219

ABSTRACT

Term weighting has proven to be an effective way to improve the performance of text categorization. Very recently, with the development of user-interactive question answering or community question answering, there has emerged a need to accurately categorize questions into predefined categories. However, as a question is usually a piece of short text, can the existing term-weighting methods perform as consistently in question categorization as they do in text categorization? The answer is not clear, since to the best of our knowledge, no work has addressed this problem despite its significance. In this study, we investigate the popular unsupervised and supervised term-weighting methods for question categorization. At the same time, we propose three new supervised term-weighting methods, namely, qf*icf, iqf*qf*icf, and vrf. They are compared with existing unsupervised and supervised term-weighting methods through a series of experiments on question collections from Yahoo! Answers. The experimental results show that iqf*qf*icf achieves the best performance among all term-weighting methods, while qf*icf and vrf are also competitive for question categorization. Meanwhile, tf*OR is proven to be the most significant one among existing methods. In addition, iqf*qf*icf and vrf are also effective for long document categorization.

12.
IEEE Trans Neural Netw ; 22(10): 1532-46, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21824844

ABSTRACT

A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from the classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold. This threshold is required in the classifier for determining the class of the web page and identifying whether the web page is phishing or not. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, Bayes theory is used to synthesize the classification results from textual and visual content. The effectiveness of our proposed approach was examined on a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, the fusion algorithm outperforms either of the individual classifiers, and our model can be adapted to different phishing cases.
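
One simple way to combine two classifier outputs with Bayes' rule is sketched below. It assumes the text and image evidence are conditionally independent given the class and that both classifiers report a posterior probability under the same prior. These assumptions, the function names, and the example numbers are illustrative; this is not the fusion algorithm from the paper.

```python
def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    return p / (1.0 - p)


def fuse_posteriors(p_text: float, p_image: float, prior: float = 0.5) -> float:
    """Fuse two posteriors P(phishing | evidence) under conditional independence.

    Each classifier's odds already include the prior once, so the prior odds
    are divided out once to avoid counting them twice.
    """
    fused_odds = odds(p_text) * odds(p_image) / odds(prior)
    return fused_odds / (1.0 + fused_odds)


# Assumed classifier outputs for one suspicious page.
print(round(fuse_posteriors(0.8, 0.7), 4))
```

When both classifiers lean the same way, the fused probability is more confident than either alone; when one classifier is uninformative (output equal to the prior), the fused value reduces to the other classifier's output.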


Subjects
Algorithms , Artificial Intelligence , Bayes Theorem , Computer Security , Internet/standards , Pattern Recognition, Automated/methods , Crime/prevention & control , Data Mining/methods , Humans , Models, Statistical , Software/standards , Software Validation , Statistics as Topic/methods