Results 1 - 20 of 29
1.
Mol Carcinog ; 63(1): 120-135, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37750589

ABSTRACT

Head and neck squamous cell carcinomas (HNSCC) remain a poorly understood disease clinically and immunologically. HPV is a known risk factor for HNSCC and is associated with better outcome, whereas HPV-negative HNSCC are more heterogeneous in outcome. Gene expression signatures have been developed to classify HNSCC into four molecular subtypes (classical, basal, mesenchymal, and atypical). However, the molecular underpinnings of treatment response and the immune landscape for these molecular subtypes are largely unknown. Herein, we describe a comprehensive immune landscape analysis in three independent HNSCC cohorts (>700 patients) using transcriptomics data. We assigned the HPV-negative HNSCC patients to these four molecular subtypes and characterized the tumor microenvironment using a deconvolution method. We determined that the atypical and mesenchymal subtypes have greater immune enrichment and exhibit a T-cell exhaustion phenotype compared with the classical and basal subtypes. Further analyses revealed different B cell maturation and antibody isotype enrichment patterns, and distinct immune microenvironment crosstalk, in the atypical and mesenchymal subtypes. Taken together, our study suggests that treatments that enhance B cell activity may benefit patients with HNSCC of the atypical subtype. This rationale can be used in the design of future precision immunotherapy trials based on the molecular subtypes of HPV-negative HNSCC.


Subject(s)
Head and Neck Neoplasms , Papillomavirus Infections , Humans , Squamous Cell Carcinoma of Head and Neck/genetics , Human Papillomavirus Viruses , Papillomavirus Infections/complications , Papillomavirus Infections/genetics , Head and Neck Neoplasms/genetics , Immunotherapy , Tumor Microenvironment
2.
Brief Bioinform ; 22(3), 2021 05 20.
Article in English | MEDLINE | ID: mdl-32770181

ABSTRACT

MOTIVATION: To obtain key information for personalized medicine and cancer research, clinicians and researchers in the biomedical field need to search for genomic variant information in the biomedical literature now more than ever before. Because genomic variants are written in many different forms, however, it is difficult to locate the right information when using a general literature search system. To address this difficulty, researchers have suggested various solutions based on automated literature-mining techniques. There is, however, no study that summarizes and compares existing tools for genomic variant literature mining in terms of how easily they support searching the literature for information on genomic variants. RESULTS: In this article, we systematically compare currently available genomic variant recognition and normalization tools, as well as the literature search engines that have adopted these literature-mining techniques. First, we explain the problems caused by the use of non-standard formats of genomic variants in the PubMed literature by considering examples from the literature, and we show the prevalence of the problem. Second, we review literature-mining tools that address the problem by recognizing and normalizing the various forms of genomic variants in the literature, and we systematically compare them. Third, we present and compare existing literature search engines that are designed for genomic variant search using these literature-mining techniques. We expect this work to be helpful for researchers who seek information about genomic variants from the literature and for developers who integrate genomic variant information from the literature and beyond.


Subject(s)
Data Mining , Genetic Variation , Precision Medicine , Search Engine , PubMed , Publications
3.
Nucleic Acids Res ; 49(W1): W352-W358, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33950204

ABSTRACT

Searching and reading relevant literature is a routine practice in biomedical research. However, it is challenging for a user to design optimal search queries using all the keywords related to a given topic. As such, existing search systems such as PubMed often return suboptimal results. Several computational methods have been proposed as an effective alternative to keyword-based query methods for literature recommendation. However, those methods require specialized knowledge in machine learning and natural language processing, which can make them difficult for biologists to utilize. In this paper, we propose LitSuggest, a web server that provides an all-in-one literature recommendation and curation service to help biomedical researchers stay up to date with scientific literature. LitSuggest combines advanced machine learning techniques for suggesting relevant PubMed articles with high accuracy. In addition to innovative text-processing methods, LitSuggest offers multiple advantages over existing tools. First, LitSuggest allows users to curate, organize, and download classification results in a single interface. Second, users can easily fine-tune LitSuggest results by updating the training corpus. Third, results can be readily shared, enabling collaborative analysis and curation of scientific literature. Finally, LitSuggest provides an automated personalized weekly digest of newly published articles for each user's project. LitSuggest is publicly available at https://www.ncbi.nlm.nih.gov/research/litsuggest.


Subject(s)
Publications , Software , COVID-19 , Data Curation , Healthcare Disparities , Humans , Internet , Liver Neoplasms/epidemiology , Machine Learning
4.
Bioinformatics ; 37(20): 3681-3683, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33901274

ABSTRACT

SUMMARY: The heterogeneous cell types of the tumor-immune microenvironment (TIME) play key roles in determining cancer progression, metastasis and response to treatment. We report the development of TIMEx, a novel TIME deconvolution method that emphasizes the estimation of infiltrating immune cells from bulk transcriptomics using pan-cancer single-cell RNA-seq signatures. We also implemented a comprehensive, user-friendly web portal for users to evaluate TIMEx and other deconvolution methods with bulk transcriptomic profiles. AVAILABILITY AND IMPLEMENTATION: The TIMEx web portal is freely accessible at http://timex.moffitt.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
PLoS Comput Biol ; 16(4): e1007617, 2020 04.
Article in English | MEDLINE | ID: mdl-32324731

ABSTRACT

A massive number of biological entities, such as genes and mutations, are mentioned in the biomedical literature. Capturing the semantic relatedness of biological entities is vital to many biological applications, such as protein-protein interaction prediction and literature-based discovery. Concept embeddings, which involve learning vector representations of concepts using machine learning models, have been employed to capture the semantics of concepts. To develop concept embeddings, named-entity recognition (NER) tools are first used to identify and normalize concepts from the literature, and then different machine learning models are used to train the embeddings. Despite multiple attempts, existing biomedical concept embeddings generally suffer from suboptimal NER tools, small-scale evaluation, and limited availability. In response, we employed high-performance machine learning-based NER tools for concept recognition and trained our concept embeddings, BioConceptVec, via four different machine learning models on ~30 million PubMed abstracts. BioConceptVec covers over 400,000 biomedical concepts mentioned in the literature and is one of the largest among the publicly available biomedical concept embeddings to date. To evaluate the validity and utility of BioConceptVec, we performed two intrinsic evaluations (identifying related concepts based on drug-gene and gene-gene interactions) and two extrinsic evaluations (protein-protein interaction prediction and drug-drug interaction extraction), collectively using over 25 million instances from nine independent datasets (17 million instances from six intrinsic evaluation tasks and 8 million instances from three extrinsic evaluation tasks), which is, to the best of our knowledge, by far the most comprehensive evaluation. The intrinsic evaluation results demonstrate that BioConceptVec consistently performs better, by a large margin, than existing concept embeddings in identifying similar and related concepts. More importantly, the extrinsic evaluation results demonstrate that using BioConceptVec with advanced deep learning models can significantly improve performance in downstream bioinformatics studies and biomedical text-mining applications. Our BioConceptVec embeddings and benchmarking datasets are publicly available at https://github.com/ncbi-nlp/BioConceptVec.
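
The intrinsic evaluation described above (identifying similar and related concepts) is commonly done by comparing embedding vectors with cosine similarity. The following is a minimal sketch of that idea only; the concept names and the three-dimensional vectors are hypothetical toy values, and the code does not use or reproduce the BioConceptVec data or its file format.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real concept embeddings have far more dimensions.
toy_embeddings = {
    "Gene_A": [0.9, 0.1, 0.0],
    "Drug_B": [0.8, 0.2, 0.1],
    "Disease_C": [0.0, 0.1, 0.9],
}


def most_similar(query, embeddings):
    """Rank every other concept by cosine similarity to the query concept."""
    q = embeddings[query]
    scores = [
        (name, cosine_similarity(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Calling `most_similar("Gene_A", toy_embeddings)` ranks the other two toy concepts by how closely their vectors point in the same direction as the query vector.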


Subject(s)
Computational Biology/methods , Data Mining/methods , Deep Learning , Publications , Algorithms , Databases, Protein , Drug Interactions , Electronic Health Records , Humans , Protein Interaction Mapping , PubMed , Semantics
6.
Nucleic Acids Res ; 46(W1): W530-W536, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29762787

ABSTRACT

The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and the high cost of expert curation, curated variant information in existing databases is often incomplete and out of date. In addition, the same genetic variant can be mentioned in publications under various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.
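
The naming problem described above can be illustrated with a small sketch that recognizes the written form of a variant mention and expands one name into all names sharing the same identifier. This is not LitVar's method or API; the regular expressions are simplified, and the lookup table is a hypothetical stand-in that contains only the three equivalent names given in the abstract.

```python
import re

# Simplified patterns for three common written forms (assumptions, not a full grammar).
PATTERNS = [
    ("rsid", re.compile(r"^rs\d+$")),                   # e.g. rs121913527
    ("cdna", re.compile(r"^c\.\d+[ACGT]>[ACGT]$")),     # e.g. c.436G>A
    ("protein", re.compile(r"^(?:p\.)?[A-Z]\d+[A-Z]$")),  # e.g. A146T
]

# Hypothetical lookup table mapping each surface form to one normalized identifier.
NORMALIZED_ID = {
    "A146T": "rs121913527",
    "c.436G>A": "rs121913527",
    "rs121913527": "rs121913527",
}


def classify_mention(mention):
    """Return the written form of a variant mention, or 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.match(mention):
            return label
    return "unknown"


def expand_query(mention):
    """Return every known name that shares the mention's normalized identifier."""
    target = NORMALIZED_ID.get(mention)
    if target is None:
        return [mention]  # nothing known: search the name as given
    return sorted(name for name, ident in NORMALIZED_ID.items() if ident == target)
```

A search front end could pass the output of `expand_query` to a keyword search so that a query for one name also retrieves articles that use the other names.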


Subject(s)
Data Curation/methods , Data Mining/methods , Polymorphism, Single Nucleotide , Search Engine , User-Computer Interface , Genetics, Medical , Genome, Human , Genomics/methods , Humans , Internet , PubMed , Semantics
7.
PLoS Comput Biol ; 14(8): e1006390, 2018 08.
Article in English | MEDLINE | ID: mdl-30102703

ABSTRACT

Manually curating biomedical knowledge from publications is necessary to build a knowledge-based service that provides highly precise and organized information to users. The process of retrieving relevant publications for curation, also known as document triage, is usually carried out by querying and reading articles in PubMed. However, this query-based method often yields unsatisfactory precision and recall, and it is difficult to manually generate optimal queries. To address this, we propose a machine learning-assisted triage method. We collect previously curated publications from two databases, UniProtKB/Swiss-Prot and the NHGRI-EBI GWAS Catalog, and use them as a gold-standard dataset for training deep learning models based on convolutional neural networks. We then use the trained models to classify and rank new publications for curation. For evaluation, we apply our method to the real-world manual curation process of UniProtKB/Swiss-Prot and the GWAS Catalog. We demonstrate that our machine-assisted triage method outperforms the current query-based triage methods, improves efficiency, and enriches curated content. Our method achieves a precision 1.81 and 2.99 times higher than that obtained by the current query-based triage methods of UniProtKB/Swiss-Prot and the GWAS Catalog, respectively, without compromising recall. In fact, our method retrieves many additional relevant publications that the query-based method of UniProtKB/Swiss-Prot could not find. As these results show, our machine learning-based method can make the triage process more efficient and is being implemented in production so that human curators can focus on more challenging tasks to improve the quality of knowledge bases.


Subject(s)
Data Curation/methods , Information Storage and Retrieval/methods , Data Curation/statistics & numerical data , Databases, Genetic , Databases, Protein , Deep Learning , Genomics , Knowledge Bases , Machine Learning , Publications
8.
Nucleic Acids Res ; 45(D1): D784-D789, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899563

ABSTRACT

Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules: The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB is a knowledgebase of 1066 manually curated fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/.


Subject(s)
Data Mining , Databases, Genetic , Neoplasms/genetics , Oncogene Proteins, Fusion/genetics , Transcriptome , Computational Biology/methods , Gene Expression Profiling/methods , Humans , Software , User-Computer Interface
9.
BMC Bioinformatics ; 19(1): 21, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29368597

ABSTRACT

BACKGROUND: Molecular biomarkers that can predict drug efficacy in cancer patients are crucial components for the advancement of precision medicine. However, identifying these molecular biomarkers remains a laborious and challenging task. Next-generation sequencing of patients and preclinical models has increasingly led to the identification of novel gene-mutation-drug relations, and these results have been reported and published in the scientific literature. RESULTS: Here, we present two new computational methods that utilize all the PubMed articles as domain-specific background knowledge to assist in the extraction and curation of gene-mutation-drug relations from the literature. The first method uses the Biomedical Entity Search Tool (BEST) scoring results as some of the features to train the machine learning classifiers. The second method uses not only the BEST scoring results, but also word vectors in a deep convolutional neural network model that are constructed from and trained on numerous documents such as PubMed abstracts and Google News articles. Using the features obtained from both the BEST search engine scores and word vectors, we extract mutation-gene and mutation-drug relations from the literature using machine learning classifiers such as random forest and deep convolutional neural networks. Our methods achieved better results than the state-of-the-art methods. We used our proposed features in a simple machine learning model and obtained F1-scores of 0.96 and 0.82 for mutation-gene and mutation-drug relation classification, respectively. We also developed a deep learning classification model using convolutional neural networks, BEST scores, and word embeddings pre-trained on PubMed or Google News data. Using deep learning, the classification accuracy improved, and F1-scores of 0.96 and 0.86 were obtained for the mutation-gene and mutation-drug relations, respectively. CONCLUSION: We believe that the computational methods described in this research could be used as an important tool in identifying molecular biomarkers that predict drug responses in cancer patients. We also built a database of the mutation-gene-drug relations that were extracted from all the PubMed abstracts. We believe that our database can prove to be a valuable resource for precision medicine researchers.
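
For readers unfamiliar with the F1-scores reported above, the following sketch shows how an F1-score is computed from a classifier's true positives, false positives and false negatives. The function is the standard definition (harmonic mean of precision and recall); the counts used to exercise it are hypothetical and are not taken from the study.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_pos + false_pos   # everything the classifier labeled positive
    actual = true_pos + false_neg      # everything that is truly positive
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a relation classifier: 8 correct relations found,
# 2 spurious relations reported, 2 true relations missed.
example_f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)
```

Because F1 is a harmonic mean, it stays low when either precision or recall is low, which is why it is often preferred to accuracy for relation extraction, where negative examples usually dominate.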


Subject(s)
Drug Resistance, Neoplasm/genetics , Search Engine , Antineoplastic Agents/therapeutic use , Databases, Factual , Humans , Mutation , Neoplasms/drug therapy , Neoplasms/genetics , Neoplasms/pathology , Neural Networks, Computer , Precision Medicine
10.
Bioinformatics ; 32(18): 2886-8, 2016 09 15.
Article in English | MEDLINE | ID: mdl-27485446

ABSTRACT

UNLABELLED: We introduce HiPub, a seamless Chrome browser plug-in that automatically recognizes, annotates and translates biomedical entities from texts into networks for knowledge discovery. Using a combination of two different named-entity recognition resources, HiPub can recognize genes, proteins, diseases, drugs, mutations and cell lines in texts, and achieves high precision and recall. HiPub extracts biomedical entity-relationships from texts to construct context-specific networks, and integrates existing network data from external databases for knowledge discovery. It allows users to add entities from related articles, as well as user-defined entities, for discovering new and unexpected entity-relationships. HiPub provides functional enrichment analysis on the biomedical entity network, and link-outs to external resources to assist users in learning new entities and relations. AVAILABILITY AND IMPLEMENTATION: HiPub and a detailed user guide are available at http://hipub.korea.ac.kr. CONTACT: kangj@korea.ac.kr, aikchoon.tan@ucdenver.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Curation , Databases, Factual , Pattern Recognition, Automated , Algorithms , Computational Biology/methods , Genes , Humans , Pharmaceutical Preparations , Proteins , PubMed , Search Engine
11.
Bioinformatics ; 31(18): 3069-71, 2015 Sep 15.
Article in English | MEDLINE | ID: mdl-25990557

ABSTRACT

UNLABELLED: We report the creation of the Drug Signatures Database (DSigDB), a new gene set resource that relates drugs/compounds and their target genes, for gene set enrichment analysis (GSEA). DSigDB currently holds 22 527 gene sets, consisting of 17 389 unique compounds covering 19 531 genes. We also developed an online DSigDB resource that allows users to search, view and download drugs/compounds and gene sets. DSigDB gene sets integrate seamlessly with GSEA software for linking gene expression with drugs/compounds for drug repurposing and translational research. AVAILABILITY AND IMPLEMENTATION: DSigDB is freely available for non-commercial use at http://tanlab.ucdenver.edu/DSigDB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: aikchoon.tan@ucdenver.edu.


Subject(s)
Computational Biology/methods , Databases, Pharmaceutical , Gene Expression Profiling , Gene Expression Regulation , Lung Neoplasms/genetics , Protein Kinase Inhibitors/pharmacology , Software , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Drug Repositioning , ErbB Receptors/antagonists & inhibitors , ErbB Receptors/genetics , Humans , Lung Neoplasms/drug therapy , Mutation/genetics
12.
BMC Med Imaging ; 16: 23, 2016 Mar 12.
Article in English | MEDLINE | ID: mdl-26968938

ABSTRACT

BACKGROUND: Facial palsy or paralysis (FP) is the loss of voluntary muscle movement on one side of the face, which can be devastating for patients. Traditional assessment methods depend solely on the clinician's judgment and are therefore time-consuming and subjective. Hence, a quantitative assessment system would be invaluable for physicians beginning the rehabilitation process; producing a reliable and robust method is challenging, and work is still underway. METHODS: We introduce a novel approach for the quantitative assessment of facial paralysis that tackles the classification of FP type and degree of severity. Specifically, we present an algorithm that extracts the human iris and detects facial landmarks, and a hybrid approach combining rule-based and machine learning algorithms to analyze and prognosticate facial paralysis using the captured images. A method combining the optimized Daugman's algorithm and the Localized Active Contour (LAC) model is proposed to efficiently extract the iris and facial landmarks or key points. To improve the performance of LAC, appropriate parameters of the initial evolving curve for facial feature segmentation are automatically selected. The symmetry score is measured by the ratio between features extracted from the two sides of the face. Hybrid classifiers (i.e. rule-based with regularized logistic regression) were employed for discriminating healthy and unhealthy subjects, for FP type classification, and for facial paralysis grading based on the House-Brackmann (H-B) scale. RESULTS: Quantitative analysis was performed to evaluate the performance of the proposed approach. Experiments show that the proposed method is efficient. CONCLUSIONS: Facial movement feature extraction from facial images, based on iris segmentation and LAC-based key point detection together with a hybrid classifier, provides a more efficient way of addressing the classification of facial palsy type and degree of severity. Combining iris segmentation with the key point-based method has several merits that are essential for our real application. Aside from the facial key points, iris segmentation makes a significant contribution because it describes the changes in iris exposure during some facial expressions. It reveals the significant difference between the healthy side and the severe palsy side when raising the eyebrows with both eyes directed upward, and it can model the typical changes in the iris region.
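
The abstract states only that the symmetry score is a ratio between features extracted from the two sides of the face. The sketch below shows one plausible way to compute such a ratio; dividing the smaller value by the larger one (so the score lies between 0 and 1) and the example feature values are assumptions made here for illustration, not details taken from the paper.

```python
def symmetry_score(left_value, right_value):
    """Ratio of the smaller to the larger measurement; 1.0 means perfect symmetry.

    The min/max orientation is an assumption so that the score does not
    depend on which side of the face is affected.
    """
    if left_value < 0 or right_value < 0:
        raise ValueError("feature measurements must be non-negative")
    larger = max(left_value, right_value)
    if larger == 0:
        return 1.0  # both sides show no movement: treat as symmetric
    return min(left_value, right_value) / larger


def mean_symmetry(feature_pairs):
    """Average the symmetry score over several (left, right) feature pairs."""
    scores = [symmetry_score(left, right) for left, right in feature_pairs]
    return sum(scores) / len(scores)


# Hypothetical measurements, e.g. eyebrow lift and mouth-corner displacement in pixels.
example_pairs = [(10.0, 10.0), (5.0, 10.0)]
example_mean = mean_symmetry(example_pairs)
```

Scores like these could then serve as input features to a classifier such as the regularized logistic regression mentioned in the abstract.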


Subject(s)
Facial Paralysis/physiopathology , Iris/pathology , Algorithms , Facial Paralysis/diagnosis , Humans , Image Interpretation, Computer-Assisted/methods
13.
Bioinformatics ; 30(1): 135-6, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24149052

ABSTRACT

SUMMARY: Biomedical Entity-Relationship eXplorer (BEReX) is a new biomedical knowledge integration, search and exploration tool. BEReX integrates eight popular databases (STRING, DrugBank, KEGG, PharmGKB, BioGRID, GO, HPRD and MSigDB) and delineates an integrated network by combining the information available from these databases. Users search the integrated network by entering keywords, and BEReX returns a sub-network matching the keywords. The resulting graph can be explored interactively. BEReX allows users to find the shortest paths between two remote nodes; find the most relevant drugs, diseases, pathways and so on related to the current network; expand the network by particular types of entities and relations; and modify the network by removing or adding selected nodes. BEReX is implemented as a standalone Java application. AVAILABILITY AND IMPLEMENTATION: BEReX and a detailed user guide are available for download at our project Web site (http://infos.korea.ac.kr/berex).
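
Finding a shortest path between two nodes of an integrated network, as mentioned above, can be sketched with a breadth-first search over an unweighted graph. The abstract does not say which algorithm BEReX uses, so breadth-first search is an assumption here, and the node names and edges are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy network stored as an undirected adjacency list.
toy_network = {
    "drug_X": ["gene_A"],
    "gene_A": ["drug_X", "gene_B", "pathway_P"],
    "gene_B": ["gene_A", "disease_D"],
    "pathway_P": ["gene_A"],
    "disease_D": ["gene_B"],
}
```

Breadth-first search visits nodes in order of distance from the start, so the first time the goal is reached the path found has the fewest possible edges.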


Subject(s)
User-Computer Interface , Algorithms , Biomedical Technology , Computational Biology/methods , Databases, Factual , Humans , Neural Networks, Computer
14.
BMC Med Inform Decis Mak ; 13 Suppl 1: S7, 2013.
Article in English | MEDLINE | ID: mdl-23566263

ABSTRACT

BACKGROUND: Most previous Protein-Protein Interaction (PPI) studies evaluated their algorithms' performance based on "per-instance" precision and recall, in which the instances of an interaction relation were evaluated independently. However, we argue that this standard evaluation method should be revisited. In a large corpus, the same relation can be described in many different forms and, in practice, correctly identifying a small subset of them, rather than all of them, would often suffice to detect the given interaction. METHODS: In this regard, we propose a more pragmatic "per-relation" performance evaluation method instead of the conventional per-instance method. In the per-relation method, only a subset of a relation's instances needs to be correctly identified for the relation to count as positive. In this work, we also introduce a new high-precision rule-based PPI extraction algorithm. While virtually all current PPI extraction studies focus on improving F-score, aiming to balance precision and recall, in many realistic scenarios involving large corpora one can benefit more from a high-precision algorithm than from a high-recall counterpart. RESULTS: We show that our algorithm not only achieves better per-relation performance than previous solutions but also serves as a good complement to existing PPI extraction tools. Our algorithm improves the performance of the existing tools through simple pipelining. CONCLUSION: The significance of this research is that it brings a new perspective to the performance evaluation of PPI extraction studies, which we believe is more important in practice than existing evaluation criteria. Given the new evaluation perspective, we also showed the importance of a high-precision extraction tool and validated the efficacy of our rule-based system as a candidate high-precision tool.
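
The difference between the two evaluation schemes described above can be made concrete with a small sketch. It compares per-instance recall with per-relation recall on a toy example; the data are hypothetical, and counting a relation as detected after a single correct instance (`min_hits=1`) is an assumption, since the abstract says only that a subset of instances needs to be identified.

```python
def per_instance_recall(gold, predicted):
    """Fraction of gold (relation, instance_id) pairs that were predicted."""
    if not gold:
        return 0.0
    return len(gold & predicted) / len(gold)


def per_relation_recall(gold, predicted, min_hits=1):
    """Fraction of gold relations with at least min_hits correctly found instances."""
    gold_relations = {relation for relation, _ in gold}
    if not gold_relations:
        return 0.0
    hits = {}
    for relation, _ in gold & predicted:
        hits[relation] = hits.get(relation, 0) + 1
    detected = sum(1 for relation in gold_relations if hits.get(relation, 0) >= min_hits)
    return detected / len(gold_relations)


# Hypothetical corpus: relation "A-B" is stated in three sentences, "C-D" in one.
gold_instances = {("A-B", 1), ("A-B", 2), ("A-B", 3), ("C-D", 1)}
# A high-precision extractor that finds only some of the sentences.
predicted_instances = {("A-B", 2), ("C-D", 1)}
```

On this toy data the extractor recovers half of the individual instances but every distinct relation, which is the situation the per-relation view is meant to reward.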


Subject(s)
Computational Biology/standards , Decision Support Techniques , Information Storage and Retrieval/methods , Protein Interaction Mapping/standards , Humans , Natural Language Processing , Pattern Recognition, Automated
15.
NPJ Precis Oncol ; 7(1): 68, 2023 Jul 18.
Article in English | MEDLINE | ID: mdl-37464050

ABSTRACT

Preclinical genetically engineered mouse models (GEMMs) of lung adenocarcinoma are invaluable for investigating molecular drivers of tumor formation, progression, and therapeutic resistance. However, histological analysis of these GEMMs requires significant time and training to ensure accuracy and consistency. To achieve a more objective and standardized analysis, we used machine learning to create GLASS-AI, a histological image analysis tool that the broader cancer research community can utilize to grade, segment, and analyze tumors in preclinical models of lung adenocarcinoma. GLASS-AI demonstrates strong agreement with expert human raters while uncovering a significant degree of unreported intratumor heterogeneity. Integrating immunohistochemical staining with high-resolution grade analysis by GLASS-AI identified dysregulation of Mapk/Erk signaling in high-grade lung adenocarcinomas and locally advanced tumor regions. Our work demonstrates the benefit of employing GLASS-AI in preclinical lung adenocarcinoma models and the power of integrating machine learning and molecular biology techniques for studying the molecular pathways that underlie cancer progression.

16.
BMC Med Inform Decis Mak ; 12 Suppl 1: S7, 2012 Apr 30.
Article in English | MEDLINE | ID: mdl-22595092

ABSTRACT

BACKGROUND: Many academic search solutions exist, and most of them sit at one of two ends of a spectrum: general-purpose search systems and domain-specific "deep" search systems. General-purpose search systems, such as PubMed, offer a flexible query interface but return a list of matching documents that users must go through to find the answers to their queries. On the other hand, "deep" search systems, such as PPI Finder and iHOP, return precompiled results in a structured way. Their results, however, are often found only within some predefined contexts. To alleviate these problems, we introduce a new search engine, BOSS (Biomedical Object Search System). METHODS: Unlike conventional search systems, BOSS indexes segments rather than documents. A segment refers to a Maximal Coherent Semantic Unit (MCSU), such as a phrase, clause or sentence, that is semantically coherent in the given context (e.g., biomedical objects or their relations). For a user query, BOSS finds all matching segments, identifies the objects appearing in those segments, and aggregates the segments for each object. Finally, it returns the ranked list of objects along with their matching segments. RESULTS: The working prototype of BOSS is available at http://boss.korea.ac.kr. The current version of BOSS has indexed the abstracts of more than 20 million articles published during the 16 years from 1996 to 2011 across all science disciplines. CONCLUSION: BOSS fills the gap between the two ends of the spectrum by allowing users to pose context-free queries and by returning a structured set of results. Furthermore, BOSS scales well, just as conventional document search engines do, because it is designed to use a standard document-indexing model with minimal modifications. Given these features, BOSS raises the technological level of traditional solutions for searching biomedical information.
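
The query flow described in METHODS (find matching segments, identify the objects in them, aggregate segments per object, return a ranked list) can be sketched in a few lines. This is only an illustration of that flow, not the BOSS implementation: the segments and their object annotations are hypothetical, matching is plain substring search, and ranking by number of matching segments is an assumption.

```python
from collections import defaultdict

# Hypothetical pre-annotated segments: (segment text, objects mentioned in it).
SEGMENTS = [
    ("BRCA1 interacts with BARD1", ["BRCA1", "BARD1"]),
    ("BRCA1 mutations raise breast cancer risk", ["BRCA1", "breast cancer"]),
    ("TP53 is mutated in many tumors", ["TP53"]),
]


def search_objects(query, segments):
    """Return (object, matching segments) pairs, most-supported object first."""
    terms = query.lower().split()
    segments_by_object = defaultdict(list)
    for text, objects in segments:
        lowered = text.lower()
        # A segment matches when it contains every query term.
        if all(term in lowered for term in terms):
            for obj in objects:
                segments_by_object[obj].append(text)
    # Assumed ranking: more matching segments first, ties broken by name.
    return sorted(segments_by_object.items(), key=lambda item: (-len(item[1]), item[0]))
```

For the query `"brca1"`, the result lists BRCA1 first with its two supporting segments, followed by the objects that co-occur with it, so the user sees structured results instead of a flat document list.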


Subject(s)
Biomedical Enhancement , Search Engine/methods , Semantics , Abstracting and Indexing/standards , Abstracting and Indexing/statistics & numerical data , Abstracting and Indexing/trends , Humans , Reproducibility of Results
17.
JCO Clin Cancer Inform ; 6: e2100129, 2022 05.
Article in English | MEDLINE | ID: mdl-35623021

ABSTRACT

PURPOSE: Liver cancer is a global challenge, and disparities exist across multiple domains and throughout the disease continuum. However, liver cancer's global epidemiology and etiology are shifting, and the literature is rapidly evolving, presenting a challenge to the synthesis of knowledge needed to identify areas of research need and to develop research agendas focusing on disparities. Machine learning (ML) techniques can be used to semiautomate the literature review process and improve efficiency. In this study, we detail our approach and provide practical benchmarks for the development of an ML approach to classify literature and extract data at the intersection of three fields: liver cancer, health disparities, and epidemiology. METHODS: We performed a six-phase process including training (I), validating (II), confirming (III), and performing error analysis (IV) for an ML classifier. We then developed an extraction model (V) and applied it (VI) to the liver cancer literature identified through PubMed. We present precision, recall, F1, and accuracy metrics for the classifier and extraction models as appropriate for each phase of the process. We also provide the results of applying our extraction model. RESULTS: With limited training data, we achieved a high degree of accuracy for both our classifier and our extraction model on liver cancer disparities research literature that used epidemiologic methods. The disparities concept was the most challenging to classify accurately, and concepts that appeared infrequently in our data set were the most difficult to extract. CONCLUSION: We provide a roadmap for using ML to classify and extract comprehensive information from multidisciplinary literature. Our technique can be adapted and modified for other cancers or diseases where disparities persist.


Subject(s)
Liver Neoplasms , Machine Learning , Humans , Liver Neoplasms/diagnosis , Liver Neoplasms/epidemiology , Liver Neoplasms/therapy
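
The abstract for record 17 reports precision, recall, F1, and accuracy for its classifier and extraction models but does not give the counts behind them. As a minimal sketch of how these four metrics are conventionally derived from a binary confusion matrix, the following may help; the `Counts` class and `classification_metrics` function are hypothetical names introduced here for illustration, not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # relevant articles correctly flagged
    fp: int  # irrelevant articles incorrectly flagged
    fn: int  # relevant articles missed
    tn: int  # irrelevant articles correctly rejected


def classification_metrics(c: Counts) -> dict:
    """Return precision, recall, F1, and accuracy; zero denominators yield 0.0."""
    flagged = c.tp + c.fp
    relevant = c.tp + c.fn
    total = c.tp + c.fp + c.fn + c.tn

    precision = c.tp / flagged if flagged else 0.0
    recall = c.tp / relevant if relevant else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (c.tp + c.tn) / total if total else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```

Reporting all four together is useful when the relevant class is rare, since accuracy alone can look high even when recall is poor.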
18.
Nat Commun ; 13(1): 614, 2022 02 01.
Article in English | MEDLINE | ID: mdl-35105868

ABSTRACT

Distinct lung stem cells give rise to lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC). ΔNp63, a p63 isoform and p53 family member, guides the maturation of these stem cells by regulating their self-renewal and terminal differentiation; however, the mechanistic role of ΔNp63 in lung cancer development has remained elusive. Using a ΔNp63-specific conditional knockout mouse model and xenograft models of LUAD and LUSC, we found that ΔNp63 promotes non-small cell lung cancer by maintaining in quiescence the lung stem cells necessary for lung cancer initiation and progression. ChIP-seq analysis of lung basal cells, alveolar type 2 (AT2) cells, and LUAD reveals robust ΔNp63 regulation of a common landscape of enhancers of cell identity genes. Importantly, one of these genes, BCL9L, is among the enhancer-associated genes regulated by ΔNp63 in Kras-driven LUAD and mediates the oncogenic effects of ΔNp63 in both LUAD and LUSC. Accordingly, high BCL9L levels correlate with poor prognosis in LUAD patients. Taken together, our findings establish a unifying oncogenic role for ΔNp63 in both LUAD and LUSC through the regulation of a common landscape of enhancer-associated genes.


Subject(s)
Carcinoma, Non-Small-Cell Lung/genetics , Gene Expression Regulation, Neoplastic , Lung Neoplasms/genetics , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/pathology , Animals , Carcinoma, Squamous Cell/genetics , Carcinoma, Squamous Cell/pathology , Cell Line, Tumor , Cell Proliferation , Epithelium , Female , Humans , Lung/pathology , Lung Neoplasms/pathology , Male , Mice , Mice, Knockout
19.
Front Artif Intell ; 4: 754641, 2021.
Article in English | MEDLINE | ID: mdl-34568816

ABSTRACT

The tumor immune microenvironment (TIME) encompasses many heterogeneous cell types that engage in extensive crosstalk among the cancer, immune, and stromal components. The spatial organization of these cell types in the TIME could serve as a biomarker for predicting drug response, prognosis, and metastasis. Recently, deep learning approaches have been widely applied to digital histopathology images for cancer diagnosis and prognosis. Furthermore, some recent approaches have attempted to integrate spatial and molecular omics data to better characterize the TIME. In this review, we focus on machine learning-based digital histopathology image analysis methods for characterizing the tumor ecosystem. We consider three scales of histopathological analysis at which machine learning can operate: whole slide image (WSI)-level, region of interest (ROI)-level, and cell-level. We systematically review the machine learning methods at these three scales, with a focus on cell-level analysis. We offer a perspective on a workflow for generating cell-level training data sets that uses immunohistochemistry markers to "weakly label" the cell types. We describe some common data preparation steps in this workflow, as well as some limitations of the approach. Finally, we discuss future opportunities for integrating molecular omics data with digital histopathology images to characterize the tumor ecosystem.

20.
Top Stroke Rehabil ; 28(2): 81-87, 2021 03.
Article in English | MEDLINE | ID: mdl-32482159

ABSTRACT

BACKGROUND: Accurate prediction of fall likelihood is advantageous for instituting fall prevention programs in rehabilitation facilities. OBJECTIVE: This study was designed to determine the clinical measures that can predict the risk of fall events in a rehabilitation hospital. METHODS: Medical records of 166 patients (114 males and 52 females) who were hospitalized in an adult inpatient unit of a rehabilitation hospital were retrospectively analyzed. As predictor variables for assessing fall risk, demographic data and the following measurements were selectively collected from patients' medical records: Tinetti Performance-Oriented Mobility Assessment-Ambulation (POMA-G), Timed Up and Go test (TUG), 10 m walk test, 2 min walk test, Korean version of the Mini-Mental State Examination (K-MMSE), Korean version of the Modified Barthel Index (KMBI), Berg Balance Scale (BBS), Global Deterioration Scale (GDS), and Morse Fall Scale (Morse FS). RESULTS: The Morse FS, TUG, and age were found to be risk factors for the classification of faller and non-faller groups. CONCLUSION: This study suggests including Morse FS, TUG, and age in the routine initial assessment upon admission to a rehabilitation setting as key variables for screening fall risk. Additionally, the cutoff scores of Morse FS and TUG were observed to be more stringent than those in other clinical settings.


Subject(s)
Accidental Falls/statistics & numerical data , Stroke Rehabilitation , Stroke/complications , Accidental Falls/prevention & control , Adult , Aged , Aged, 80 and over , Female , Hospitalization , Humans , Incidence , Male , Mental Status and Dementia Tests , Middle Aged , Postural Balance , Retrospective Studies , Risk Factors , Sensitivity and Specificity , Stroke/physiopathology , Stroke/psychology , Time and Motion Studies , Walking , Young Adult
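
The abstract for record 20 refers to cutoff scores for the Morse FS and TUG without stating their values. As a rough sketch of how a candidate cutoff is usually evaluated against observed falls, sensitivity and specificity can be computed as below. The function name and its inputs are hypothetical, and the sketch assumes that a higher score indicates higher fall risk; it does not reproduce the study's analysis or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(scores: Sequence[float],
                            fell: Sequence[bool],
                            cutoff: float) -> Tuple[float, float]:
    """Evaluate a screening cutoff (hypothetical helper for illustration).

    A patient screens positive when score >= cutoff, on the assumption
    that higher scores mean higher fall risk.
    """
    if len(scores) != len(fell):
        raise ValueError("scores and fell must have the same length")

    tp = sum(1 for s, f in zip(scores, fell) if f and s >= cutoff)
    fn = sum(1 for s, f in zip(scores, fell) if f and s < cutoff)
    tn = sum(1 for s, f in zip(scores, fell) if not f and s < cutoff)
    fp = sum(1 for s, f in zip(scores, fell) if not f and s >= cutoff)

    # Sensitivity: share of fallers flagged; specificity: share of non-fallers cleared.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

Lowering the cutoff flags more patients, which generally raises sensitivity at the cost of specificity, so the choice depends on how costly a missed faller is in a given setting.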