1.
J Biomed Inform ; 104: 103399, 2020 04.
Article in English | MEDLINE | ID: mdl-32151769

ABSTRACT

OBJECTIVE: The centrality of data to biomedical research is difficult to overstate, and the same is true of the importance of the biomedical literature in disseminating empirical findings on scientific questions asked of such data. However, the connections between the literature and related datasets are often weak, hampering the ability of scientists to move easily between existing datasets and existing findings to derive new scientific hypotheses. This work aims to recommend relevant literature articles for datasets, with the ultimate goal of increasing the productivity of researchers. Our approach to literature recommendation for datasets is part of the dataset reusability platform developed at the University of Texas Health Science Center at Houston for datasets related to gene expression. This platform incorporates datasets from the Gene Expression Omnibus (GEO). An average of 34 datasets were added to GEO daily in the five years from 2014 to 2018, demonstrating the need for automatic methods to connect these datasets with relevant literature. The relevant literature for a given dataset may describe that dataset, report a scientific finding based on that dataset, or describe prior and related work on the dataset's topic that is of interest to users of the dataset.
MATERIALS AND METHODS: We adopt an information retrieval paradigm for literature recommendation. In our experiments, distributional semantic features are created from the titles and abstracts of MEDLINE articles. Related articles are then identified for datasets in GEO. We evaluate multiple distributional methods, including TF-IDF, BM25, Latent Semantic Analysis, Latent Dirichlet Allocation, word2vec, and doc2vec. The most similar papers are recommended for each dataset using cosine similarity between the dataset's vector representation and every paper's vector representation. We also propose several novel re-ranking and normalization methods over embeddings to improve the recommendations.
RESULTS: The top-performing literature recommendation technique, BM25, achieved a strict precision at 10 of 0.8333 and a partial precision at 10 of 0.9000 in a manual evaluation of 36 datasets. Evaluation on a larger, automatically collected benchmark shows small but consistent gains from emphasizing the similarity of dataset and article titles.
CONCLUSION: This work is the first step toward developing a literature recommendation tool that recommends relevant literature for datasets. We hope this will lead to a better data reuse experience.
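
The retrieval step described in this abstract (vectorize texts, rank articles by cosine similarity to the dataset, score the top results with precision at k) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain TF-IDF weighting rather than BM25 or the re-ranking methods mentioned above, and the function names, tokenizer, and toy texts are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for illustration).
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(dataset_text, articles, k=10):
    """Return indices of the k articles most similar to the dataset text."""
    vectors = tfidf_vectors([dataset_text] + articles)
    query, article_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(article_vecs)]
    # Highest similarity first; ties broken by original index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that are in the relevant set."""
    return sum(1 for idx in ranked[:k] if idx in relevant) / k


# Hypothetical toy data, not taken from GEO or MEDLINE.
articles = [
    "hypoxia induced gene expression in breast cancer cell lines",
    "a compartmental model of epidemic spread",
    "time course gene expression under hypoxia in prostate cancer",
]
dataset = "gene expression time course hypoxia cancer"
ranked = recommend(dataset, articles, k=2)
print(ranked, precision_at_k(ranked, relevant={0, 2}, k=2))
```

In this toy case the epidemic-model article shares no terms with the dataset description, so it falls outside the top two. A reported figure such as a precision at 10 of 0.8333 would correspond to averaging this per-dataset score over all evaluated datasets.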


Subject(s)
Biomedical Research , Information Storage and Retrieval , Gene Expression , Humans , Publications , Semantics
2.
Infect Dis Model ; 6: 461-473, 2021.
Article in English | MEDLINE | ID: mdl-33644499

ABSTRACT

While the Coronavirus Disease 2019 (COVID-19) pandemic continues to threaten public health and safety, every state in the United States has strategically reopened businesses. It is urgent to evaluate the effect of reopening policies on the COVID-19 pandemic to support decision-making on control measures and medical resource allocation. In this study, a novel SEIR model was developed to evaluate the effect of reopening policies based on real-world reported COVID-19 data in Texas. Data reported before the reopening were used to develop the SEIR model; data after the reopening were used for evaluation. The simulation results show that if the "stay-at-home order" were continued without reopening businesses, the COVID-19 pandemic could end in Texas in December 2020. With reopening, on the other hand, the pandemic could be controlled to a similar degree as in the no-reopening case only if the contact rate were low and additional high-magnitude control measures could be implemented. If the control measures are only slightly enhanced after reopening, the curve of the COVID-19 epidemic could flatten, with reduced numbers of infections and deaths, but the epidemic might last longer. Based on the reported data up to July 2020 in Texas, the real-world epidemic pattern falls between the low- and high-magnitude control-measure cases, with a medium-risk contact rate after reopening. In this case, the pandemic might last until sometime between summer 2021 and February 2022, with a total of 4-10 million infected cases and 20,080-58,604 deaths.
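
For readers unfamiliar with compartmental models, the sketch below integrates the textbook SEIR equations and compares two contact rates. It is only an illustration: the abstract does not specify the structure of the authors' novel SEIR model, so the equations, the forward-Euler integration, the function name, and every parameter value here are assumptions rather than the study's fitted model.

```python
def simulate_seir(population, initial_exposed, initial_infectious,
                  beta, sigma, gamma, days, steps_per_day=10):
    """Integrate the textbook SEIR equations with forward Euler.

    dS/dt = -beta * S * I / N
    dE/dt =  beta * S * I / N - sigma * E
    dI/dt =  sigma * E - gamma * I
    dR/dt =  gamma * I

    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_exposed - initial_infectious)
    e, i, r = float(initial_exposed), float(initial_infectious), 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


def peak_infectious(history):
    """Return (day, value) of the maximum infectious count."""
    day = max(range(len(history)), key=lambda d: history[d][2])
    return day, history[day][2]


# Hypothetical parameters: beta is the contact (transmission) rate,
# 1/sigma the mean incubation period, 1/gamma the mean infectious period.
common = dict(population=1_000_000, initial_exposed=100, initial_infectious=10,
              sigma=0.2, gamma=0.2, days=730)
high_contact = simulate_seir(beta=0.5, **common)
low_contact = simulate_seir(beta=0.3, **common)

print("high contact rate peak (day, infectious):", peak_infectious(high_contact))
print("low contact rate peak (day, infectious):", peak_infectious(low_contact))
```

Under these assumed parameters, the lower contact rate produces a smaller but later peak, which is the qualitative "flatter but longer" pattern the abstract describes for slightly enhanced control measures.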

3.
Int J Comput Biol Drug Des ; 13(1): 124-143, 2020.
Article in English | MEDLINE | ID: mdl-32153660

ABSTRACT

Gene dynamic analysis is essential for identifying target genes involved in the pathogenesis of various diseases, including cancer. Cancer prognosis is often influenced by hypoxia. We apply a multi-step pipeline to study dynamic gene expression in response to hypoxia in three cancer cell lines: prostate (DU145), colon (HT29), and breast (MCF7). We identified 26 distinct temporal expression patterns for the prostate cell line and 29 patterns each for the colon and breast cell lines. Module-based dynamic networks have been developed for all three cell lines. Our analyses improve on existing results in multiple ways. They exploit the time-dependent nature of gene expression values to identify dynamically significant genes; hence, more key significant genes and transcription factors have been identified. Our gene network returns significant information about biologically important modules of genes. Furthermore, the network has potential for learning the regulatory path between transcription factors and their downstream genes. In addition, our findings suggest that changes in the expression of the genes BMP6 and ARSJ might play a key role in the time-dependent response to hypoxia in breast cancer.

4.
Database (Oxford) ; 2020, 2020 11 28.
Article in English | MEDLINE | ID: mdl-33247935

ABSTRACT

The exponential growth of genomic/genetic data in the era of Big Data demands new solutions for making these data findable, accessible, interoperable and reusable. In this article, we present a web-based platform named the Gene Expression Time-Course Research (GETc) Platform, which enables the discovery and visualization of time-course gene expression data and analytical results from the NIH/NCBI-sponsored Gene Expression Omnibus (GEO). The analytical results are produced by an analytic pipeline based on an ordinary differential equation model. Furthermore, extracting scientific insights from these results and disseminating the findings requires close and efficient collaboration between data scientists and domain experts from biomedical and scientific fields. Therefore, GETc provides several recommendation functions and tools to facilitate effective collaboration. The GETc platform is a useful tool for researchers in the biomedical genomics community to present and communicate large numbers of analysis results from GEO. It is generalizable and broadly applicable across different biomedical research areas. GETc is a user-friendly and efficient web-based platform, freely accessible at http://genestudy.org/.


Subject(s)
Genetic Databases , Genomics , Gene Expression , Gene Expression Profiling , Informatics , Software