Results 1 - 5 of 5
1.
Bioinformatics ; 34(4): 652-659, 2018 02 15.
Article in English | MEDLINE | ID: mdl-29028901

ABSTRACT

Motivation: The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. Results: We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. Availability and implementation: All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. Contact: sjones@bcgsc.ca. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Mining/methods; Publications; Software
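
The abstract of record 1 describes scoring candidate concept pairs through a reduced SVD representation of co-occurrence data. The following is only a minimal sketch of that general idea, not the authors' implementation (their code is at the GitHub URL given in the abstract). The toy matrix, the concept names, the binarization step, the chosen rank, and the helper names `svd_scores` and `rank_unseen_pairs` are all assumptions made for illustration.

```python
import numpy as np

def svd_scores(cooccurrence: np.ndarray, rank: int) -> np.ndarray:
    """Return a rank-limited reconstruction of the co-occurrence matrix.

    Entry (i, j) of the result is used as a score for the pair (i, j).
    """
    # Assumption: treat any non-zero co-occurrence count as "associated".
    binary = (cooccurrence > 0).astype(float)
    u, s, vt = np.linalg.svd(binary, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def rank_unseen_pairs(cooccurrence, scores, names):
    """List pairs that never co-occur, highest reconstructed score first."""
    n = len(names)
    pairs = [
        (float(scores[i, j]), names[i], names[j])
        for i in range(n)
        for j in range(i + 1, n)
        if cooccurrence[i, j] == 0
    ]
    return sorted(pairs, reverse=True)

# Hypothetical toy data: 1 means the two concepts co-occur in some document.
concepts = ["drug_a", "drug_b", "gene_x", "gene_y", "disease_z"]
cooc = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
])

scores = svd_scores(cooc, rank=2)
ranked = rank_unseen_pairs(cooc, scores, concepts)
for score, left, right in ranked:
    print(f"{left} - {right}: {score:.3f}")
```

Because the low-rank reconstruction draws on every row and column of the matrix, a pair that never co-occurs can still receive a non-zero score when the two concepts share neighbours elsewhere in the graph. This is the sense in which the abstract contrasts a global method with methods based on local graph structure.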
2.
JAMA Netw Open ; 2(4): e192597, 2019 04 05.
Article in English | MEDLINE | ID: mdl-31026023

ABSTRACT

Importance: A molecular diagnostic method that incorporates information about the transcriptional status of all genes across multiple tissue types can strengthen confidence in cancer diagnosis. Objective: To determine the practical use of a whole transcriptome-based pan-cancer method in diagnosing primary and metastatic cancers and resolving complex diagnoses. Design, Setting, and Participants: This cross-sectional diagnostic study assessed Supervised Cancer Origin Prediction Using Expression (SCOPE), a machine learning method using whole-transcriptome RNA sequencing data. Training was performed on publicly available primary cancer data sets, including The Cancer Genome Atlas. Testing was performed retrospectively on untreated primary cancers and treated metastases from volunteer adult patients at BC Cancer in Vancouver, British Columbia, from January 1, 2013, to March 31, 2016, and testing spanned 10 822 samples and 66 output classes representing untreated primary cancers (n = 40) and adjacent normal tissues (n = 26). SCOPE's performance was demonstrated on 211 untreated primary mesothelioma cancers and 201 treatment-resistant metastatic cancers. Finally, SCOPE was used to identify the putative site of origin in 15 cases with initial presentation as cancers with unknown primary of origin. Results: A total of 10 688 adult patient samples representing 40 untreated primary tumor types and 26 adjacent-normal tissues were used for training. Demographic data were not available for all data sets. Among the training data set, 5157 of 10 244 (50.3%) were male and the mean (SD) age was 58.9 (14.5) years. 
Testing was performed on 211 patients with untreated primary mesothelioma (173 [82.0%] male; mean [SD] age, 64.5 [11.3] years); 201 patients with treatment-resistant cancers (141 [70.1%] female; mean [SD] age, 55.6 [12.9] years); and 15 patients with cancers of unknown primary of origin; among the treatment-resistant cancers, 168 were metastatic, and 33 were the primary presentation. An accuracy rate of 99% was obtained for primary epithelioid mesotheliomas tested (125 of 126). The remaining 85 mesotheliomas had a mixed etiology (sarcomatoid mesotheliomas) and were correctly identified as a mixture of their primary components, with potential implications in resolving subtypes and incidences of mixed histology. SCOPE achieved an overall mean (SD) accuracy rate of 86% (11%) and F1 score of 0.79 (0.12) on the 201 treatment-resistant cancers and matched 12 of 15 of the putative diagnoses for cancers with indeterminate diagnosis from conventional pathology. Conclusions and Relevance: These results suggest that machine learning approaches incorporating multiple tumor profiles can more accurately identify the cancerous state and discriminate it from normal cells. SCOPE uses the whole transcriptomes from normal and tumor tissues, and results of this study suggest that it performs well for rare cancer types, primary cancers, treatment-resistant metastatic cancers, and cancers of unknown primary of origin. Genes most relevant in SCOPE's decision making were examined, and several are known biological markers of respective cancers. SCOPE may be applied as an orthogonal diagnostic method in cases where the site of origin of a cancer is unknown, or when standard pathology assessment is inconclusive.


Subject(s)
Biomarkers, Tumor/genetics; Exome Sequencing/methods; Neoplasms/diagnosis; Neural Networks, Computer; Transcriptome; Adult; Cross-Sectional Studies; Female; Humans; Lung Neoplasms/diagnosis; Lung Neoplasms/genetics; Machine Learning; Male; Mesothelioma/diagnosis; Mesothelioma/genetics; Mesothelioma, Malignant; Middle Aged; Neoplasms/genetics
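
The abstract of record 2 quotes several counts alongside percentages. As a reading aid, the short sketch below recomputes those percentages and subgroup totals from the counts stated in the abstract. The helper name `pct` and the dictionary labels are illustrative assumptions; no figures beyond those in the abstract are used.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# (count, total) pairs exactly as quoted in the abstract.
quoted = {
    "training set, male": (5157, 10244),
    "mesothelioma, male": (173, 211),
    "treatment-resistant, female": (141, 201),
    "epithelioid mesothelioma, correct": (125, 126),
}

for label, (part, whole) in quoted.items():
    print(f"{label}: {part}/{whole} = {pct(part, whole)}%")

# Subgroup counts from the abstract should add up to the stated totals.
mesothelioma_total = 126 + 85         # epithelioid + remaining mesotheliomas
treatment_resistant_total = 168 + 33  # metastatic + primary presentation
output_classes = 40 + 26              # primary tumor types + adjacent normal
```

The recomputed values agree with the rounded figures in the abstract (50.3%, 82.0%, 70.1%, and roughly 99%), and the subgroup counts sum to 211, 201, and 66 respectively.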
3.
J Endocrinol ; 235(2): 153-165, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28808080

ABSTRACT

The thyroid gland, necessary for normal human growth and development, functions as an essential regulator of metabolism by the production and secretion of appropriate levels of thyroid hormone. However, assessment of abnormal thyroid function may be challenging suggesting a more fundamental understanding of normal function is needed. One way to characterize normal gland function is to study the epigenome and resulting transcriptome within its constituent cells. This study generates the first published reference epigenomes for human thyroid from four individuals using ChIP-seq and RNA-seq. We profiled six histone modifications (H3K4me1, H3K4me3, H3K27ac, H3K36me3, H3K9me3, H3K27me3), identified chromatin states using a hidden Markov model, produced a novel quantitative metric for model selection and established epigenomic maps of 19 chromatin states. We found that epigenetic features characterizing promoters and transcription elongation tend to be more consistent than regions characterizing enhancers or Polycomb-repressed regions and that epigenetically active genes consistent across all epigenomes tend to have higher expression than those not marked as epigenetically active in all epigenomes. We also identified a set of 18 genes epigenetically active and consistently expressed in the thyroid that are likely highly relevant to thyroid function. Altogether, these epigenomes represent a powerful resource to develop a deeper understanding of the underlying molecular biology of thyroid function and provide contextual information of thyroid and human epigenomic data for comparison and integration into future studies.


Subject(s)
Epigenesis, Genetic/physiology; Epigenomics/methods; Gene Expression Regulation/physiology; Thyroid Gland/physiology; Chromatin; Histones/genetics; Histones/metabolism; Humans; Promoter Regions, Genetic; Transcriptome
4.
Cell Rep ; 17(8): 2112-2124, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27851972

ABSTRACT

Nucleosome position, density, and post-translational modification are widely accepted components of mechanisms regulating DNA transcription but still incompletely understood. We present a modified native ChIP-seq method combined with an analytical framework that allows MNase accessibility to be integrated with histone modification profiles. Application of this methodology to the primitive (CD34+) subset of normal human cord blood cells enabled genomic regions enriched in one versus two nucleosomes marked by histone 3 lysine 4 trimethylation (H3K4me3) and/or histone 3 lysine 27 trimethylation (H3K27me3) to be associated with their transcriptional and DNA methylation states. From this analysis, we defined four classes of promoter-specific profiles and demonstrated that a majority of bivalent marked promoters are heterogeneously marked at a single-cell level in this primitive cell type. Interestingly, extension of this approach to human embryonic stem cells revealed an altered relationship between chromatin modification state and nucleosome content at promoters, suggesting developmental stage-specific organization of histone methylation states.


Subject(s)
Chromatin Immunoprecipitation; Nucleosomes/metabolism; Sequence Analysis, RNA; Antigens, CD34/metabolism; CpG Islands/genetics; DNA/metabolism; DNA Methylation/genetics; Fetal Blood/cytology; Fetal Blood/metabolism; Gene Expression Profiling; Gene Expression Regulation; Histones/metabolism; Human Embryonic Stem Cells/metabolism; Humans; Micrococcal Nuclease/metabolism; Promoter Regions, Genetic; Protein Processing, Post-Translational; RNA/genetics; RNA/metabolism
5.
Cell Rep ; 17(8): 2060-2074, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27851968

ABSTRACT

The normal adult human mammary gland is a continuous bilayered epithelial system. Bipotent and myoepithelial progenitors are prominent and unique components of the outer (basal) layer. The inner (luminal) layer includes both luminal-restricted progenitors and a phenotypically separable fraction that lacks progenitor activity. We now report an epigenomic comparison of these three subsets with one another, with their associated stromal cells, and with three immortalized, non-tumorigenic human mammary cell lines. Each genome-wide analysis contains profiles for six histone marks, methylated DNA, and RNA transcripts. Analysis of these datasets shows that each cell type has unique features, primarily within genomic regulatory regions, and that the cell lines group together. Analyses of the promoter and enhancer profiles place the luminal progenitors in between the basal cells and the non-progenitor luminal subset. Integrative analysis reveals networks of subset-specific transcription factors.


Subject(s)
Breast/metabolism; Enhancer Elements, Genetic/genetics; Epigenesis, Genetic; Gene Regulatory Networks; Transcription Factors/metabolism; Adult; Cell Separation; Chromatin/metabolism; Epithelial Cells/cytology; Epithelial Cells/metabolism; Female; Humans; Phenotype; Promoter Regions, Genetic; Reproducibility of Results