Results 1 - 20 of 69
1.
Bioinformatics ; 38(4): 1176-1178, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34788784

ABSTRACT

SUMMARY: Mian is a web application to interactively visualize, run statistical tools and train machine learning models on operational taxonomic unit (OTU) or amplicon sequence variant (ASV) datasets to identify key taxonomic groups, diversity trends or taxonomic composition shifts in the context of provided categorical or numerical sample metadata. Tools, including Fisher's exact test, Boruta feature selection, alpha and beta diversity, and random forest and deep neural network classifiers, facilitate open-ended data exploration and hypothesis generation on microbial datasets. AVAILABILITY: Mian is freely available at: miandata.org. Mian is an open-source platform licensed under the MIT license with source code available at github.com/tbj128/mian. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Microbiota , Software , Data Visualization , Machine Learning , Internet
2.
Respir Res ; 24(1): 124, 2023 May 04.
Article in English | MEDLINE | ID: mdl-37143066

ABSTRACT

BACKGROUND: People living with HIV (PLWH) are at increased risk of developing Chronic Obstructive Pulmonary Disease (COPD) independent of cigarette smoking. We hypothesized that dysbiosis in PLWH is associated with epigenetic and transcriptomic disruptions in the airway epithelium. METHODS: Airway epithelial brushings were collected from 18 COPD+HIV+, 16 COPD-HIV+, 22 COPD+HIV- and 20 COPD-HIV- subjects. The microbiome, methylome, and transcriptome were profiled using 16S sequencing, Illumina Infinium Methylation EPIC chip, and RNA sequencing, respectively. Multi 'omic integration was performed using Data Integration Analysis for Biomarker discovery using Latent cOmponents. A correlation > 0.7 was used to identify key interactions between the 'omes. RESULTS: The COPD+HIV-, COPD-HIV+, and COPD+HIV+ groups had reduced Shannon Diversity (p = 0.004, p = 0.023, and p = 5.5e-06, respectively) compared to individuals with neither COPD nor HIV, with the COPD+HIV+ group demonstrating the most reduced diversity. Microbial communities were significantly different between the four groups (p = 0.001). Multi 'omic integration identified correlations between Bacteroidetes Prevotella, genes FUZ, FASTKD3, and ACVR1B, and epigenetic features CpG-FUZ and CpG-PHLDB3. CONCLUSION: PLWH with COPD manifest decreased diversity and altered microbial communities in their airway epithelial microbiome. The reduction in Prevotella in this group was linked with epigenetic and transcriptomic disruptions in host genes including FUZ, FASTKD3, and ACVR1B.


Subjects
HIV Infections , Chronic Obstructive Pulmonary Disease , Humans , Dysbiosis/genetics , Chronic Obstructive Pulmonary Disease/epidemiology , Chronic Obstructive Pulmonary Disease/genetics , Gene Expression Profiling , Epithelium , HIV Infections/epidemiology , HIV Infections/genetics
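
The abstract of entry 2 compares groups by Shannon diversity without stating the formula. As a minimal sketch, assuming the standard definition H = -sum(p_i * ln p_i) over taxa with nonzero counts, the index can be computed from a vector of per-taxon read counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this taxon
            h -= p * math.log(p)   # natural log; other bases only rescale H
    return h

# Hypothetical samples: an even community and one dominated by a single taxon.
even = shannon_diversity([25, 25, 25, 25])
skewed = shannon_diversity([97, 1, 1, 1])
```

An evenly distributed community reaches the maximum ln(number of taxa), while a community dominated by one taxon scores lower, which is the sense in which the abstract speaks of reduced diversity.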
3.
Eur Respir J ; 59(5), 2022 05.
Article in English | MEDLINE | ID: mdl-34675046

ABSTRACT

RATIONALE: Peripheral airway obstruction is a key feature of chronic obstructive pulmonary disease (COPD), but the mechanisms of airway loss are unknown. This study aims to identify the molecular and cellular mechanisms associated with peripheral airway obstruction in COPD. METHODS: Ten explanted lung specimens donated by patients with very severe COPD treated by lung transplantation and five unused donor control lungs were sampled using systematic uniform random sampling (SURS), resulting in 240 samples. These samples were further examined by micro-computed tomography (CT), quantitative histology and gene expression profiling. RESULTS: Micro-CT analysis showed that the loss of terminal bronchioles in COPD occurs in regions of microscopic emphysematous destruction with an average airspace size of ≥500 and <1000 µm, which we have termed a "hot spot". Based on microarray gene expression profiling, the hot spot was associated with an 11-gene signature, with upregulation of pro-inflammatory genes and downregulation of inhibitory immune checkpoint genes, indicating immune response activation. Results from both quantitative histology and the bioinformatics computational tool CIBERSORT, which predicts the percentage of immune cells in tissues from transcriptomic data, showed that the hot spot regions were associated with increased infiltration of CD4 and CD8 T-cell and B-cell lymphocytes. INTERPRETATION: The reduction in terminal bronchioles observed in lungs from patients with COPD occurs in a hot spot of microscopic emphysema, where there is upregulation of IFNG signalling, co-stimulatory immune checkpoint genes and genes related to the inflammasome pathway, and increased infiltration of immune cells. These could be potential targets for therapeutic interventions in COPD.


Subjects
Airway Obstruction , Emphysema , Chronic Obstructive Pulmonary Disease , Pulmonary Emphysema , Bronchioles/pathology , Emphysema/complications , Humans , Chronic Obstructive Pulmonary Disease/complications , X-Ray Microtomography
4.
BMC Med Res Methodol ; 22(1): 136, 2022 05 12.
Article in English | MEDLINE | ID: mdl-35549854

ABSTRACT

BACKGROUND: Manually extracted data points from health records are collated on an institutional, provincial, and national level to facilitate clinical research. However, the labour-intensive clinical chart review process puts an increasing burden on healthcare system budgets. Therefore, an automated information extraction system is needed to ensure the timeliness and scalability of research data. METHODS: We used a dataset of 100 synoptic operative and 100 pathology reports, evenly split into 50 reports in training and test sets for each report type. The training set guided our development of a Natural Language Processing (NLP) extraction pipeline system, which accepts scanned images of operative and pathology reports. The system uses a combination of rule-based and transfer learning methods to extract numeric encodings from text. We also developed visualization tools to compare the manual and automated extractions. The code for this paper was made available on GitHub. RESULTS: A test set of 50 operative and 50 pathology reports was used to evaluate the extraction accuracies of the NLP pipeline. Gold standard, defined as manual extraction by expert reviewers, yielded accuracies of 90.5% for operative reports and 96.0% for pathology reports, while the NLP system achieved overall 91.9% (operative) and 95.4% (pathology) accuracy. The pipeline successfully extracted outcomes data pertinent to breast cancer tumor characteristics (e.g. presence of invasive carcinoma, size, histologic type), prognostic factors (e.g. number of lymph nodes with micro-metastases and macro-metastases, pathologic stage), and treatment-related variables (e.g. margins, neo-adjuvant treatment, surgical indication) with high accuracy. Out of the 48 variables across operative and pathology codebooks, NLP yielded 43 variables with F-scores of at least 0.90; in comparison, a trained human annotator yielded 44 variables with F-scores of at least 0.90.
CONCLUSIONS: The NLP system achieves near-human-level accuracy in both operative and pathology reports using a minimal curated dataset. This system uniquely provides a robust solution for transparent, adaptable, and scalable automation of data extraction from patient health records. It may serve to advance breast cancer clinical research by facilitating collection of vast amounts of valuable health data at a population level.


Subjects
Breast Neoplasms , Natural Language Processing , Breast Neoplasms/surgery , Electronic Health Records , Female , Humans , Information Storage and Retrieval , Health Care Outcome Assessment , Research Report
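
Entry 4 reports per-variable F-scores but the abstract does not define how they were computed. The sketch below assumes the common F1 form (harmonic mean of precision and recall) for one value of one extracted variable. The function name, the label values, and the toy data are hypothetical illustrations, not the authors' pipeline.

```python
def f1_score(gold, predicted, positive):
    """F1 for one label value, comparing automated output with manual extraction."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no correct positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical variable "invasive carcinoma present" over five reports.
gold = ["yes", "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes"]
score = f1_score(gold, predicted, positive="yes")
```

Repeating this per variable and counting how many reach 0.90 would give a tally comparable in form to the one quoted in the abstract, under the assumption stated above.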
5.
Bioinformatics ; 36(18): 4797-4804, 2020 09 15.
Article in English | MEDLINE | ID: mdl-32573679

ABSTRACT

MOTIVATION: The interaction between proteins and nucleic acids plays a crucial role in gene regulation and cell function. Determining the binding preferences of nucleic acid-binding proteins (NBPs), namely RNA-binding proteins (RBPs) and transcription factors (TFs), is the key to deciphering the protein-nucleic acid interaction code. Today, available NBP binding data from in vivo or in vitro experiments are still limited, which leaves a large portion of NBPs uncovered. Unfortunately, existing computational methods that model the NBP binding preferences are mostly protein specific: they need the experimental data for a specific protein of interest, and thus only focus on experimentally characterized NBPs. The binding preferences of experimentally unexplored NBPs remain largely unknown. RESULTS: Here, we introduce ProbeRating, a nucleic acid recommender system that utilizes techniques from deep learning and word embeddings of natural language processing. ProbeRating is developed to predict binding profiles for unexplored or poorly studied NBPs by exploiting their homologous NBPs which currently have available binding data. Requiring only sequence information as input, ProbeRating adapts FastText from Facebook AI Research to extract biological features. It then builds a neural network-based recommender system. We evaluate the performance of ProbeRating on two different tasks: one for RBP and one for TF. As a result, ProbeRating outperforms previous methods on both tasks. The results show that ProbeRating can be a useful tool to study the binding mechanism for the many NBPs that lack direct experimental evidence. AVAILABILITY AND IMPLEMENTATION: The source code is freely available at . SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Nucleic Acids , RNA-Binding Proteins , Binding Sites , Computer Neural Networks , Protein Binding , RNA-Binding Proteins/metabolism , Software
6.
Clin Chem ; 66(8): 1063-1071, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32705124

ABSTRACT

BACKGROUND: HEARTBiT is a whole blood-based gene profiling assay using the nucleic acid counting NanoString technology for the exclusionary diagnosis of acute cellular rejection in heart transplant patients. The HEARTBiT score measures the risk of acute cellular rejection in the first year following heart transplant, distinguishing patients with stable grafts from those at risk for acute cellular rejection. Here, we provide the analytical performance characteristics of the HEARTBiT assay and the results on pilot clinical validation. METHODS: We used purified RNA collected from PAXgene blood samples to evaluate the characteristics of a 12-gene panel HEARTBiT assay, for its linearity range, quantitative bias, precision, and reproducibility. These parameters were estimated either from serial dilutions of individual samples or from repeated runs on pooled samples. RESULTS: We found that all 12 genes showed linear behavior within the recommended assay input range of 125 ng to 500 ng of purified RNA, with most genes showing 3% or lower quantitative bias and around 5% coefficient of variation. Total variation resulting from unique operators, reagent lots, and runs was less than 0.02 units standard deviation (SD). The performance of the analytically validated assay (AUC = 0.75) was equivalent to what we observed in the signature development dataset. CONCLUSION: The analytical performance of the assay within the specification input range demonstrated reliable quantification of the HEARTBiT score within 0.02 SD units, measured on a 0 to 1 unit scale. This assay may therefore be of high utility in clinical validation of HEARTBiT in future biomarker observational trials.


Subjects
Gene Expression Profiling/methods , Graft Rejection/diagnosis , Heart Transplantation/adverse effects , RNA/blood , Adult , Biomarkers/blood , Female , Humans , Limit of Detection , Male , Middle Aged , Pilot Projects , Prognosis , Reproducibility of Results
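
Entry 6 summarizes assay precision as a coefficient of variation of around 5%. As a small sketch, assuming the usual definition (sample standard deviation divided by the mean, expressed as a percentage), the value can be computed from repeated measurements of one gene. The replicate counts below are invented for illustration and are not HEARTBiT data.

```python
import statistics

def coefficient_of_variation(values):
    """Percent CV: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * statistics.stdev(values) / mean

# Hypothetical repeated counts for one gene across five runs of a pooled sample.
replicate_counts = [100, 105, 95, 102, 98]
cv_percent = coefficient_of_variation(replicate_counts)
```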
7.
Clin Chem ; 65(2): 282-290, 2019 02.
Article in English | MEDLINE | ID: mdl-30463841

ABSTRACT

BACKGROUND: Cholesterol efflux capacity (CEC) is a measure of HDL function that, in cell-based studies, has demonstrated an inverse association with cardiovascular disease. The cell-based measure of CEC is complex and low-throughput. We hypothesized that assessment of the lipoprotein proteome would allow for precise, high-throughput CEC prediction. METHODS: After isolating lipoprotein particles from serum, we used LC-MS/MS to quantify 21 lipoprotein-associated proteins. A bioinformatic pipeline was used to identify proteins with univariate correlation to cell-based CEC measurements and generate a multivariate algorithm for CEC prediction (pCE). Using logistic regression, protein coefficients in the pCE model were reweighted to yield a new algorithm predicting coronary artery disease (pCAD). RESULTS: Discovery using targeted LC-MS/MS analysis of 105 training and test samples yielded a pCE model comprising 5 proteins (Spearman r = 0.86). Evaluation of pCE in a case-control study of 231 specimens from healthy individuals and patients with coronary artery disease revealed lower pCE in cases (P = 0.03). Derived within this same study, the pCAD model significantly improved classification (P < 0.0001). Following analytical validation of the multiplexed proteomic method, we conducted a case-control study of myocardial infarction in 137 postmenopausal women that confirmed significant separation of specimen cohorts in both the pCE (P = 0.015) and pCAD (P = 0.001) models. CONCLUSIONS: Development of a proteomic pCE provides a reproducible high-throughput alternative to traditional cell-based CEC assays. The pCAD model improves stratification of case and control cohorts and, with further studies to establish clinical validity, presents a new opportunity for the assessment of cardiovascular health.


Subjects
Apolipoprotein A-I/blood , Cholesterol/metabolism , Coronary Artery Disease/pathology , Lipoproteins/blood , Proteome/analysis , Tandem Mass Spectrometry/methods , Area Under Curve , Case-Control Studies , High Pressure Liquid Chromatography , Coronary Artery Disease/blood , Female , Humans , Limit of Detection , Male , Middle Aged , Myocardial Infarction/blood , Myocardial Infarction/pathology , ROC Curve , Validation Studies as Topic
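
Entry 7 evaluates its predicted efflux model with a Spearman correlation (r = 0.86). The sketch below shows one standard way to compute that statistic: replace each value by its rank (ties get the average rank), then take the Pearson correlation of the ranks. The helper names are hypothetical, and this is not the bioinformatic pipeline used in the paper.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks matter, any monotonic relationship between measured and predicted values gives a coefficient of 1, regardless of its shape.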
8.
Respir Res ; 20(1): 176, 2019 Aug 05.
Article in English | MEDLINE | ID: mdl-31382977

ABSTRACT

BACKGROUND: Effects of systemic corticosteroids on blood gene expression are largely unknown. This study determined gene expression signature associated with short-term oral prednisone therapy in patients with chronic obstructive pulmonary disease (COPD) and its relationship to 1-year mortality following an acute exacerbation of COPD (AECOPD). METHODS: Gene expression in whole blood was profiled using the Affymetrix Human Gene 1.1 ST microarray chips from two cohorts: 1) a prednisone cohort with 37 stable COPD patients randomly assigned to prednisone 30 mg/d + standard therapy for 4 days or standard therapy alone and 2) the Rapid Transition Program (RTP) cohort with 218 COPD patients who experienced AECOPD and were treated with systemic corticosteroids. All gene expression data were adjusted for the total number of white blood cells and their differential cell counts. RESULTS: In the prednisone cohort, 51 genes were differentially expressed between prednisone and standard therapy group at a false discovery rate of < 0.05. The top 3 genes with the largest fold-changes were KLRF1, GZMH and ADGRG1; and 21 genes were significantly enriched in immune system pathways including the natural killer cell mediated cytotoxicity. In the RTP cohort, 27 patients (12.4%) died within 1 year after hospitalisation of AECOPD; 32 of 51 genes differentially expressed in the prednisone cohort significantly changed from AECOPD to the convalescent state and were enriched in similar cellular immune pathways to that in the prednisone cohort. Of these, 10 genes including CX3CR1, KLRD1, S1PR5 and PRF1 were significantly associated with 1-year mortality. CONCLUSIONS: Short-term daily prednisone therapy produces a distinct blood gene signature that may be used to determine and monitor treatment responses to prednisone in COPD patients during AECOPD. 
TRIAL REGISTRATION: The prednisone cohort was registered at ClinicalTrials.gov (NCT02534402) and the RTP cohort was registered at ClinicalTrials.gov (NCT02050022).


Subjects
Glucocorticoids/administration & dosage , Prednisone/administration & dosage , Chronic Obstructive Pulmonary Disease/blood , Chronic Obstructive Pulmonary Disease/genetics , Oral Administration , Aged , Aged, 80 and over , Drug Administration Schedule , Female , Gene Expression , Humans , Male , Middle Aged , Chronic Obstructive Pulmonary Disease/drug therapy
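
Entry 8 selects differentially expressed genes at a false discovery rate below 0.05, but the abstract does not name the procedure. The sketch below assumes the widely used Benjamini-Hochberg adjustment; that choice, the function name, and the example p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical per-gene p-values.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```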
9.
BMC Bioinformatics ; 19(1): 96, 2018 03 12.
Article in English | MEDLINE | ID: mdl-29529991

ABSTRACT

BACKGROUND: Characterizing the binding preference of RNA-binding proteins (RBP) is essential for us to understand the interaction between an RBP and its RNA targets, and to decipher the mechanism of post-transcriptional regulation. Experimental methods have been used to generate protein-RNA binding data for a number of RBPs in vivo and in vitro. Utilizing the binding data, a couple of computational methods have been developed to detect the RNA sequence or structure preferences of the RBPs. However, the majority of RBPs have not yet been experimentally characterized and lack RNA binding data. For these poorly studied RBPs, the identification of their binding preferences cannot be performed by most existing computational methods because the experimental binding data are prerequisite to these methods. RESULTS: Here we propose a new method based on co-evolution to predict the sequence preferences for the poorly studied RBPs, waiving the requirement of their binding data. First, we demonstrate the co-evolutionary relationship between RBPs and their RNA partners. We then present a K-nearest neighbors (KNN) based algorithm to infer the sequence preference of an RBP using only the preference information from its homologous RBPs. By benchmarking against several in vitro and in vivo datasets, our proposed method outperforms the existing alternative which uses the closest neighbor's preference on all the datasets. Moreover, it shows comparable performance with two state-of-the-art methods that require the presence of the experimental binding data. Finally, we demonstrate the usage of this method to infer sequence preferences for novel proteins which have no binding preference information available. CONCLUSION: For a poorly studied RBP, the current methods used to determine its binding preference need experimental data, which is expensive and time consuming. Therefore, determining RBP's preference is not practical in many situations. 
This study provides an economical solution to infer the sequence preference of such proteins based on co-evolution. The source codes and related datasets are available at https://github.com/syang11/KNN.


Subjects
Algorithms , Molecular Evolution , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , RNA/chemistry , RNA/metabolism , Binding Sites
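
Entry 9 describes a K-nearest neighbors approach that infers the sequence preference of an uncharacterized RBP from its homologs. The abstract gives no algorithmic detail, so the following is only a generic sketch of the idea: take the k most similar characterized proteins and average their preference vectors, weighted by similarity. The similarity scores, the vector representation, and every name here are hypothetical; the published code at the repository named in the abstract may differ substantially.

```python
def knn_preference(similarities, preferences, k=3):
    """Similarity-weighted mean of the preference vectors of the k nearest homologs.

    similarities: dict mapping protein name -> similarity to the query protein
    preferences:  dict mapping protein name -> list of scores (e.g. one per k-mer)
    """
    neighbors = sorted(similarities, key=similarities.get, reverse=True)[:k]
    total = sum(similarities[name] for name in neighbors)
    if total <= 0:
        raise ValueError("neighbors must have positive total similarity")
    length = len(preferences[neighbors[0]])
    return [
        sum(similarities[name] * preferences[name][i] for name in neighbors) / total
        for i in range(length)
    ]

# Hypothetical homologs with two-dimensional preference vectors.
sims = {"A": 0.9, "B": 0.6, "C": 0.1}
prefs = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [5.0, 5.0]}
inferred = knn_preference(sims, prefs, k=2)
```

With k = 1 this reduces to copying the closest neighbor's preference, which is the baseline the abstract says the method is compared against.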
10.
Am J Respir Cell Mol Biol ; 57(4): 411-418, 2017 10.
Article in English | MEDLINE | ID: mdl-28459279

ABSTRACT

Chronic obstructive pulmonary disease is the third leading cause of death worldwide. Gene expression profiling across multiple regions of the same lung identified genes significantly related to emphysema. We sought to determine whether the lung and epithelial expression of 127 emphysema-related genes was also related to lung function in independent cohorts, and whether any of these genes could be used as biomarkers in the peripheral blood of patients with chronic obstructive pulmonary disease. To that end, we examined whether the expression levels of these genes were under genetic control in lung tissue (n = 1,111). We then determined whether the mRNA levels of these genes in lung tissue (n = 727), small airway epithelial cells (n = 238), and peripheral blood (n = 620) were significantly related to lung function measurements. The expression of 63 of the 127 genes (50%) was under genetic control in lung tissue. The lung and epithelial mRNA expression of a subset of the emphysema-associated genes, including ASRGL1, LPHN2, and EDNRB, was strongly associated with lung function. In peripheral blood, the expression of 40 genes was significantly associated with lung function. Twenty-nine of these genes (73%) were also associated with lung function in lung tissue, but with the opposite direction of effect for 24 of the 29 genes, including those involved in hypoxia and B cell-related responses. The integrative genomics approach uncovered a significant overlap of emphysema genes associations with lung function between lung and blood with opposite directions between the two. These results support the use of peripheral blood to detect disease biomarkers.


Subjects
Gene Expression Profiling , Gene Expression Regulation , Genomics , Lung/metabolism , Pulmonary Emphysema/metabolism , Messenger RNA/biosynthesis , B-Lymphocytes/metabolism , B-Lymphocytes/pathology , Biomarkers/metabolism , Cell Hypoxia , Female , Humans , Lung/pathology , Male , Pulmonary Emphysema/genetics , Pulmonary Emphysema/pathology , Messenger RNA/genetics
11.
BMC Genomics ; 18(1): 43, 2017 01 06.
Article in English | MEDLINE | ID: mdl-28061752

ABSTRACT

BACKGROUND: Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease, or discovery of useful disease biomarkers. The sensitivity and interpretability of analyses carried out in this complex tissue, however, are significantly affected by its dynamic cellular heterogeneity. It is therefore desirable to quantify this heterogeneity, either to account for it or to better model interactions that may be present between the abundance of certain transcripts, specific cell types and the indication under study. Accurate enumeration of the many component cell types that make up peripheral whole blood can further complicate the sample collection process, however, and result in additional costs. Many approaches have been developed to infer the composition of a sample from high-dimensional transcriptomic and, more recently, epigenetic data. These approaches rely on the availability of isolated expression profiles for the cell types to be enumerated. These profiles are platform-specific, suitable datasets are rare, and generating them is expensive. No such dataset exists on the Affymetrix Gene ST platform. RESULTS: We present 'Enumerateblood', a freely-available and open source R package that exposes a multi-response Gaussian model capable of accurately predicting the composition of peripheral whole blood samples from Affymetrix Gene ST expression profiles, outperforming other current methods when applied to Gene ST data. CONCLUSIONS: 'Enumerateblood' significantly improves our ability to study disease pathobiology from whole blood gene expression assayed on the popular Affymetrix Gene ST platform by allowing a more complete study of the various components of this complex tissue without the need for additional data collection. 
Future use of the model may allow for novel insights to be generated from the ~400 Affymetrix Gene ST blood gene expression datasets currently available on the Gene Expression Omnibus (GEO) website.


Subjects
Blood Cells/cytology , Blood Cells/metabolism , Gene Expression Profiling , Genomics/methods , Machine Learning , Humans , Statistical Models
12.
Respir Res ; 18(1): 72, 2017 04 24.
Article in English | MEDLINE | ID: mdl-28438154

ABSTRACT

BACKGROUND: Chronic obstructive pulmonary disease (COPD) is currently the third leading cause of death and there is a huge unmet clinical need to identify disease biomarkers in peripheral blood. Compared to gene level differential expression approaches to identify gene signatures, network analyses provide a biologically intuitive approach which leverages the co-expression patterns in the transcriptome to identify modules of co-expressed genes. METHODS: A weighted gene co-expression network analysis (WGCNA) was applied to peripheral blood transcriptome from 238 COPD subjects to discover co-expressed gene modules. We then determined the relationship between these modules and forced expiratory volume in 1 s (FEV1). In a second, independent cohort of 381 subjects, we determined the preservation of these modules and their relationship with FEV1. For those modules that were significantly related to FEV1, we determined the biological processes as well as the blood cell-specific gene expression that were over-represented using additional external datasets. RESULTS: Using WGCNA, we identified 17 modules of co-expressed genes in the discovery cohort. Three of these modules were significantly correlated with FEV1 (FDR < 0.1). In the replication cohort, these modules were highly preserved and their FEV1 associations were reproducible (P < 0.05). Two of the three modules were negatively related to FEV1 and were enriched in IL8 and IL10 pathways and correlated with neutrophil-specific gene expression. The positively related module, on the other hand, was enriched in DNA transcription and translation and was strongly correlated to CD4+, CD8+ T cell-specific gene expression. CONCLUSIONS: Network based approaches are promising tools to identify potential biomarkers for COPD. TRIAL REGISTRATION: The ECLIPSE study was funded by GlaxoSmithKline, under ClinicalTrials.gov identifier NCT00292552 and GSK No. SCO104960.


Subjects
Cytokines/blood , Cytokines/genetics , Gene Expression Profiling/methods , Metabolic Networks and Pathways/genetics , Genetic Models , Chronic Obstructive Pulmonary Disease/blood , Chronic Obstructive Pulmonary Disease/genetics , Adult , Aged , Biomarkers/blood , Computer Simulation , Female , Humans , Male , Middle Aged , Chronic Obstructive Pulmonary Disease/diagnosis , Reproducibility of Results , Sensitivity and Specificity
13.
BMC Bioinformatics ; 17(1): 460, 2016 Nov 14.
Article in English | MEDLINE | ID: mdl-27842512

ABSTRACT

BACKGROUND: Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how the outputs of these computational methods are sensitive to the input sample set, or stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. RESULTS: The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permutated gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. CONCLUSIONS: The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.


Subjects
Computational Biology/methods , Gene Regulatory Networks , Algorithms , Gene Expression Profiling , Humans , Software , Systems Biology , Transcriptome
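
Entry 13 assesses module stability by re-running module discovery on bootstrap re-samples of the input samples. The abstract does not give the SABRE similarity measure or its stability criterion, so the sketch below is a generic stand-in: it scores a reference module by its mean best-match Jaccard similarity across bootstrap runs. The Jaccard choice, the `find_modules` callback, and all names are assumptions, not the published procedure.

```python
import random

def jaccard(a, b):
    """Jaccard similarity of two gene sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def bootstrap_stability(samples, find_modules, reference_module, n_boot=100, seed=0):
    """Mean best-match similarity of reference_module across bootstrap re-samples.

    find_modules is a caller-supplied function (hypothetical here) that takes a
    list of samples and returns a list of modules, each an iterable of gene names.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        # Re-sample the sample set with replacement, keeping its size.
        resampled = [rng.choice(samples) for _ in samples]
        modules = find_modules(resampled)
        best = max((jaccard(reference_module, m) for m in modules), default=0.0)
        scores.append(best)
    return sum(scores) / len(scores)

# Toy discovery function that always returns the same two modules.
def toy_find_modules(resampled):
    return [{"g1", "g2", "g3"}, {"g4", "g5"}]

stable = bootstrap_stability(list(range(10)), toy_find_modules, {"g1", "g2", "g3"}, n_boot=20)
partial = bootstrap_stability(list(range(10)), toy_find_modules, {"g1", "g2"}, n_boot=20)
```

Running the same procedure on permuted expression data would give the kind of baseline the abstract describes for random modules.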
14.
Thorax ; 71(3): 216-22, 2016 Mar.
Article in English | MEDLINE | ID: mdl-25777587

ABSTRACT

BACKGROUND: Despite the significant morbidity and mortality related to pulmonary exacerbations in cystic fibrosis (CF), there remains no reliable predictor of imminent exacerbation. OBJECTIVE: To identify blood-based biomarkers to predict imminent (<4 months from stable blood draw) CF pulmonary exacerbations using targeted proteomics. METHODS: 104 subjects provided plasma samples when clinically stable and were randomly split into discovery (n=70) and replication (n=34) cohorts. Multiple reaction monitoring mass spectrometry (MRM-MS) was used to measure 117 peptides (79 proteins) from plasma. Plasma proteins with differential abundance between subjects who did versus did not develop an imminent exacerbation were analysed and proteins with fold difference >1.5 between the groups were included in an MRM-MS classifier model to predict imminent exacerbations. Performance characteristics were compared with clinical predictors and candidate plasma protein biomarkers. RESULTS: Six proteins were included in the final MRM-MS protein panel. The area under the curve (AUC) for the prediction of imminent exacerbations was highest for the MRM-MS protein panel (AUC 0.74) in comparison to FEV1% predicted (AUC 0.55) and the top candidate plasma protein biomarkers, including C-reactive protein (AUC 0.61) and interleukin-6 (AUC 0.60). The MRM-MS protein panel performed similarly in the replication cohort (AUC 0.73). CONCLUSIONS: Using MRM-MS, a six-protein panel measured from plasma can distinguish individuals with versus without an imminent exacerbation. With further replication and assay development, this biomarker panel may be clinically applicable for prediction of exacerbations in individuals with CF.


Subjects
Biomarkers/blood , Blood Proteins/analysis , Cystic Fibrosis/blood , Mass Spectrometry/methods , Physiologic Monitoring/methods , Proteomics/methods , Adult , Disease Progression , Female , Follow-Up Studies , Humans , Male , Retrospective Studies , Time Factors
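
Entry 14 compares predictors by area under the ROC curve (AUC). A compact way to see what that number means is the pairwise formulation: the AUC equals the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as half. The sketch below uses that identity; the scores are invented and do not come from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for subjects with and without an imminent exacerbation.
with_exacerbation = [0.9, 0.8, 0.4]
without_exacerbation = [0.5, 0.3]
panel_auc = auc(with_exacerbation, without_exacerbation)
```

An AUC of 0.5 corresponds to chance-level ranking, which gives context for values such as 0.55 and 0.74 quoted in the abstract.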
15.
Nicotine Tob Res ; 18(9): 1903-9, 2016 09.
Article in English | MEDLINE | ID: mdl-27154971

ABSTRACT

INTRODUCTION: Smoking is the number one modifiable environmental risk factor for chronic obstructive pulmonary disease (COPD). Clinical, epidemiological and increasingly "omics" studies assess or adjust for current smoking status using only self-report, which may be inaccurate. Objective measures such as exhaled carbon monoxide (eCO) may also be problematic owing to limitations in the measurements and the relatively short half life of the molecule. In this study, we determined the impact of different case definitions of current cigarette smoking on gene expression in peripheral blood of patients with COPD. METHODS: Peripheral blood gene expression from 573 former- and current-smokers with COPD in the ECLIPSE study was used to find genes whose expression was associated with smoking status. Current smoking was defined using self-report, eCO concentrations, or both. Linear regression was used to determine the association of current smoking status with gene expression adjusting for age, sex and propensity score. Pathway enrichment analyses were performed on genes with P < .001. RESULT: Using self-report or eCO, only two genes were differentially expressed between current and ex-smokers, with no enrichment in biological processes. When current smoking was defined using both eCO and self-report, four genes were differentially expressed (LRRN3, PID1, FUCA1, GPR15) with enrichment in 40 biological pathways related to metabolic processes, response to hypoxia and hormonal stimulus. Additionally, the combined definition provided better distributions of test statistics for differential gene expression. CONCLUSION: A combined phenotype of eCO and self report allows for better discovery of genes and pathways related to current smoking. IMPLICATIONS: Studies relying only on self report of smoking status to assess or adjust for the impact of smoking may not fully capture its effect and will lead to residual confounding of results.


Subjects
Pulmonary Disease, Chronic Obstructive/etiology , Self Report , Smoking/genetics , Adult , Aged , Carbon Monoxide/analysis , Carrier Proteins/genetics , Female , Gene Expression , Humans , Male , Membrane Glycoproteins , Membrane Proteins/genetics , Middle Aged , Neoplasm Proteins/genetics , Phenotype , Receptors, G-Protein-Coupled/genetics , Receptors, Peptide/genetics , Risk Factors , Smoking/adverse effects , Smoking/blood , Transcriptome , alpha-L-Fucosidase/genetics
16.
Am J Respir Crit Care Med ; 192(10): 1162-70, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26176936

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is one of the major causes of morbidity and mortality in the world. Regrettably, there are no biomarkers to objectively diagnose COPD exacerbations, which are the major drivers of hospitalization and deaths from COPD. Moreover, there are no biomarkers to guide therapeutic choices or to risk stratify patients for imminent exacerbations and no objective biomarkers of disease activity or disease progression. Although there has been a tremendous investment in COPD biomarker discovery over the past 2 decades, clinical translation and implementation have not matched these efforts. In this article, we outline the challenges of biomarker development in COPD and provide an overview of a developmental pipeline that may be able to surmount these challenges and bring novel biomarker solutions to accelerate therapeutic discoveries and to improve the care and outcomes of the millions of individuals worldwide with COPD.


Subjects
Genetic Markers , Precision Medicine/methods , Pulmonary Disease, Chronic Obstructive/genetics , Disease Progression , Gene Expression Profiling , Humans , Metabolomics/methods , Prognosis , Proteomics/methods , Pulmonary Disease, Chronic Obstructive/drug therapy , Pulmonary Disease, Chronic Obstructive/physiopathology , Risk Assessment/methods
17.
PLoS Comput Biol ; 9(4): e1002963, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23592955

ABSTRACT

Recent technical advances in the field of quantitative proteomics have stimulated a large number of biomarker discovery studies of various diseases, providing avenues for new treatments and diagnostics. However, inherent challenges have limited the successful translation of candidate biomarkers into clinical use, thus highlighting the need for a robust analytical methodology to transition from biomarker discovery to clinical implementation. We have developed an end-to-end computational proteomic pipeline for biomarker studies. At the discovery stage, the pipeline emphasizes different aspects of experimental design, appropriate statistical methodologies, and quality assessment of results. At the validation stage, the pipeline focuses on the migration of the results to a platform appropriate for external validation, and the development of a classifier score based on corroborated protein biomarkers. At the last stage towards clinical implementation, the main aims are to develop and validate an assay suitable for clinical deployment, and to calibrate the biomarker classifier using the developed assay. The proposed pipeline was applied to a biomarker study in cardiac transplantation aimed at developing a minimally invasive clinical test to monitor acute rejection. Starting with an untargeted screening of the human plasma proteome, five candidate biomarker proteins were identified. Rejection-regulated proteins reflect cellular and humoral immune responses, acute phase inflammatory pathways, and lipid metabolism biological processes. A multiplex multiple reaction monitoring mass-spectrometry (MRM-MS) assay was developed for the five candidate biomarkers and validated by enzyme-linked immunosorbent (ELISA) and immunonephelometric (INA) assays. A classifier score based on corroborated proteins demonstrated that the developed MRM-MS assay provides an appropriate methodology for an external validation, which is still in progress.
Plasma proteomic biomarkers of acute cardiac rejection may offer a relevant post-transplant monitoring tool to effectively guide clinical care. The proposed computational pipeline is highly applicable to a wide range of biomarker proteomic studies.


Subjects
Biomarkers/analysis , Blood Proteins/analysis , Computational Biology/methods , Heart Transplantation , Proteomics/methods , Calibration , Cohort Studies , Enzyme-Linked Immunosorbent Assay , Graft Rejection , Heart Failure/therapy , Humans , Inflammation , Mass Spectrometry , Proteome/analysis
18.
Commun Med (Lond) ; 4(1): 69, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589545

ABSTRACT

BACKGROUND: Patients with cancer often have unmet psychosocial needs. Early detection of who requires referral to a counsellor or psychiatrist may improve their care. This work used natural language processing to predict which patients will see a counsellor or psychiatrist from a patient's initial oncology consultation document. We believe this is the first use of artificial intelligence to predict psychiatric outcomes from non-psychiatric medical documents. METHODS: This retrospective prognostic study used data from 47,625 patients at BC Cancer. We analyzed initial oncology consultation documents using traditional and neural language models to predict whether patients would see a counsellor or psychiatrist in the 12 months following their initial oncology consultation. RESULTS: Here, we show our best models achieved a balanced accuracy (receiver-operating-characteristic area-under-curve) of 73.1% (0.824) for predicting seeing a psychiatrist, and 71.0% (0.784) for seeing a counsellor. Different words and phrases are important for predicting each outcome. CONCLUSION: These results suggest natural language processing can be used to predict psychosocial needs of patients with cancer from their initial oncology consultation document. Future research could extend this work to predict the psychosocial needs of medical patients in other settings.


Patients with cancer often need support for their mental health. Early detection of who requires referral to a counsellor or psychiatrist may improve their care. This study trained a type of artificial intelligence (AI) called natural language processing to read the consultation report an oncologist writes after they first see a patient to predict which patients will see a counsellor or psychiatrist. The AI predicted this with performance similar to other uses of AI in mental health, and used different words and phrases to predict who would see a psychiatrist compared to seeing a counsellor. We believe this is the first use of AI to predict mental health outcomes from medical documents written by clinicians outside of mental health. This study suggests this type of AI can predict the mental health needs of patients with cancer from this widely available document.

19.
Proc Natl Acad Sci U S A ; 107(39): 17053-8, 2010 Sep 28.
Article in English | MEDLINE | ID: mdl-20833815

ABSTRACT

Signal transduction networks can be perturbed biochemically, genetically, and pharmacologically to unravel their functions. But at the systems level, it is not clear how such perturbations are best implemented to extract molecular mechanisms that underlie network function. Here, we combined pairwise perturbations with multiparameter phosphorylation measurements to reveal causal mechanisms within the signaling network response of cardiomyocytes to coxsackievirus B3 (CVB3) infection. Using all possible pairs of six kinase inhibitors, we assembled a dynamic nine-protein phosphorylation signature of perturbed CVB3 infectivity. Cluster analysis of the resulting dataset showed repeatedly that paired inhibitor data were required for accurate data-driven predictions of kinase substrate links in the host network. With pairwise data, we also derived a high-confidence network based on partial correlations, which identified phospho-IκBα as a central "hub" in the measured phosphorylation signature. The reconstructed network helped to connect phospho-IκBα with an autocrine feedback circuit in host cells involving the proinflammatory cytokines, TNF and IL-1. Autocrine blockade substantially inhibited CVB3 progeny release and improved host cell viability, implicating TNF and IL-1 as cell autonomous components of CVB3-induced myocardial damage. We conclude that pairwise perturbations, when combined with network-level intracellular measurements, enrich for mechanisms that would be overlooked by single perturbants.
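The abstract above relies on partial correlations to separate direct links from associations explained by a shared driver. As a rough illustration only, the sketch below computes a first-order partial correlation for three variables using the standard closed-form expression. The function names and the toy values are assumptions made for this example; the abstract does not specify how the study computed partial correlations across its nine-protein signature.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy data (hypothetical): z drives both x and y, plus independent noise.
z = [-1.0, -1.0, 1.0, 1.0]
x = [-0.5, -1.5, 1.5, 0.5]
y = [-0.5, -1.5, 0.5, 1.5]

raw = pearson(x, y)              # strong apparent association
direct = partial_corr(x, y, z)   # association left once z is accounted for
print(f"raw r = {raw:.3f}, partial r given z = {direct:.3f}")
```

In this toy case the raw correlation between x and y is high, yet it vanishes once z is controlled for, which is the kind of indirect link a partial-correlation network is meant to filter out.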


Subjects
Enterovirus B, Human , Enterovirus Infections/metabolism , Host-Pathogen Interactions , Metabolic Networks and Pathways , Myocytes, Cardiac/virology , Cell Line , Humans , Interleukin-1/metabolism , Myocytes, Cardiac/drug effects , Myocytes, Cardiac/metabolism , Phosphorylation , Protein Kinase Inhibitors/pharmacology , Signal Transduction , Tumor Necrosis Factor-alpha/metabolism
20.
JAMA Netw Open ; 6(2): e230813, 2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36848085

ABSTRACT

Importance: Predicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer. Objective: To investigate whether natural language processing can predict survival of patients with general cancer from a patient's initial oncologist consultation document. Design, Setting, and Participants: This retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded. Exposures: Initial oncologist consultation documents were analyzed using traditional and neural language models. Main Outcomes and Measures: The primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used. Results: Of the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found. 
Conclusions and Relevance: These findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
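
The two performance measures reported in this abstract, balanced accuracy and receiver operating characteristic area under the curve (AUC), can be sketched in a few lines. The example below is a minimal illustration on made-up labels and scores, not the study's data or code; the function names and the 0.5 decision threshold are assumptions chosen for the example.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)

def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is equivalent to the
    area under the ROC curve but is quadratic in the number of samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = survived, 0 = did not survive.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]        # model-predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(balanced_accuracy(y_true, y_pred))  # 0.625
print(roc_auc(y_true, scores))            # 0.875
```

Balanced accuracy depends on the chosen threshold and weights both classes equally, which matters when most patients fall in one class (for instance, 87.0% surviving 6 months); AUC is threshold-free and summarizes how well the scores rank positives above negatives.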


Subjects
Natural Language Processing , Neoplasms , Humans , Female , Male , Middle Aged , Aged , Retrospective Studies , Neoplasms/therapy , Medical Oncology , Referral and Consultation