Results 1 - 20 of 139
1.
Article in English | MEDLINE | ID: mdl-37885703

ABSTRACT

We describe a collaborative project involving faculty and students in a university bioinformatics/biostatistics center. The project focuses on identification of differentially expressed gene sets ("pathways") in subjects expressing a disease state, medical intervention, or other distinguishable condition. The key feature of the endeavor is the data structure presented to the team: a single cohort of subjects with two samples taken from each subject - one for each of two differing conditions without replication. This particular structure leads to essentially a cohort of 2×2 contingency tables, where each table compares the differential gene state with the pathway condition. Recognizing that correlations both within and between pathway responses can disrupt standard 2×2 table analytics, we develop methods for analyzing this data structure in the presence of complicated intra-table correlations. These provide some convenient approaches for this problem, using design effect adjustments from sample survey theory and manipulations of the summary 2×2 table counts. Monte Carlo simulations show that the methods operate extremely well, validating their use in practice. In the end, the collaborative connections among the team members led to solutions no one of us would have envisioned separately.
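
The abstract does not spell out its adjustment, so the following is only a minimal sketch of the general idea of a design-effect correction on a single 2×2 table. It assumes the generic Kish design effect and a simple deflation of the Pearson chi-square statistic, which is not necessarily the authors' method. The counts, `cluster_size`, and `icc` are hypothetical values chosen for illustration.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def pearson_chi2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def kish_design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters with intra-cluster correlation icc."""
    return 1 + (cluster_size - 1) * icc


def chi2_1df_pvalue(x2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))


def adjusted_test(a, b, c, d, deff):
    """Deflate the chi-square statistic by the design effect, then get a p-value."""
    x2_adj = pearson_chi2(a, b, c, d) / deff
    return x2_adj, chi2_1df_pvalue(x2_adj)


# Hypothetical counts: rows = gene in pathway / not in pathway,
# columns = differentially expressed / not differentially expressed.
table = (30, 70, 170, 730)
deff = kish_design_effect(cluster_size=5, icc=0.25)  # hypothetical inputs
naive_p = chi2_1df_pvalue(pearson_chi2(*table))
x2_adj, adj_p = adjusted_test(*table, deff)
print(f"OR={odds_ratio(*table):.2f} naive p={naive_p:.4f} adjusted p={adj_p:.4f}")
```

Dividing the statistic by a design effect larger than 1 makes the test more conservative, which is the intended behavior when gene-level responses within a pathway are positively correlated.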

2.
Eur J Respir Med ; 5(1): 359-371, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38390497

ABSTRACT

Background: A limited pool of SNPs is linked to the development and severity of sarcoidosis, a systemic granulomatous inflammatory disease. By integrating genome-wide association study (GWAS) data and expression quantitative trait loci (eQTL) single nucleotide polymorphisms (SNPs), we aimed to identify novel sarcoidosis SNPs potentially influencing the development of complicated sarcoidosis. Methods: A GWAS (Affymetrix 6.0) involving 209 African-American (AA) and 193 European-American (EA) subjects (75 and 51 complicated cases, respectively) and publicly available GWAS controls (GAIN) was utilized. Annotation of multi-tissue eQTL SNPs present on the GWAS created a pool of ~46,000 eQTL SNPs examined for association with sarcoidosis risk and severity (Logistic Model, Plink). The most significant EA/AA eQTL SNPs were genotyped in a sarcoidosis validation cohort (n=1034) and cross-validated in two independent GWAS cohorts. Results: No single GWAS SNP achieved significance (p < 1×10⁻⁸); however, analysis of the eQTL/GWAS SNP pool yielded 621 eQTL SNPs (p < 10⁻⁴) associated with 730 genes that highlighted innate immunity, MHC Class II, and allograft rejection pathways, with multiple SNPs validated in an independent sarcoidosis cohort (105 SNPs analyzed) (NOTCH4, IL27RA, BTNL2, ANXA11, HLA-DRB1). These studies confirm significant association of eQTL/GWAS SNPs in EAs and AAs with sarcoidosis risk and severity (complicated sarcoidosis) involving the HLA region and innate immunity. Conclusion: Despite the challenge of deciphering the genetic basis for sarcoidosis risk/severity, these results suggest that integrated eQTL/GWAS approaches may identify novel variants/genes and support the contribution of dysregulated innate immune responses to sarcoidosis severity.

3.
JCI Insight ; 7(22)2022 11 22.
Article in English | MEDLINE | ID: mdl-36166305

ABSTRACT

Disseminated coccidioidomycosis (DCM) is caused by Coccidioides, pathogenic fungi endemic to the southwestern United States and Mexico. Illness occurs in approximately 30% of those infected, less than 1% of whom develop disseminated disease. To address why some individuals allow dissemination, we enrolled patients with DCM and performed whole-exome sequencing. In an exploratory set of 67 patients with DCM, 2 had haploinsufficient STAT3 mutations, and defects in β-glucan sensing and response were seen in 34 of 67 cases. Damaging CLEC7A and PLCG2 variants were associated with impaired production of β-glucan-stimulated TNF-α from PBMCs compared with healthy controls. Using ancestry-matched controls, damaging CLEC7A and PLCG2 variants were overrepresented in DCM, including CLEC7A Y238* and PLCG2 R268W. A validation cohort of 111 patients with DCM confirmed the PLCG2 R268W, CLEC7A I223S, and CLEC7A Y238* variants. Stimulation with a DECTIN-1 agonist induced DUOX1/DUOXA1-derived hydrogen peroxide (H2O2) in transfected cells. Heterozygous DUOX1 or DUOXA1 variants that impaired H2O2 production were overrepresented in discovery and validation cohorts. Patients with DCM have impaired β-glucan sensing or response affecting TNF-α and H2O2 production. Impaired Coccidioides recognition and decreased cellular response are associated with disseminated coccidioidomycosis.


Subject(s)
Coccidioidomycosis; beta-Glucans; Humans; Tumor Necrosis Factor-alpha/genetics; Hydrogen Peroxide; Coccidioidomycosis/genetics; Coccidioidomycosis/epidemiology; Coccidioidomycosis/microbiology; Coccidioides/genetics
4.
J Allergy Clin Immunol ; 150(3): 604-611, 2022 09.
Article in English | MEDLINE | ID: mdl-35367470

ABSTRACT

BACKGROUND: The study of pathogenic mechanisms in adult asthma is often marred by a lack of precise information about the natural history of the disease. Children who have persistent wheezing (PW) during the first 6 years of life and whose symptoms start before age 3 years (PW+) are much more likely to have wheezing illnesses due to rhinovirus (RV) in infancy and to have asthma into adult life than are those who do not have PW (PW-). OBJECTIVE: Our aim was to determine whether nasal epithelial cells from PW+ asthmatic adults as compared with cells from PW- asthmatic adults show distinct biomechanistic processes activated by RV exposure. METHODS: Air-liquid interface cultures derived from nasal epithelial cells of 36-year-old participants with active asthma with and without a history of PW in childhood (10 PW+ participants and 20 PW- participants) from the Tucson Children's Respiratory Study were challenged with a human RV-A strain (RV-A16) or control, and their RNA was sequenced. RESULTS: A total of 35 differentially expressed genes involved in extracellular remodeling and angiogenesis distinguished the PW+ group from the PW- group at baseline and after RV-A stimulation. Notably, 22 transcriptomic pathways showed PW-by-RV interactions; the pathways were invariably overactivated in PW+ patients, and were involved in Toll-like receptor- and cytokine-mediated responses, remodeling, and angiogenic processes. CONCLUSIONS: Asthmatic adults with a history of persistent wheeze in the first 6 years of life have specific biomolecular alterations in response to RV-A that are not present in patients without such a history. Targeting these mechanisms may slow the progression of asthma in these patients.


Subject(s)
Asthma; Enterovirus Infections; Picornaviridae Infections; Adult; Asthma/diagnosis; Child; Child, Preschool; Epithelial Cells; Humans; Phenotype; Respiratory Sounds; Rhinovirus/genetics
5.
Bioinformatics ; 37(Suppl_1): i67-i75, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252934

ABSTRACT

MOTIVATION: Identifying altered transcripts between very small human cohorts is particularly challenging and is compounded by the low accrual rate of human subjects in rare diseases or sub-stratified common disorders. Yet, single-subject studies (S3) can compare paired transcriptome samples drawn from the same patient under two conditions (e.g. treated versus pre-treatment) and suggest patient-specific responsive biomechanisms based on the overrepresentation of functionally defined gene sets. These improve statistical power by: (i) reducing the total features tested and (ii) relaxing the requirement of within-cohort uniformity at the transcript level. We propose Inter-N-of-1, a novel method, to identify meaningful differences between very small cohorts by using the effect size of 'single-subject-study'-derived responsive biological mechanisms. RESULTS: In each subject, Inter-N-of-1 requires applying previously published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g. diseased versus unaffected tissues) for determining patient-specific enriched gene sets: odds ratios (S3-OR) and S3-variance using Gene Ontology Biological Processes. To evaluate small cohorts, we calculated the precision and recall of Inter-N-of-1 and that of a control method (GLM+EGS) when comparing two cohorts of decreasing sizes (from 20 versus 20 to 2 versus 2) in a comprehensive six-parameter simulation and in a proof-of-concept clinical dataset. In simulations, the Inter-N-of-1 median precision and recall are > 90% and > 75% in cohorts of 3 versus 3 distinct subjects (regardless of the parameter values), whereas conventional methods outperform Inter-N-of-1 at sample sizes 9 versus 9 and larger. Similar results were obtained in the clinical proof-of-concept dataset. AVAILABILITY AND IMPLEMENTATION: R software is available at Lussierlab.net/BSSD.


Subject(s)
Gene Expression Profiling; Rare Diseases; Gene Ontology; Humans; Rare Diseases/genetics; Transcriptome
6.
BMJ Health Care Inform ; 28(1)2021 May.
Article in English | MEDLINE | ID: mdl-33980502

ABSTRACT

OBJECTIVES: Prior research has reported an increased risk of fatality for patients with cancer, but most studies investigated the risk by comparing cancer to non-cancer patients among COVID-19 infections, where cancer might have contributed to the increased risk. This study aims to estimate the hazard ratio (HR) of fatality imposed by COVID-19 while controlling for covariates, such as age, sex, metastasis status and cancer type. METHODS: We conducted survival analyses of 4606 cancer patients with COVID-19 test results from 16 March to 11 October 2020 in UK Biobank and estimated the overall HR of fatality with and without COVID-19 infection. We also examined the HRs of 13 specific cancer types with at least 100 patients using a stratified analysis. RESULTS: COVID-19 resulted in an overall HR of 7.76 (95% CI 5.78 to 10.40, p < 10⁻¹⁰) by following 4606 patients with cancer for 21 days after the tests. The HR varied among cancer types, with over a 10-fold increase in fatality rate (false discovery rate ≤0.02) for melanoma, haematological malignancies, uterine cancer and kidney cancer. Although COVID-19 imposed a higher risk for localised versus distant metastasis cancers, cancers with distant metastases yielded higher overall fatality rates due to their multiplicative effects. DISCUSSION: The results confirmed prior reports of the increased risk of fatality for patients with COVID-19 plus haematological malignancies and demonstrated similar findings of COVID-19 on melanoma, uterine, and kidney cancers. CONCLUSION: The results highlight the heightened risk that COVID-19 imposes on localised and haematological cancer patients and the necessity to vaccinate uninfected patients with cancer promptly, particularly for the cancer types most influenced by COVID-19. Results also suggest the importance of timely care for patients with localised cancer, whether they are infected by COVID-19 or not.
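
As a reading aid, the reported hazard ratio and confidence interval can be cross-checked with a few lines of arithmetic. The sketch below assumes the interval is a Wald interval that is symmetric on the log scale, which is typical for Cox-type models but is not stated in the abstract; the function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR) implied by a 95% Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def wald_z(hr, lower, upper):
    """Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / se_from_ci(lower, upper)


def two_sided_p(z):
    """Two-sided p-value from a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


hr, ci_low, ci_high = 7.76, 5.78, 10.40  # values reported in the abstract
midpoint = math.sqrt(ci_low * ci_high)   # geometric midpoint of the interval
z = wald_z(hr, ci_low, ci_high)
p = two_sided_p(z)
print(f"geometric midpoint={midpoint:.2f}, z={z:.2f}, p={p:.1e}")
```

Under that assumption the geometric midpoint of the interval lands very close to the reported HR, and the implied p-value is far below the stated bound of 10⁻¹⁰, so the three reported numbers are mutually consistent.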


Subject(s)
COVID-19/mortality; Health Status; Neoplasms/mortality; Public Health Surveillance; Adult; Aged; Aged, 80 and over; Female; Humans; Male; Neoplasms/pathology; Risk Assessment; Risk Factors; Survival Analysis; Young Adult
7.
Transl Res ; 228: 1-12, 2021 02.
Article in English | MEDLINE | ID: mdl-32711186

ABSTRACT

Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive interstitial lung disease of unknown etiology that poses significant challenges in early diagnosis and prediction of progression. Analyses of microRNA and gene expression in IPF have yielded potentially predictive information. However, the relationship between microRNA/gene expression and quantitative phenotypic value in IPF remains controversial, as is the added value of this approach to current molecular signatures in IPF. We sought to identify biomarkers predictive of survival in IPF via a microRNA-driven strategy. We profiled microRNA and protein-coding gene expression in peripheral blood mononuclear cells from 70 IPF subjects in a discovery cohort. We linked the microRNA/gene expression level with the quantitative phenotypic variation in IPF, including diffusing capacity of the lung for carbon monoxide and the forced vital capacity percent predicted. In silico analyses of expression profiles and quantitative phenotypic data allowed the generation of 2 sets of IPF molecular signatures (unique for microRNAs and protein-coding genes) that predict IPF survival. Each signature performed well in a validation cohort composed of IPF patients aggregated from distinct patient populations recruited from different sites. A resampling test suggests that the protein-coding gene-based signature is comparable and potentially superior to published IPF prognostic gene signatures. In conclusion, these results highlight the utility of microRNA-driven peripheral blood molecular signatures as valuable and novel biomarkers associated with individuals at high survival risk and for potentially facilitating individualized therapies in this enigmatic disorder.


Subject(s)
Gene Expression Profiling; Idiopathic Pulmonary Fibrosis/genetics; MicroRNAs/genetics; Proteins/genetics; Aged; Biomarkers/metabolism; Case-Control Studies; Female; Humans; Leukocytes, Mononuclear/metabolism; Male; Middle Aged; Prognosis; Reproducibility of Results; Survival Analysis
8.
BMC Bioinformatics ; 21(1): 495, 2020 Nov 02.
Article in English | MEDLINE | ID: mdl-33138767

ABSTRACT

An amendment to this paper has been published and can be accessed via the original article.

9.
BMC Bioinformatics ; 21(1): 374, 2020 Aug 28.
Article in English | MEDLINE | ID: mdl-32859146

ABSTRACT

BACKGROUND: In this era of data science-driven bioinformatics, machine learning research has focused on feature selection as users want more interpretation and post-hoc analyses for biomarker detection. However, when there are more features (i.e., transcripts) than samples (i.e., mice or human samples) in a study, it poses major statistical challenges in biomarker detection tasks as traditional statistical techniques are underpowered in high dimension. Second and third order interactions of these features pose a substantial combinatoric dimensional challenge. In computational biology, random forest (RF) classifiers are widely used due to their flexibility, powerful performance, their ability to rank features, and their robustness to the "P >> N" high-dimensional limitation that many matrix regression algorithms face. We propose binomialRF, a feature selection technique in RFs that provides an alternative interpretation for features using a correlated binomial distribution and scales efficiently to analyze multiway interactions. RESULTS: In both simulations and validation studies using datasets from the TCGA and UCI repositories, binomialRF showed computational gains (5 to 300 times faster) while maintaining competitive variable precision and recall in identifying biomarkers' main effects and interactions. In two clinical studies, the binomialRF algorithm prioritizes previously published relevant pathological molecular mechanisms (features) with high classification precision and recall using features alone, as well as with their statistical interactions alone. CONCLUSION: binomialRF extends previous methods for identifying interpretable features in RFs and brings them together under a correlated binomial distribution to create an efficient hypothesis testing algorithm that identifies biomarkers' main effects and interactions. Preliminary results in simulations demonstrate computational gains while retaining competitive model selection and classification accuracies. Future work will extend this framework to incorporate ontologies that provide pathway-level feature selection from gene expression input data.
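
To make the hypothesis-testing idea concrete, here is a deliberately simplified sketch. It assumes an ordinary (independent) binomial null in which each of P features is equally likely to be chosen as a tree's root split, with a Bonferroni correction; binomialRF itself uses a correlated binomial distribution, which is not reproduced here. The feature names and counts are hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def select_features(root_counts, n_trees, alpha=0.05):
    """Keep features chosen as root split more often than chance would explain.

    root_counts maps feature name -> number of trees whose root split uses it.
    Null assumption (a simplification): each feature is the root with
    probability 1 / number_of_features, independently across trees.
    """
    n_features = len(root_counts)
    p_null = 1 / n_features
    threshold = alpha / n_features  # Bonferroni correction
    selected = {}
    for feature, count in root_counts.items():
        p_value = binom_upper_tail(count, n_trees, p_null)
        if p_value <= threshold:
            selected[feature] = p_value
    return selected


# Hypothetical root-split counts from a 100-tree forest over 10 features.
counts = {"g1": 40, "g2": 25, "g3": 5, "g4": 5, "g5": 5,
          "g6": 4, "g7": 4, "g8": 4, "g9": 4, "g10": 4}
selected = select_features(counts, n_trees=100)
print(sorted(selected))
```

Because trees in a forest share training data, the independence assumption in this sketch is optimistic, which is the motivation the abstract gives for a correlated binomial model.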


Subject(s)
Algorithms; Biomarkers/metabolism; Biomarkers, Tumor/metabolism; Breast Neoplasms/diagnosis; Computational Biology/methods; Female; Humans; Kidney Neoplasms/diagnosis
11.
J Pers Med ; 11(1)2020 Dec 31.
Article in English | MEDLINE | ID: mdl-33396440

ABSTRACT

Background: Developing patient-centric baseline standards that enable the detection of clinically significant outlier gene products on a genome-scale remains an unaddressed challenge required for advancing personalized medicine beyond the small pools of subjects implied by "precision medicine". This manuscript proposes a novel approach for reference standard development to evaluate the accuracy of single-subject analyses of transcriptomes and offers extensions into proteomes and metabolomes. In evaluation frameworks for which the distributional assumptions of statistical testing imperfectly model genome dynamics of gene products, artefacts and biases are confounded with authentic signals. Model confirmation biases escalate when studies use the same analytical methods in the discovery sets and reference standards. In such studies, replicated biases are confounded with measures of accuracy. We hypothesized that developing method-agnostic reference standards would reduce such replication biases. We propose to evaluate discovery methods with a reference standard derived from a consensus of analytical methods distinct from the discovery one to minimize statistical artefact biases. Our methods involve thresholding effect-size and expression-level filtering of results to improve consensus between analytical methods. We developed and released an R package "referenceNof1" to facilitate the construction of robust reference standards. Results: Since the assumptions of RNA-Seq data analysis methods range from binomial and negative binomial models to non-parametric analyses, the differences create statistical noise and make the reference standards method-dependent. In our experimental design, the accuracy of 30 distinct combinations of fold changes (FC) and expression counts (hereinafter "expression") was determined for five types of RNA analyses. This design was applied to two distinct datasets: breast cancer cell lines and a yeast study with isogenic biological replicates in two experimental conditions. Furthermore, the reference standard (RS) comprised all RNA analytical methods with the exception of the method being tested for accuracy. To mitigate biases towards a specific analytical method, the pairwise Jaccard Concordance Index between observed results of distinct analytical methods was calculated for optimization. Optimization through thresholding effect-size and expression-level reduced the greatest discordances between distinct methods' analytical results and resulted in a 65% increase in concordance. Conclusions: We have demonstrated that comparing accuracies of different single-subject analysis methods for clinical optimization in transcriptomics requires a new evaluation framework. Reliable and robust reference standards, independent of the evaluated method, can be obtained under a limited number of parameter combinations: fold change (FC) range thresholds, expression level cutoffs, and exclusion of the tested method from the RS development process. When applying anticonservative reference standard frameworks (e.g., using the same method for RS development and prediction), most of the concordant signal between prediction and Gold Standard (GS) cannot be confirmed by other methods, which we interpret as biased results. Statistical tests to determine DEGs from a single-subject study generate many biased results requiring subsequent filtering to increase reliability. Conventional single-subject studies pertain to one or a few patients' measures over time and require a substantial conceptual framework extension to address the numerous measures in genome-wide analyses of gene products. The proposed referenceNof1 framework addresses some of the inherent challenges for improving transcriptome scale single-subject analyses by providing a robust approach to constructing reference standards.
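
The pairwise Jaccard concordance and the leave-one-method-out reference standard described above can be illustrated as follows. This is a minimal sketch and not the referenceNof1 package API: the function names, the vote threshold `min_votes`, and the gene calls are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 1.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_jaccard(calls):
    """Jaccard index for every pair of methods (keys sorted for stable pairing)."""
    return {(m1, m2): jaccard(calls[m1], calls[m2])
            for m1, m2 in combinations(sorted(calls), 2)}


def reference_standard(calls, held_out, min_votes):
    """Genes called by at least min_votes methods, excluding the method under test."""
    votes = {}
    for method, genes in calls.items():
        if method == held_out:
            continue
        for gene in genes:
            votes[gene] = votes.get(gene, 0) + 1
    return {gene for gene, n in votes.items() if n >= min_votes}


# Hypothetical differentially expressed gene calls from three methods.
calls = {
    "edgeR": {"A", "B", "C"},
    "DESeq2": {"B", "C", "D"},
    "NOISeq": {"C", "D", "E"},
}
concordance = pairwise_jaccard(calls)
rs_for_edger = reference_standard(calls, held_out="edgeR", min_votes=2)
print(concordance, rs_for_edger)
```

Excluding the held-out method before voting is what keeps the reference standard independent of the method whose accuracy is being measured.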

12.
Oncogene ; 39(10): 2103-2117, 2020 03.
Article in English | MEDLINE | ID: mdl-31804622

ABSTRACT

Rational new strategies are needed to treat tumors resistant to kinase inhibitors. Mechanistic studies of resistance provide fertile ground for development of new approaches. Cancer drug addiction is a paradoxical resistance phenomenon, well-described in MEK-ERK-driven solid tumors, in which drug-target overexpression promotes resistance but causes a toxic overdose of signaling if the inhibitor is withdrawn. This can permit prolonged control of tumors through intermittent dosing. We and others showed previously that cancer drug addiction arises also in the hematologic malignancy ALK-positive anaplastic large-cell lymphoma (ALCL) resistant to ALK-specific tyrosine kinase inhibitors (TKIs). This is driven by the overexpression of the fusion kinase NPM1-ALK, but the mechanism by which ALK overactivity drives toxicity upon TKI withdrawal remained obscure. Here we reveal the mechanism of ALK-TKI addiction in ALCL. We interrogated the well-described mechanism of MEK/ERK pathway inhibitor addiction in solid tumors and found it does not apply to ALCL. Instead, phosphoproteomics and confirmatory functional studies revealed that STAT1 overactivation is the key mechanism of ALK-TKI addiction in ALCL. The withdrawal of TKI from addicted tumors in vitro and in vivo leads to overwhelming phospho-STAT1 activation, turning on its tumor-suppressive gene-expression program and turning off STAT3's oncogenic program. Moreover, a novel NPM1-ALK-positive ALCL PDX model showed a significant survival benefit from intermittent compared with continuous TKI dosing. In sum, we reveal for the first time the mechanism of cancer drug addiction in ALK-positive ALCL and the benefit of scheduled intermittent dosing in high-risk patient-derived tumors in vivo.


Subject(s)
Anaplastic Lymphoma Kinase/antagonists & inhibitors; Drug Resistance, Neoplasm; Lymphoma, Large-Cell, Anaplastic/physiopathology; Protein Kinase Inhibitors/pharmacology; STAT1 Transcription Factor/metabolism; Signal Transduction; Anaplastic Lymphoma Kinase/genetics; Anaplastic Lymphoma Kinase/metabolism; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Cell Line, Tumor; Gene Expression Regulation, Neoplastic; Humans; Lymphoma, Large-Cell, Anaplastic/enzymology; Lymphoma, Large-Cell, Anaplastic/genetics; Lymphoma, Large-Cell, Anaplastic/metabolism; Nucleophosmin; Protein Kinase Inhibitors/therapeutic use; Proteomics; STAT3 Transcription Factor/genetics
13.
BMC Med Genomics ; 12(Suppl 5): 96, 2019 07 11.
Article in English | MEDLINE | ID: mdl-31296218

ABSTRACT

BACKGROUND: Gene expression profiling has benefited medicine by providing clinically relevant insights at the molecular candidate and systems levels. However, to adopt a more 'precision' approach that integrates individual variability including 'omics data into risk assessments, diagnoses, and therapeutic decision making, whole transcriptome expression needs to be interpreted meaningfully for single subjects. We propose an "all-against-one" framework that uses biological replicates in isogenic conditions for testing differentially expressed genes (DEGs) in a single subject (ss) in the absence of an appropriate external reference standard or replicates. To evaluate our proposed "all-against-one" framework, we construct reference standards (RSs) with five conventional replicate-anchored analyses (NOISeq, DEGseq, edgeR, DESeq, DESeq2) and the remainder were treated separately as single-subject sample pairs for ss analyses (without replicates). RESULTS: Eight ss methods (NOISeq, DEGseq, edgeR, mixture model, DESeq, DESeq2, iDEG, and ensemble) for identifying genes with differential expression were compared in yeast (parental line versus snf2 deletion mutant; n = 42/condition) and an MCF7 breast-cancer cell line (baseline versus stimulated with estradiol; n = 7/condition). Receiver-operator characteristic (ROC) and precision-recall plots were determined for eight ss methods against each of the five RSs in both datasets. Consistent with prior analyses of these data, ~ 50% and ~ 15% DEGs were obtained in yeast and MCF7 datasets respectively, regardless of the RS method. NOISeq, edgeR, and DESeq were the most concordant for creating a RS. Single-subject versions of NOISeq, DEGseq, and an ensemble learner achieved the best median ROC-area-under-the-curve to compare two transcriptomes without replicates regardless of the RS method and dataset (> 90% in yeast, > 0.75 in MCF7). Further, distinct specific single-subject methods perform better according to different proportions of DEGs. CONCLUSIONS: The "all-against-one" framework provides an honest evaluation framework for single-subject DEG studies since these methods are evaluated, by design, against reference standards produced by unrelated DEG methods. The ss-ensemble method was the only one to reliably produce higher accuracies in all conditions tested in this conservative evaluation framework. However, single-subject methods for identifying DEGs from paired samples need improvement, as no method achieved precision > 90%, and recall levels were moderate. http://www.lussiergroup.org/publications/EnsembleBiomarker.
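
For readers unfamiliar with the evaluation metrics named above, the sketch below shows one way to score a single-subject method against a reference standard. It is an illustration only: the scores and labels are hypothetical, and the rank-based AUC formula (probability that a true DEG outscores a non-DEG, ties counted as one half) is a standard equivalent of the area under the ROC curve rather than the authors' code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(scores, labels, threshold):
    """Precision and recall when genes scoring >= threshold are called DEGs."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-gene scores from a single-subject method, and
# reference-standard labels (1 = DEG according to the reference standard).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
precision, recall = precision_recall(scores, labels, threshold=0.5)
print(f"AUC={auc:.3f} precision={precision:.2f} recall={recall:.2f}")
```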


Subject(s)
Gene Expression Profiling/methods; Precision Medicine; Gene Expression Profiling/standards; Humans; Reference Standards
14.
Front Genet ; 10: 414, 2019.
Article in English | MEDLINE | ID: mdl-31143202

ABSTRACT

RNA-Sequencing data offers an opportunity to enable precision medicine, but most methods rely on gene expression alone. To date, no methodology exists to identify and interpret alternative splicing patterns within pathways for an individual patient. This study develops methodology and conducts computational experiments to test the hypothesis that pathway aggregation of subject-specific alternatively spliced genes (ASGs) can inform upon disease mechanisms and predict survival. We propose the N-of-1-pathways Alternatively Spliced (N1PAS) method that takes an individual patient's paired-sample RNA-Seq isoform expression data (e.g., tumor vs. non-tumor, before-treatment vs. during-therapy) and pathway annotations as inputs. N1PAS quantifies the degree of alternative splicing via Hellinger distances followed by two-stage clustering to determine pathway enrichment. We provide a clinically relevant "odds ratio" along with statistical significance to quantify pathway enrichment. We validate our method in clinical samples and find that our method selects relevant pathways (p < 0.05 in 4/6 data sets). Extensive Monte Carlo studies show N1PAS powerfully detects pathway enrichment of ASGs while adequately controlling false discovery rates. Importantly, our studies also unveil highly heterogeneous single-subject alternative splicing patterns that cohort-based approaches overlook. Finally, we apply our patient-specific results to predict cancer survival (FDR < 20%) while providing diagnostics in pursuit of translating transcriptome data into clinically actionable information. Software available at https://github.com/grizant/n1pas/tree/master.
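
The Hellinger distance step mentioned above can be sketched for a single gene as follows. This is an assumption-laden illustration, not the N1PAS implementation: it assumes isoform expression is first converted to within-gene proportions, it rejects genes with zero total expression, and it omits the subsequent two-stage clustering. The isoform values are hypothetical.

```python
import math


def to_proportions(isoform_expr):
    """Convert isoform expression values of one gene into relative proportions."""
    total = sum(isoform_expr)
    if total <= 0:
        raise ValueError("gene has no expression in this sample")
    return [x / total for x in isoform_expr]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = same, 1 = disjoint)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


# Hypothetical isoform expression for one gene in paired samples of one patient.
tumor = [8.0, 2.0]
normal = [2.0, 8.0]
distance = hellinger(to_proportions(tumor), to_proportions(normal))
print(f"Hellinger distance = {distance:.4f}")
```

Working on proportions means the distance reflects a shift in isoform usage rather than a change in the gene's overall expression level.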

15.
Pac Symp Biocomput ; 24: 308-319, 2019.
Article in English | MEDLINE | ID: mdl-30864332

ABSTRACT

Repurposing existing drugs for new therapeutic indications can improve success rates and streamline development. Use of large-scale biomedical data repositories, including eQTL regulatory relationships and genome-wide disease risk associations, offers opportunities to propose novel indications for drugs targeting common or convergent molecular candidates associated with two or more diseases. This proposed novel computational approach scales across 262 complex diseases, building a multi-partite hierarchical network integrating (i) GWAS-derived SNP-to-disease associations, (ii) eQTL-derived SNP-to-eGene associations incorporating both cis- and trans-relationships from 19 tissues, (iii) protein target-to-drug, and (iv) drug-to-disease indications with (v) Gene Ontology-based information theoretic semantic (ITS) similarity calculated between protein target functions. Our hypothesis is that if two diseases are associated with a common or functionally similar eGene, and a drug targeting that eGene/protein in one disease exists, the second disease becomes a potential repurposing indication. To explore this, all possible pairs of independently segregating GWAS-derived SNPs were generated, and a statistical network of similarity within each SNP-SNP pair was calculated according to scale-free overrepresentation of convergent biological processes activity in regulated eGenes (ITS_eGENE-eGENE) and scale-free overrepresentation of common eGene targets between the two SNPs (ITS_SNP-SNP). Significance of ITS_SNP-SNP was conservatively estimated using empirical scale-free permutation resampling keeping the node-degree constant for each molecule in each permutation. We identified 26 new drug repurposing indication candidates spanning 89 GWAS diseases, including a potential repurposing of the calcium-channel blocker Verapamil from coronary disease to gout. Predictions from our approach are compared to known drug indications using DrugBank as a gold standard (odds ratio = 13.1, p-value = 2.49×10⁻⁸). Because of specific disease-SNP associations to candidate drug targets, the proposed method provides evidence for future precision drug repositioning to a patient's specific polymorphisms.


Subject(s)
Drug Repositioning/methods; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Computational Biology; Databases, Genetic; Drug Repositioning/statistics & numerical data; Gene Ontology; Genetic Predisposition to Disease; Genome-Wide Association Study/statistics & numerical data; Humans; Precision Medicine/methods; Precision Medicine/statistics & numerical data
16.
Pac Symp Biocomput ; 24: 444-448, 2019.
Article in English | MEDLINE | ID: mdl-30864345

ABSTRACT

Identifying functional elements and predicting mechanistic insight from non-coding DNA and noncoding variation remains a challenge. Advances in genome-scale, high-throughput technology, however, have brought these answers closer within reach than ever, though there is still a need for new computational approaches to analysis and integration. This workshop aims to explore these resources and new computational methods applied to regulatory elements, chromatin interactions, non-protein-coding genes, and other non-coding DNA.


Subject(s)
Computational Biology/methods; DNA/genetics; High-Throughput Nucleotide Sequencing/statistics & numerical data; Sequence Analysis, DNA/statistics & numerical data; CRISPR-Cas Systems; Epigenesis, Genetic; Gene Regulatory Networks; Genetic Variation; Humans; Mutation; RNA, Untranslated/genetics; Regulatory Elements, Transcriptional; Systems Biology
17.
Brief Bioinform ; 20(3): 789-805, 2019 05 21.
Article in English | MEDLINE | ID: mdl-29272327

ABSTRACT

The development of computational methods capable of analyzing -omics data at the individual level is critical for the success of precision medicine. Although unprecedented opportunities now exist to gather data on an individual's -omics profile ('personalome'), interpreting and extracting meaningful information from single-subject -omics remain underdeveloped, particularly for quantitative non-sequence measurements, including complete transcriptome or proteome expression and metabolite abundance. Conventional bioinformatics approaches have largely been designed for making population-level inferences about 'average' disease processes; thus, they may not adequately capture and describe individual variability. Novel approaches intended to exploit a variety of -omics data are required for identifying individualized signals for meaningful interpretation. In this review, intended for biomedical researchers, computational biologists and bioinformaticians, we survey emerging computational and translational informatics methods capable of constructing a single subject's 'personalome' for predicting clinical outcomes or therapeutic responses, with an emphasis on methods that provide interpretable readouts. Key points: (i) the single-subject analytics of the transcriptome shows the greatest development to date, and (ii) the methods were all validated in simulations, cross-validations or independent retrospective data sets. This survey uncovers a growing field that offers numerous opportunities for the development of novel validation methods and opens the door for future studies focusing on the interpretation of comprehensive 'personalomes' through the integration of multiple -omics, providing valuable insights into individual patient outcomes and treatments.


Subject(s)
Precision Medicine , Transcriptome , Humans
18.
AMIA Annu Symp Proc ; 2019: 582-591, 2019.
Article in English | MEDLINE | ID: mdl-32308852

ABSTRACT

Calculating Differentially Expressed Genes (DEGs) from RNA-sequencing requires replicates to estimate gene-wise variability, a requirement that is at times financially or physiologically infeasible in clinics. Comparing two conditions without replicates (TCWR) has been proposed, but not evaluated, by imposing restrictive transcriptome-wide assumptions that limit the inferential opportunities of conventional methods (edgeR, NOISeq-sim, DESeq, DEGseq). Under TCWR conditions (e.g., unaffected tissue vs. tumor), the proposed individualized DEG (iDEG) method treats differences of transformed expression as following a distribution calculated across a local partition of related transcripts at baseline expression; thereafter the probability of each DEG is estimated by empirical Bayes with local false discovery rate control using a two-group mixture model. In extensive simulation studies of TCWR methods, iDEG and NOISeq are more accurate at 5%[...]90%, recall>75%, false_positive_rate<1%) and 30%[...]

Subject(s)
Algorithms , Gene Expression Profiling , Sequence Analysis, RNA/methods , Transcriptome , Bayes Theorem , Genomics , Humans , Mathematical Concepts , Models, Theoretical , Precision Medicine
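
Illustrative note on record 18: the abstract mentions empirical Bayes with local false discovery rate control under a two-group mixture model. The sketch below shows only the generic two-group idea, local fdr(z) = pi0 * f0(z) / f(z), and is not the iDEG implementation. The standard-normal null f0, the fixed null proportion `pi0`, the Gaussian kernel estimate of f, the `bandwidth` value, and the toy z-scores are all assumptions made for illustration; the transformation and local partitioning described in the abstract are not reproduced.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def local_fdr(z_scores, pi0=0.9, bandwidth=0.3):
    """Two-group local fdr: pi0 * f0(z) / f(z), clipped to at most 1.

    f0 is assumed standard normal; f is a Gaussian kernel density
    estimate built from the observed z-scores (an illustrative choice).
    """
    n = len(z_scores)
    result = []
    for z in z_scores:
        # Kernel estimate of the mixture density at z (always > 0,
        # because z itself is one of the kernel centres).
        f = sum(normal_pdf(z, mu=zi, sigma=bandwidth) for zi in z_scores) / n
        result.append(min(1.0, pi0 * normal_pdf(z) / f))
    return result


# Toy data (hypothetical): 90 null-like scores and 10 shifted scores.
null_z = [NormalDist().inv_cdf((i + 0.5) / 90) for i in range(90)]
alt_z = [5.0 + 0.1 * i for i in range(10)]
fdr = local_fdr(null_z + alt_z, pi0=0.9)
```

In this toy setting the shifted scores receive local fdr values close to zero while scores near the centre of the null receive values close to one; in practice `pi0` and the null density would be estimated from the data rather than fixed.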
19.
Article in English | MEDLINE | ID: mdl-29888037

ABSTRACT

The transition of procedure coding from ICD-9-CM-Vol-3 to ICD-10-PCS has generated problems for the medical community at large resulting from the lack of clarity required to integrate two non-congruent coding systems. We hypothesized that quantifying these issues with network topology analyses offers a better understanding of the issues, and therefore we developed solutions (online tools) to empower hospital administrators and researchers to address these challenges. Five topologies were identified: "identity" (I), "class-to-subclass" (C2S), "subclass-to-class" (S2C), "convoluted" (C), and "no mapping" (NM). The procedure codes in the 2010 Illinois Medicaid dataset (3,290 patients, 116 institutions) were categorized as C=55%, C2S=40%, I=3%, NM=2%, and S2C=1%. The majority of the problematic and ambiguous mappings (convoluted) pertained to operations in ophthalmology, cardiology, urology, gyneco-obstetrics, and dermatology. Finally, the algorithms were expanded into a user-friendly tool to identify problematic topologies and specify lists of procedural codes utilized by medical professionals and researchers for mitigating error-prone translations, simplifying research, and improving quality. http://www.lussiergroup.org/transition-to-ICD10PCS.
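
Illustrative note on record 19: the abstract names five mapping topologies without defining them. The sketch below shows one way such a classification could be computed from a list of source-to-target code pairs. The definitions used here are assumptions for illustration (identity = one-to-one, class-to-subclass = one source to several exclusive targets, subclass-to-class = several sources to one shared target, convoluted = any other entangled mapping, no mapping = no target), and the code names are hypothetical placeholders, not real ICD codes. It is not the authors' tool.

```python
from collections import defaultdict


def classify_mappings(pairs, source_codes):
    """Label each source code with an assumed topology: I, C2S, S2C, C, NM."""
    fwd = defaultdict(set)  # source code -> target codes
    rev = defaultdict(set)  # target code -> source codes
    for src, tgt in pairs:
        fwd[src].add(tgt)
        rev[tgt].add(src)

    labels = {}
    for src in source_codes:
        targets = fwd.get(src, set())
        if not targets:
            labels[src] = "NM"  # no mapping
            continue
        # All source codes that share at least one target with src.
        sharing = set().union(*(rev[t] for t in targets))
        exclusive = sharing == {src}
        if len(targets) == 1 and exclusive:
            labels[src] = "I"    # one-to-one
        elif exclusive:
            labels[src] = "C2S"  # one source, several exclusive targets
        elif len(targets) == 1 and all(fwd[s] == targets for s in sharing):
            labels[src] = "S2C"  # several sources, one shared target
        else:
            labels[src] = "C"    # entangled many-to-many
    return labels


# Hypothetical placeholder codes.
pairs = [
    ("A", "a"),
    ("B", "b1"), ("B", "b2"),
    ("C1", "c"), ("C2", "c"),
    ("D1", "d1"), ("D1", "d2"), ("D2", "d2"),
]
labels = classify_mappings(pairs, ["A", "B", "C1", "C2", "D1", "D2", "E"])
```

With these toy pairs, `A` is labeled `I`, `B` is `C2S`, `C1` and `C2` are `S2C`, `D1` and `D2` are `C`, and `E` is `NM`.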

20.
Int J Med Inform ; 113: 63-71, 2018 05.
Article in English | MEDLINE | ID: mdl-29602435

ABSTRACT

BACKGROUND: Physicians and nurses have worked together for generations; however, their language and training are vastly different, and comparing and contrasting their work and their joint impact on patient outcomes is difficult in light of this difference. At the same time, the EHR only includes the physician perspective via the physician-authored discharge summary, but not nurse documentation. Prior research in this area has focused on collaboration and the usage of similar terminology. OBJECTIVE: The objective of the study is to gain insight into interprofessional care by developing a computational metric to identify similarities, related concepts and differences in physician and nurse work. METHODS: 58 physician discharge summaries and the corresponding nurse plans of care were transformed into Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs). MedLEE, a Natural Language Processing (NLP) program, extracted "physician terms" from free-text physician summaries. The nursing plans of care were constructed using the HANDS© nursing documentation software. HANDS© utilizes structured terminologies: nursing diagnosis (NANDA-I), outcomes (NOC), and interventions (NIC) to create "nursing terms". The physicians' and nurses' terms were compared using the UMLS network for relatedness, overlaying the physician and nurse terms for comparison. Our overarching goal is to provide insight into the care by innovatively applying graph algorithms to the UMLS network. We reveal the relationships between the care provided by each professional that is specific to the patient level. RESULTS: We found that only 26% of patients had synonyms (identical UMLS CUIs) between the two professions' documentation. On average, physicians' discharge summaries contain 27 terms and nurses' documentation, 18. Traversing the UMLS network, we found an average of 4 terms related (distance less than 2) between the professions, leaving most concepts as unrelated between nurse and physician care. CONCLUSION: Our hypothesis that physicians' and nurses' practice domains are markedly different is supported by the preliminary, quantitative evidence we found. Leveraging the UMLS network and graph traversal algorithms allows us to compare and contrast nursing and physician care on a single patient, enabling a more complete picture of patient care. We can differentiate professional contributions to patient outcomes and related and divergent concepts by each profession.


Subject(s)
Algorithms , Delivery of Health Care/standards , Patient Care Planning/standards , Practice Patterns, Nurses'/standards , Practice Patterns, Physicians'/standards , Unified Medical Language System , Humans , Natural Language Processing , Software
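
Illustrative note on record 20: the abstract describes comparing physician and nurse terms by their distance in the UMLS network, counting terms as related when the distance is less than 2. The sketch below shows the general idea with a breadth-first search over a small undirected graph. The concept identifiers and edges are hypothetical placeholders, the graph is a plain adjacency dictionary rather than the UMLS itself, and the `limit` parameter is an assumed reading of the "distance less than 2" threshold. It is not the authors' algorithm.

```python
from collections import deque


def shortest_distance(graph, start, goal):
    """Number of edges on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def related_pairs(graph, physician_terms, nurse_terms, limit=2):
    """Pairs of terms whose graph distance is strictly below limit."""
    pairs = []
    for p in physician_terms:
        for n in nurse_terms:
            d = shortest_distance(graph, p, n)
            if d is not None and d < limit:
                pairs.append((p, n, d))
    return pairs


# Hypothetical concept graph (symmetric adjacency lists).
graph = {
    "C1": ["C2"],
    "C2": ["C1", "C3"],
    "C3": ["C2"],
    "C4": [],
}
pairs = related_pairs(graph, ["C1", "C4"], ["C1", "C2", "C3"])
```

Here `C1` matches itself at distance 0 (a synonym in the abstract's sense) and `C2` at distance 1, while `C3` lies at distance 2 and the isolated `C4` relates to nothing.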