Results 1 - 20 of 46
1.
Nat Methods ; 20(9): 1379-1387, 2023 09.
Article in English | MEDLINE | ID: mdl-37592182

ABSTRACT

Spatially resolved genomic technologies have allowed us to study the physical organization of cells and tissues, and promise an understanding of local interactions between cells. However, it remains difficult to precisely align spatial observations across slices, samples, scales, individuals and technologies. Here, we propose a probabilistic model that aligns spatially-resolved samples onto a known or unknown common coordinate system (CCS) with respect to phenotypic readouts (for example, gene expression). Our method, Gaussian Process Spatial Alignment (GPSA), consists of a two-layer Gaussian process: the first layer maps observed samples' spatial locations onto a CCS, and the second layer maps from the CCS to the observed readouts. Our approach enables complex downstream spatially aware analyses that are impossible or inaccurate with unaligned data, including an analysis of variance, creation of a dense three-dimensional (3D) atlas from sparse two-dimensional (2D) slices or association tests across data modalities.


Subjects
Genomics; Models, Statistical; Humans; Normal Distribution
2.
bioRxiv ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36778332

ABSTRACT

Spatially-resolved genomic technologies have shown promise for studying the relationship between the structural arrangement of cells and their functional behavior. While numerous sequencing and imaging platforms exist for performing spatial transcriptomics and spatial proteomics profiling, these experiments remain expensive and labor-intensive. Thus, when performing spatial genomics experiments using multiple tissue slices, there is a need to select the tissue cross sections that will be maximally informative for the purposes of the experiment. In this work, we formalize the problem of experimental design for spatial genomics experiments, which we generalize into a problem class that we call structured batch experimental design. We propose approaches for optimizing these designs in two types of spatial genomics studies: one in which the goal is to construct a spatially-resolved genomic atlas of a tissue and another in which the goal is to localize a region of interest in a tissue, such as a tumor. We demonstrate the utility of these optimal designs, where each slice is a two-dimensional plane, on several spatial genomics datasets.

3.
Nat Methods ; 20(2): 229-238, 2023 02.
Article in English | MEDLINE | ID: mdl-36587187

ABSTRACT

Nonnegative matrix factorization (NMF) is widely used to analyze high-dimensional count data because, in contrast to real-valued alternatives such as factor analysis, it produces an interpretable parts-based representation. However, in applications such as spatial transcriptomics, NMF fails to incorporate known structure between observations. Here, we present nonnegative spatial factorization (NSF), a spatially-aware probabilistic dimension reduction model based on transformed Gaussian processes that naturally encourages sparsity and scales to tens of thousands of observations. NSF recovers ground truth factors more accurately than real-valued alternatives such as MEFISTO in simulations, and has lower out-of-sample prediction error than probabilistic NMF on three spatial transcriptomics datasets from mouse brain and liver. Since not all patterns of gene expression have spatial correlations, we also propose a hybrid extension of NSF that combines spatial and nonspatial components, enabling quantification of spatial importance for both observations and features. A TensorFlow implementation of NSF is available from https://github.com/willtownes/nsf-paper .


Subjects
Algorithms; Gene Expression Profiling; Animals; Mice; Gene Expression Profiling/methods; Genomics; Models, Statistical
4.
BMC Bioinformatics ; 23(1): 529, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36482321

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies allow for the study of gene expression in individual cells. Often, it is of interest to understand how transcriptional activity is associated with cell-specific covariates, such as cell type, genotype, or measures of cell health. Traditional approaches for this type of association mapping assume independence between the outcome variables (or genes), and perform a separate regression for each. However, these methods are computationally costly and ignore the substantial correlation structure of gene expression. Furthermore, count-based scRNA-seq data pose challenges for traditional models based on Gaussian assumptions. RESULTS: We aim to resolve these issues by developing a reduced-rank regression model that identifies low-dimensional linear associations between a large number of cell-specific covariates and high-dimensional gene expression readouts. Our probabilistic model uses a Poisson likelihood in order to account for the unique structure of scRNA-seq counts. We demonstrate the performance of our model using simulations, and we apply our model to a scRNA-seq dataset, a spatial gene expression dataset, and a bulk RNA-seq dataset to show its behavior in three distinct analyses. CONCLUSION: We show that our statistical modeling approach, which is based on reduced-rank regression, captures associations between gene expression and cell- and sample-specific covariates by leveraging low-dimensional representations of transcriptional states.
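The combination of a low-rank coefficient matrix and a Poisson likelihood can be sketched as follows. This is a minimal illustration, not the authors' implementation: the log link, the factorization `B = U V^T`, and all names and toy values are assumptions, since the abstract does not give the exact parameterization.

```python
import math

def low_rank_coefficients(U, V):
    """B = U V^T with U of shape (p, r) and V of shape (q, r), so rank(B) <= r."""
    return [[sum(u_k * v_k for u_k, v_k in zip(u_row, v_row)) for v_row in V]
            for u_row in U]

def poisson_loglik(X, Y, U, V):
    """Sum of log Poisson(y_ij | rate_ij), assuming rate_ij = exp(x_i . B[:, j])."""
    B = low_rank_coefficients(U, V)
    total = 0.0
    for x_row, y_row in zip(X, Y):
        for j, y in enumerate(y_row):
            log_rate = sum(x * B[k][j] for k, x in enumerate(x_row))
            total += y * log_rate - math.exp(log_rate) - math.lgamma(y + 1.0)
    return total

# Hypothetical toy data: 2 cells, 2 covariates, 3 genes, rank 1.
X = [[1.0, 0.0], [1.0, 1.0]]
Y = [[0, 2, 1], [3, 0, 1]]
U = [[0.5], [-0.2]]          # covariate loadings (p x r)
V = [[1.0], [0.3], [-0.4]]   # gene loadings (q x r)
ll = poisson_loglik(X, Y, U, V)
```

With rank `r` much smaller than the number of covariates or genes, the model estimates `r * (p + q)` parameters instead of `p * q`, which is the sense in which associations are captured in a low-dimensional space.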

5.
Life Sci Alliance ; 5(12)2022 08 17.
Article in English | MEDLINE | ID: mdl-35977827

ABSTRACT

Expression quantitative trait loci (eQTLs), or single-nucleotide polymorphisms that affect average gene expression levels, provide important insights into context-specific gene regulation. Classic eQTL analyses use one-to-one association tests, which test gene-variant pairs individually and ignore correlations induced by gene regulatory networks and linkage disequilibrium. Probabilistic topic models, such as latent Dirichlet allocation, estimate latent topics for a collection of count observations. Prior multimodal frameworks that bridge genotype and expression data assume matched sample numbers between modalities. However, many data sets have a nested structure where one individual has several associated gene expression samples and a single germline genotype vector. Here, we build a telescoping bimodal latent Dirichlet allocation (TBLDA) framework to learn shared topics across gene expression and genotype data that allows multiple RNA sequencing samples to correspond to a single individual's genotype. By using raw count data, our model avoids possible adulteration via normalization procedures. Ancestral structure is captured in a genotype-specific latent space, effectively removing it from shared components. Using GTEx v8 expression data across 10 tissues and genotype data, we show that the estimated topics capture meaningful and robust biological signal in both modalities and identify associations within and across tissue types. We identify 4,645 cis-eQTLs and 995 trans-eQTLs by conducting eQTL mapping between the most informative features in each topic. Our TBLDA model is able to identify associations using raw sequencing count data when the samples in two separate data modalities are matched one-to-many, as is often the case in biological data. Our code is freely available at https://github.com/gewirtz/TBLDA.


Subjects
Polymorphism, Single Nucleotide; Quantitative Trait Loci; Gene Expression Regulation; Gene Regulatory Networks; Genotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics
6.
Biochem J ; 479(11): 1257-1263, 2022 06 17.
Article in English | MEDLINE | ID: mdl-35713413

ABSTRACT

Petabytes of increasingly complex and multidimensional live cell and tissue imaging data are generated every year. These videos hold large promise for understanding biology at a deep and fundamental level, as they capture single-cell and multicellular events occurring over time and space. However, the current modalities for analysis and mining of these data are scattered and user-specific, preventing more unified analyses from being performed over different datasets and obscuring possible scientific insights. Here, we propose a unified pipeline for storage, segmentation, analysis, and statistical parametrization of live cell imaging datasets.


Subjects
Datasets as Topic
7.
J Pers Med ; 12(5)2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35629084

ABSTRACT

Both provider- and protocol-driven electrolyte replacement have been linked to the over-prescription of ubiquitous electrolytes. Here, we describe the development and retrospective validation of a data-driven clinical decision support tool that uses reinforcement learning (RL) algorithms to recommend patient-tailored electrolyte replacement policies for ICU patients. We used electronic health records (EHR) data that originated from two institutions (UPHS; MIMIC-IV). The tool uses a set of patient characteristics, such as their physiological and pharmacological state, a pre-defined set of possible repletion actions, and a set of clinical goals to present clinicians with a recommendation for the route and dose of an electrolyte. RL-driven electrolyte repletion substantially reduces the frequency of magnesium and potassium replacements (up to 60%), adjusts the timing of interventions in all three electrolytes considered (potassium, magnesium, and phosphate), and shifts them towards orally administered repletion over intravenous replacement. This shift in recommended treatment limits risk of the potentially harmful effects of over-repletion and implies monetary savings. Overall, the RL-driven electrolyte repletion recommendations reduce excess electrolyte replacements and improve the safety, precision, efficacy, and cost of each electrolyte repletion event, while showing robust performance across patient cohorts and hospital systems.

8.
Pac Symp Biocomput ; 27: 266-277, 2022.
Article in English | MEDLINE | ID: mdl-34890155

ABSTRACT

Gaussian processes (GPs) are a versatile nonparametric model for nonlinear regression and have been widely used to study spatiotemporal phenomena. However, standard GPs offer limited interpretability and generalizability for datasets with naturally occurring hierarchies. With large-scale, rapidly updating electronic health record (EHR) data, we want to study patient trajectories across diverse patient cohorts while preserving patient subgroup structure. In this work, we partition our cohort of over 2000 COVID-19 patients by sex and ethnicity. We develop and apply a hierarchical Gaussian process and a mixture of experts (MOE) hierarchical GP model to fit patient trajectories on clinical markers of disease progression. A case study for albumin, an effective predictor of COVID-19 patient outcomes, highlights the predictive performance of these models. These hierarchical spatiotemporal models of EHR data bring us a step closer toward our goal of building flexible approaches to capture patient data that can be used in real-time systems.


Subjects
COVID-19; Cohort Studies; Computational Biology; Electronic Health Records; Humans; SARS-CoV-2
9.
Neuroimage ; 245: 118580, 2021 12 15.
Article in English | MEDLINE | ID: mdl-34740792

ABSTRACT

A key problem in functional magnetic resonance imaging (fMRI) is to estimate spatial activity patterns from noisy high-dimensional signals. Spatial smoothing provides one approach to regularizing such estimates. However, standard smoothing methods ignore the fact that correlations in neural activity may fall off at different rates in different brain areas, or exhibit discontinuities across anatomical or functional boundaries. Moreover, such methods do not exploit the fact that widely separated brain regions may exhibit strong correlations due to bilateral symmetry or the network organization of brain regions. To capture this non-stationary spatial correlation structure, we introduce the brain kernel, a continuous covariance function for whole-brain activity patterns. We define the brain kernel in terms of a continuous nonlinear mapping from 3D brain coordinates to a latent embedding space, parametrized with a Gaussian process (GP). The brain kernel specifies the prior covariance between voxels as a function of the distance between their locations in embedding space. The GP mapping warps the brain nonlinearly so that highly correlated voxels are close together in latent space, and uncorrelated voxels are far apart. We estimate the brain kernel using resting-state fMRI data, and we develop an exact, scalable inference method based on block coordinate descent to overcome the challenges of high dimensionality (10-100K voxels). Finally, we illustrate the brain kernel's usefulness with applications to brain decoding and factor analysis with multiple task-based fMRI datasets.
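The idea of a covariance defined on distance in an embedding space, rather than on anatomical distance, can be illustrated with a toy example. Everything here is an assumption made for illustration: the learned GP mapping is replaced by a fixed, hypothetical `mirror_embedding` that folds the left-right axis, and the squared-exponential form and `length_scale` value are not taken from the abstract.

```python
import math

def mirror_embedding(coord):
    """Toy stand-in for the learned mapping: fold the left-right axis
    so that mirrored points land on the same embedded location."""
    x, y, z = coord
    return (abs(x), y, z)

def brain_kernel(coord_a, coord_b, embed=mirror_embedding, length_scale=10.0):
    """Squared-exponential covariance on distance in the embedding space."""
    u, v = embed(coord_a), embed(coord_b)
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

# Hypothetical voxel coordinates (arbitrary units).
left = (-30.0, 10.0, 5.0)
right = (30.0, 10.0, 5.0)       # mirror image of `left`
neighbor = (-30.0, 12.0, 5.0)   # anatomically close to `left`

k_mirror = brain_kernel(left, right)
k_neighbor = brain_kernel(left, neighbor)
k_no_warp = brain_kernel(left, right, embed=lambda c: c)
```

Under the folding map, the two mirrored voxels are maximally correlated even though they are far apart anatomically, whereas the same kernel without any warp (`k_no_warp`) treats them as nearly independent. This is the kind of non-stationary structure, such as bilateral symmetry, that a stationary smoothing kernel cannot express.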


Subjects
Brain Mapping/methods; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Neuroimaging/methods; Humans; Imaging, Three-Dimensional
10.
Proc Natl Acad Sci U S A ; 118(32)2021 08 10.
Article in English | MEDLINE | ID: mdl-34362843

ABSTRACT

Multicellular organisms rely on spatial signaling among cells to drive their organization, development, and response to stimuli. Several models have been proposed to capture the behavior of spatial signaling in multicellular systems, but existing approaches fail to capture both the autonomous behavior of single cells and the interactions of a cell with its neighbors simultaneously. We propose a spatiotemporal model of dynamic cell signaling based on Hawkes processes (self-exciting point processes) that model the signaling processes within a cell and spatial couplings between cells. With this cellular point process (CPP), we capture both the single-cell pathway activation rate and the magnitude and duration of signaling between cells relative to their spatial location. Furthermore, our model captures tissues composed of heterogeneous cell types with different bursting rates and signaling behaviors across multiple signaling proteins. We apply our model to epithelial cell systems that exhibit a range of autonomous and spatial signaling behaviors basally and under pharmacological exposure. Our model identifies known drug-induced signaling deficits, characterizes signaling changes across a wound front, and generalizes to multichannel observations.
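As a rough illustration of the self-exciting mechanism, the sketch below evaluates the conditional intensity of a univariate Hawkes process. The exponential decay kernel and the parameter names `mu`, `alpha`, and `beta` are assumptions chosen for illustration; the abstract does not specify the kernel or how the spatial coupling between cells enters the CPP.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity
        lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i)).
    mu is the baseline rate, alpha the jump per event, beta the decay rate."""
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t
    )
    return mu + excitation

# Hypothetical activation times for one cell (arbitrary time units).
events = [1.0, 1.5, 4.0]
baseline = hawkes_intensity(0.5, events, mu=0.2, alpha=0.8, beta=1.0)
after_burst = hawkes_intensity(1.6, events, mu=0.2, alpha=0.8, beta=1.0)
```

Before any event the intensity equals the baseline `mu`; shortly after two closely spaced events it is elevated, and it then decays back toward `mu`. A spatial extension would add similar excitation terms driven by neighboring cells' events, weighted by distance.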


Subjects
Keratinocytes/metabolism; Models, Biological; Signal Transduction; Animals; Dipeptides/pharmacology; Dogs; Epithelial Cells; Hydroxamic Acids/pharmacology; Keratinocytes/cytology; Keratinocytes/drug effects; MAP Kinase Signaling System/drug effects; Madin Darby Canine Kidney Cells; Mice, Inbred Strains; Mice, Transgenic; Models, Statistical; Protein Kinase Inhibitors/pharmacology; Signal Transduction/drug effects; Spatio-Temporal Analysis
11.
Nat Commun ; 12(1): 1609, 2021 03 11.
Article in English | MEDLINE | ID: mdl-33707455

ABSTRACT

Histopathological images are used to characterize complex phenotypes such as tumor stage. Our goal is to associate features of stained tissue images with high-dimensional genomic markers. We use convolutional autoencoders and sparse canonical correlation analysis (CCA) on paired histological images and bulk gene expression to identify subsets of genes whose expression levels in a tissue sample correlate with subsets of morphological features from the corresponding sample image. We apply our approach, ImageCCA, to two TCGA data sets, and find gene sets associated with the structure of the extracellular matrix and cell wall infrastructure, implicating uncharacterized genes in extracellular processes. We find sets of genes associated with specific cell types, including neuronal cells and cells of the immune system. We apply ImageCCA to the GTEx v6 data, and find image features that capture population variation in thyroid and in colon tissues associated with genetic variants (image morphology QTLs, or imQTLs), suggesting that genetic variation regulates population variation in tissue morphological traits.


Subjects
Computational Biology/methods; Gene Expression Regulation, Neoplastic/genetics; Gene Expression/genetics; Neoplasms/pathology; Quantitative Trait Loci/genetics; BRCA1 Protein/genetics; Biomarkers, Tumor/genetics; Cell Membrane/genetics; Cell Membrane/physiology; Extracellular Matrix/genetics; Extracellular Matrix/physiology; Humans; Image Processing, Computer-Assisted; Neoplasms/genetics; Polymorphism, Single Nucleotide/genetics
12.
Nat Commun ; 12(1): 1186, 2021 02 19.
Article in English | MEDLINE | ID: mdl-33608535

ABSTRACT

Single-cell technologies characterize complex cell populations across multiple data modalities at unprecedented scale and resolution. Multi-omic data for single cell gene expression, in situ hybridization, or single cell chromatin states are increasingly available across diverse tissue types. When isolating specific cell types from a sample of disassociated cells or performing in situ sequencing in collections of heterogeneous cells, one challenging task is to select a small set of informative markers that robustly enable the identification and discrimination of specific cell types or cell states as precisely as possible. Given single cell RNA-seq data and a set of cellular labels to discriminate, scGeneFit selects gene markers that jointly optimize cell label recovery using label-aware compressive classification methods. This results in a substantially more robust and less redundant set of markers than existing methods, most of which identify markers that separate each cell label from the rest. When applied to a data set given a hierarchy of cell types as labels, the markers found by our method improve the recovery of the cell type hierarchy with fewer markers than existing methods using a computationally efficient and principled optimization.


Subjects
Genetic Markers; Single-Cell Analysis/methods; Algorithms; Cluster Analysis; Gene Expression; Gene Expression Profiling/methods; Humans; RNA-Seq; Sequence Analysis, RNA/methods; Transcriptome
13.
PLoS Comput Biol ; 17(1): e1008223, 2021 01.
Article in English | MEDLINE | ID: mdl-33513136

ABSTRACT

Gene regulatory network inference is essential to uncover complex relationships among gene pathways and inform downstream experiments, ultimately enabling regulatory network re-engineering. Network inference from transcriptional time-series data requires accurate, interpretable, and efficient determination of causal relationships among thousands of genes. Here, we develop Bootstrap Elastic net regression from Time Series (BETS), a statistical framework based on Granger causality for the recovery of a directed gene network from transcriptional time-series data. BETS uses elastic net regression and stability selection from bootstrapped samples to infer causal relationships among genes. BETS is highly parallelized, enabling efficient analysis of large transcriptional data sets. We show competitive accuracy on a community benchmark, the DREAM4 100-gene network inference challenge, where BETS is one of the fastest among methods of similar performance and additionally infers whether causal effects are activating or inhibitory. We apply BETS to transcriptional time-series data of differentially-expressed genes from A549 cells exposed to glucocorticoids over a period of 12 hours. We identify a network of 2768 genes and 31,945 directed edges (FDR ≤ 0.2). We validate inferred causal network edges using two external data sources: Overexpression experiments on the same glucocorticoid system, and genetic variants associated with inferred edges in primary lung tissue in the Genotype-Tissue Expression (GTEx) v6 project. BETS is available as an open source software package at https://github.com/lujonathanh/BETS.


Subjects
Glucocorticoids/pharmacology; Models, Statistical; Transcriptome/drug effects; A549 Cells; Algorithms; Computational Biology; Humans; Lung/chemistry; Lung/metabolism; Machine Learning; Software; Transcriptome/genetics
14.
Science ; 369(6509)2020 09 11.
Article in English | MEDLINE | ID: mdl-32913072

ABSTRACT

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.


Subjects
Gene Expression Regulation; Gene Expression; Sex Characteristics; Chromosomes, Human, X/genetics; Disease/genetics; Epigenesis, Genetic; Female; Genetic Variation; Genome-Wide Association Study; Humans; Male; Organ Specificity; Promoter Regions, Genetic; Quantitative Trait Loci; Sex Factors
15.
BMC Med Inform Decis Mak ; 20(1): 152, 2020 07 08.
Article in English | MEDLINE | ID: mdl-32641134

ABSTRACT

BACKGROUND: For real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab test results is essential to enable successful medical interventions and improve patient outcomes. Developing a computational framework that can learn from observational large-scale electronic health records (EHRs) and make accurate real-time predictions is a critical step. In this work, we develop and explore a Bayesian nonparametric model based on multi-output Gaussian process (GP) regression for hospital patient monitoring. METHODS: We propose MedGP, a statistical framework that incorporates 24 clinical covariates and supports a rich reference data set from which relationships between observed covariates may be inferred and exploited for high-quality inference of patient state over time. To do this, we develop a highly structured sparse GP kernel to enable tractable computation over tens of thousands of time points while estimating correlations among clinical covariates, patients, and periodicity in patient observations. MedGP has a number of benefits over current methods, including (i) not requiring an alignment of the time series data, (ii) quantifying confidence regions in the predictions, (iii) exploiting a vast and rich database of patients, and (iv) inferring interpretable relationships among clinical covariates. RESULTS: We evaluate and compare results from MedGP on the task of online prediction for three patient subgroups from two medical data sets across 8,043 patients. We find MedGP improves online prediction over baseline and state-of-the-art methods for nearly all covariates across different disease subgroups and hospitals. CONCLUSIONS: The MedGP framework is robust and efficient in estimating the temporal dependencies from sparse and irregularly sampled medical time series data for online prediction. The publicly available code is at https://github.com/bee-hive/MedGP .


Subjects
Algorithms; Models, Statistical; Bayes Theorem; Normal Distribution
16.
BMC Bioinformatics ; 21(1): 324, 2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32693778

ABSTRACT

BACKGROUND: Modern developments in single-cell sequencing technologies enable broad insights into cellular state. Single-cell RNA sequencing (scRNA-seq) can be used to explore cell types, states, and developmental trajectories to broaden our understanding of cellular heterogeneity in tissues and organs. Analysis of these sparse, high-dimensional experimental results requires dimension reduction. Several methods have been developed to estimate low-dimensional embeddings for filtered and normalized single-cell data. However, methods have yet to be developed for unfiltered and unnormalized count data that estimate uncertainty in the low-dimensional space. We present a nonlinear latent variable model with robust, heavy-tailed error and adaptive kernel learning to estimate low-dimensional nonlinear structure in scRNA-seq data. RESULTS: Gene expression in a single cell is modeled as a noisy draw from a Gaussian process in high dimensions from low-dimensional latent positions. This model is called the Gaussian process latent variable model (GPLVM). We model residual errors with a heavy-tailed Student's t-distribution to estimate a manifold that is robust to technical and biological noise found in normalized scRNA-seq data. We compare our approach to common dimension reduction tools across a diverse set of scRNA-seq data sets to highlight our model's ability to enable important downstream tasks such as clustering, inferring cell developmental trajectories, and visualizing high throughput experiments on available experimental data. CONCLUSION: We show that our adaptive robust statistical approach to estimate a nonlinear manifold is well suited for raw, unfiltered gene counts from high-throughput sequencing technologies for visualization, exploration, and uncertainty estimation of cell states.
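Why a heavy-tailed error model is more robust than a Gaussian one can be seen by comparing log densities at small and large residuals. The sketch below is only an illustration of that general property: the degrees of freedom `df=4.0`, the unit scale, and the example residuals are arbitrary assumptions, not values from the paper.

```python
import math

def normal_logpdf(r, scale=1.0):
    """Log density of a zero-mean Gaussian at residual r."""
    return -0.5 * math.log(2.0 * math.pi * scale ** 2) - 0.5 * (r / scale) ** 2

def student_t_logpdf(r, df=4.0, scale=1.0):
    """Log density of a zero-mean Student's t with df degrees of freedom."""
    z = r / scale
    return (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - 0.5 * (df + 1.0) * math.log1p(z * z / df)
    )

# Hypothetical residuals: none, moderate, and an outlier.
residuals = [0.0, 1.0, 6.0]
gap = [student_t_logpdf(r) - normal_logpdf(r) for r in residuals]
```

Near zero the Gaussian assigns slightly higher density, but for the outlying residual the Student's t log density is far higher. An outlier therefore incurs a much smaller likelihood penalty under the t model, so it pulls the fitted manifold less.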


Subjects
Nonlinear Dynamics; RNA-Seq; Single-Cell Analysis/methods; Blood Cells/metabolism; Gene Expression Regulation; Humans; Models, Genetic; Neurons/metabolism; Normal Distribution; Principal Component Analysis; Time Factors
17.
Genome Res ; 30(2): 195-204, 2020 02.
Article in English | MEDLINE | ID: mdl-31992614

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) enables high-throughput measurement of RNA expression in single cells. However, because of technical limitations, scRNA-seq data often contain zero counts for many transcripts in individual cells. These zero counts, or dropout events, complicate the analysis of scRNA-seq data using standard methods developed for bulk RNA-seq data. Current scRNA-seq analysis methods typically overcome dropout by combining information across cells in a lower-dimensional space, leveraging the observation that cells generally occupy a small number of RNA expression states. We introduce netNMF-sc, an algorithm for scRNA-seq analysis that leverages information across both cells and genes. netNMF-sc learns a low-dimensional representation of scRNA-seq transcript counts using network-regularized non-negative matrix factorization. The network regularization takes advantage of prior knowledge of gene-gene interactions, encouraging pairs of genes with known interactions to be nearby each other in the low-dimensional representation. The resulting matrix factorization imputes gene abundance for both zero and nonzero counts and can be used to cluster cells into meaningful subpopulations. We show that netNMF-sc outperforms existing methods at clustering cells and estimating gene-gene covariance using both simulated and real scRNA-seq data, with increasing advantages at higher dropout rates (e.g., >60%). We also show that the results from netNMF-sc are robust to variation in the input network, with more representative networks leading to greater performance gains.


Subjects
Epistasis, Genetic/genetics; RNA-Seq; Single-Cell Analysis/methods; Software; Cluster Analysis; Gene Expression Profiling; Humans; Exome Sequencing
18.
R Soc Open Sci ; 7(11): 200958, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33391794

ABSTRACT

Angiotensin-converting enzyme 2 (ACE2) and serine protease TMPRSS2 have been implicated in cell entry for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for coronavirus disease 2019 (COVID-19). The expression of ACE2 and TMPRSS2 in the lung epithelium might have implications for the risk of SARS-CoV-2 infection and severity of COVID-19. We use human genetic variants that proxy angiotensin-converting enzyme (ACE) inhibitor drug effects and cardiovascular risk factors to investigate whether these exposures affect lung ACE2 and TMPRSS2 gene expression and circulating ACE2 levels. We observed no consistent evidence of an association of genetically predicted serum ACE levels with any of our outcomes. There was weak evidence for an association of genetically predicted serum ACE levels with ACE2 gene expression in the Lung eQTL Consortium (p = 0.014), but this finding did not replicate. There was evidence of a positive association of genetic liability to type 2 diabetes mellitus with lung ACE2 gene expression in the Genotype-Tissue Expression (GTEx) study (p = 4 × 10^-4) and with circulating plasma ACE2 levels in the INTERVAL study (p = 0.03), but not with lung ACE2 expression in the Lung eQTL Consortium study (p = 0.68). There were no associations of genetically proxied liability to the other cardiometabolic traits with any outcome. This study does not provide consistent evidence to support an effect of serum ACE levels (as a proxy for ACE inhibitors) or cardiometabolic risk factors on lung ACE2 and TMPRSS2 expression or plasma ACE2 levels.

19.
Pac Symp Biocomput ; 24: 320-331, 2019.
Article in English | MEDLINE | ID: mdl-30864333

ABSTRACT

Laboratory testing is an integral tool in the management of patient care in hospitals, particularly in intensive care units (ICUs). There exists an inherent trade-off in the selection and timing of lab tests between considerations of the expected utility in clinical decision-making of a given test at a specific time, and the associated cost or risk it poses to the patient. In this work, we introduce a framework that learns policies for ordering lab tests which optimizes for this trade-off. Our approach uses batch off-policy reinforcement learning with a composite reward function based on clinical imperatives, applied to data that include examples of clinicians ordering labs for patients. To this end, we develop and extend principles of Pareto optimality to improve the selection of actions based on multiple reward function components while respecting typical procedural considerations and prioritization of clinical goals in the ICU. Our experiments show that we can estimate a policy that reduces the frequency of lab tests and optimizes timing to minimize information redundancy. We also find that the estimated policies typically suggest ordering lab tests well ahead of critical onsets (such as mechanical ventilation or dialysis) that depend on the lab results. We evaluate our approach by quantifying how these policies may initiate earlier onset of treatment.
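The notion of Pareto optimality over several reward components can be sketched with a small dominance filter. This shows only the basic concept, not the extended procedure the paper develops; the action names and the two reward components (information gain, negative cost) are hypothetical.

```python
def dominates(a, b):
    """True if reward vector a is at least as good as b in every component
    and strictly better in at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(actions):
    """Names of actions whose reward vectors no other action dominates."""
    return [
        name
        for name, reward in actions.items()
        if not any(
            dominates(other, reward)
            for other_name, other in actions.items()
            if other_name != name
        )
    ]

# Hypothetical reward components per action: (information gain, negative cost).
candidate_actions = {
    "order_none": (0.0, 0.0),
    "order_creatinine": (0.6, -1.0),
    "order_full_panel": (0.9, -3.0),
    "order_redundant": (0.5, -3.0),
}
front = pareto_front(candidate_actions)
```

Here `order_redundant` is dropped because `order_creatinine` yields more information at lower cost, while the remaining three actions represent different trade-offs that no single alternative beats on both components. A policy then only needs to choose among the non-dominated actions.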


Subjects
Clinical Laboratory Techniques; Intensive Care Units; Acute Kidney Injury/diagnosis; Clinical Laboratory Techniques/statistics & numerical data; Computational Biology; Critical Care/statistics & numerical data; Decision Support Techniques; Humans; Intensive Care Units/organization & administration; Intensive Care Units/statistics & numerical data; Patient Care Management/organization & administration; Patient Care Management/statistics & numerical data; Reinforcement, Psychology; Reward; Sepsis/diagnosis
20.
Bioinformatics ; 35(2): 200-210, 2019 01 15.
Article in English | MEDLINE | ID: mdl-29982387

ABSTRACT

Motivation: Identifying variants, both discrete and continuous, that are associated with quantitative traits, or QTs, is the primary focus of quantitative genetics. Most current methods are limited to identifying mean effects, or associations between genotype or covariates and the mean value of a quantitative trait. It is possible, however, that a variant may affect the variance of the quantitative trait in lieu of, or in addition to, affecting the trait mean. Here, we develop a general methodology to identify covariates with variance effects on a quantitative trait using a Bayesian heteroskedastic linear regression model (BTH). We compare BTH with existing methods to detect variance effects across a large range of simulations drawn from scenarios common to the analysis of quantitative traits. Results: We find that BTH and a double generalized linear model (dglm) outperform classical tests used for detecting variance effects in recent genomic studies. We show BTH and dglm are less likely to generate spurious discoveries through simulations and application to identifying methylation variance QTs and expression variance QTs. We identify four variance effects of sex in the Cardiovascular and Pharmacogenetics study. Our work is the first to offer a comprehensive view of variance identifying methodology. We identify shortcomings in previously used methodology and provide a more conservative and robust alternative. We extend variance effect analysis to a wide array of covariates that enables a new statistical dimension in the study of sex and age specific quantitative trait effects. Availability and implementation: https://github.com/b2du/bth. Supplementary information: Supplementary data are available at Bioinformatics online.
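To make the distinction between mean effects and variance effects concrete, the sketch below computes a Brown-Forsythe statistic, one example of a classical test for unequal variances across groups. The abstract does not say which classical tests were compared, so this choice is an assumption, and it is not the BTH model itself; the genotype groups and trait values are invented.

```python
from statistics import mean, median

def brown_forsythe_statistic(groups):
    """F statistic of a one-way ANOVA applied to absolute deviations
    from each group's median; large values suggest unequal variances."""
    deviations = []
    for values in groups:
        center = median(values)
        deviations.append([abs(v - center) for v in values])
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = mean([v for d in deviations for v in d])
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((v - mean(d)) ** 2 for d in deviations for v in d)
    return (between / (k - 1)) / (within / (n_total - k))

# Hypothetical trait values grouped by genotype (0, 1, or 2 minor alleles).
genotype_0 = [1.0, 2.0, 3.0, 4.0, 5.0]
genotype_1 = [11.0, 12.0, 13.0, 14.0, 15.0]   # shifted mean, same spread
genotype_2 = [0.0, 10.0, 20.0, 30.0, 40.0]    # much larger spread

f_mean_shift_only = brown_forsythe_statistic([genotype_0, genotype_1])
f_variance_effect = brown_forsythe_statistic([genotype_0, genotype_1, genotype_2])
```

The first comparison involves a pure mean shift, so the statistic is zero even though a mean-effect test would be highly significant. Adding the third group, whose spread is much larger, produces a large statistic, which is the kind of signal a variance QTL analysis looks for.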


Subjects
Bayes Theorem; Genomics/methods; Linear Models; Models, Genetic; Quantitative Trait Loci; Analysis of Variance; Computational Biology; Humans; Phenotype