ABSTRACT
Children with single ventricle heart disease (SVHD) experience morbidity due to inadequate pulmonary blood flow. Using proteomic screening, our group previously identified members of the matrix metalloproteinase (MMP), tissue inhibitor of metalloproteinase (TIMP), and fibroblast growth factor (FGF) families as potentially dysregulated in SVHD. No prior study has taken a targeted approach to mapping circulating levels of these protein families or their relationship to pulmonary vascular outcomes in SVHD. We performed a prospective cohort study of 70 SVHD infants pre-Stage 2 palliation and 24 healthy controls. We report targeted serum quantification of 39 proteins in the MMP, TIMP, and FGF families using the SomaScan platform. Clinical variables were extracted from the medical record. Twenty of 39 tested proteins (7/14 MMPs, 2/4 TIMPs, and 11/21 FGFs) differed between cases and controls. On single variable testing, 6 proteins and no clinical covariates were associated with both post-Stage 2 hypoxemia and length of stay. Multiple-protein modeling identified increased circulating MMP 7 and MMP 17, and decreased circulating MMP 8 and FGFR2 as most associated with post-Stage 2 hypoxemia; increased MMP 7 and TIMP 4 and decreased circulating MMP 1 and MMP 8 were most associated with post-operation length of stay. The MMP, TIMP, and FGF families are altered in SVHD. Pre-Stage 2 imbalance of extracellular matrix (ECM) proteins, namely increased MMP 7 and decreased MMP 8, was associated with multiple adverse post-operation outcomes. Maintenance of the ECM may be an important pathophysiologic driver of Stage 2 readiness in SVHD.
Subjects
Heart Diseases, Matrix Metalloproteinase 8, Child, Humans, Infant, Matrix Metalloproteinase 8/metabolism, Matrix Metalloproteinase 7/metabolism, Tissue Inhibitor of Metalloproteinase-1/metabolism, Prospective Studies, Proteomics, Extracellular Matrix/metabolism, Biomarkers, Extracellular Matrix Proteins/metabolism, Heart Diseases/metabolism
ABSTRACT
Contemporary high-throughput experimental and surveying techniques give rise to ultrahigh-dimensional supervised problems with sparse signals; that is, a limited number of observations (n), each with a very large number of covariates (p >> n), only a small share of which is truly associated with the response. In these settings, major concerns on computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covariates before the use of any sophisticated statistical analysis. Along the lines of Sure Independence Screening (Fan and Lv, 2008) and other model- and correlation-based feature screening methods, we propose a model-free procedure called Covariate Information Number - Sure Independence Screening (CIS). CIS uses a marginal utility connected to the notion of the traditional Fisher Information, possesses the sure screening property, and is applicable to any type of response (features) with continuous features (response). Simulations and an application to transcriptomic data on rats reveal the comparative strengths of CIS over some popular feature screening methods.
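As a rough illustration of the marginal screening idea the abstract builds on, the sketch below implements correlation-based Sure Independence Screening in the style of Fan and Lv (2008), not the CIS utility itself, whose exact form is not given here. The function names `pearson` and `sis_screen` and the toy data are assumptions made for this example; the default of keeping about n / log(n) features follows the suggestion in Fan and Lv (2008).

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def sis_screen(X, y, d=None):
    """Return the indices of the d features with the largest |marginal correlation|.

    X is a list of n rows, each holding p feature values; y has length n.
    """
    n, p = len(X), len(X[0])
    if d is None:
        d = max(1, int(n / math.log(n)))  # default size suggested by Fan and Lv (2008)
    d = min(d, p)
    utilities = [abs(pearson([row[j] for row in X], y)) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: utilities[j], reverse=True)
    return sorted(ranked[:d])


# Toy data (illustrative only): n = 40 observations, p = 500 features,
# with the response driven by features 3 and 17.
rng = random.Random(0)
n, p = 40, 500
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[3] - 2.0 * row[17] + rng.gauss(0, 0.1) for row in X]
selected = sis_screen(X, y)  # indices of the features kept for later modeling
```

Screening reduces the feature space from p to d before any heavier model is fitted; a model-free utility such as the one CIS proposes would replace `pearson` in this skeleton.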
ABSTRACT
When analyzing large datasets from high-throughput technologies, researchers often encounter missing quantitative measurements, which are particularly frequent in metabolomics datasets. In metabolomics, the comprehensive profiling of metabolite abundances, measurements are typically obtained using mass spectrometry technologies that often introduce missingness via multiple mechanisms: (1) the metabolite signal may be smaller than the instrument limit of detection; (2) the conditions under which the data are collected and processed may lead to missing values; (3) missing values can be introduced randomly. Missingness resulting from mechanism (1) would be classified as Missing Not At Random (MNAR), that from mechanism (2) would be Missing At Random (MAR), and that from mechanism (3) would be classified as Missing Completely At Random (MCAR). Two common approaches for handling missing data are the following: (1) omit missing data from the analysis; (2) impute the missing values. Both approaches may introduce bias and reduce statistical power in downstream analyses such as testing metabolite associations with clinical variables. Further, standard imputation methods in metabolomics often ignore the mechanisms causing missingness and inaccurately estimate missing values within a data set. We propose a mechanism-aware imputation algorithm that leverages a two-step approach in imputing missing values. First, we use a random forest classifier to classify the missing mechanism for each missing value in the data set. Second, we impute each missing value using imputation algorithms that are specific to the predicted missingness mechanism (i.e., MAR/MCAR or MNAR). Using complete data, we conducted simulations, where we imposed different missingness patterns within the data and tested the performance of combinations of imputation algorithms. Our proposed algorithm provided imputations closer to the original data than those using only one imputation algorithm for all the missing values. Consequently, our two-step approach was able to reduce bias for improved downstream analyses.
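The two-step structure described above (classify the mechanism, then impute with a mechanism-specific method) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random forest classifier is replaced by a hypothetical abundance-threshold rule (`classify_mechanism`), MNAR values are filled with half the minimum observed value, and MAR/MCAR values are filled with the observed mean. The abstract does not specify which imputation algorithms are paired with each mechanism, so both choices are stand-ins.

```python
import statistics

MNAR, MAR_MCAR = "MNAR", "MAR/MCAR"


def classify_mechanism(observed, cutoff):
    """Hypothetical stand-in for the random forest classifier.

    Labels a missing value MNAR when its metabolite is low-abundance on
    average (plausibly below the detection limit), otherwise MAR/MCAR.
    """
    return MNAR if statistics.mean(observed) < cutoff else MAR_MCAR


def impute_two_step(matrix, cutoff):
    """Impute None entries; rows are samples, columns are metabolites.

    Assumes every metabolite has at least one observed value.
    Returns the imputed matrix and the predicted label for each missing cell.
    """
    imputed = [list(row) for row in matrix]
    labels = {}
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not None]
        for i, row in enumerate(matrix):
            if row[j] is not None:
                continue
            # Step 1: predict the missingness mechanism for this cell.
            label = classify_mechanism(observed, cutoff)
            labels[(i, j)] = label
            # Step 2: apply the imputation method matched to that mechanism.
            if label == MNAR:
                imputed[i][j] = min(observed) / 2  # left-censored: half-minimum
            else:
                imputed[i][j] = statistics.mean(observed)  # stand-in for MAR/MCAR
    return imputed, labels


# Toy data (illustrative only): 4 samples, 2 metabolites, None marks a missing value.
data = [[10.0, 0.4], [12.0, None], [None, 0.2], [14.0, 0.6]]
imputed, labels = impute_two_step(data, cutoff=1.0)
```

The point of the dispatch is that a value missing because it fell below the detection limit should be imputed low, whereas a value missing for unrelated reasons is better estimated from the observed distribution.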
Subjects
Algorithms, Metabolomics, Bias, Mass Spectrometry/methods, Metabolomics/methods
ABSTRACT
BACKGROUND: Metabolomic analysis is commonly used to understand the biological underpinning of diseases such as obesity. However, our knowledge of gut metabolites related to weight outcomes in young children is currently limited. OBJECTIVES: To (1) explore the relationships between metabolites and child weight outcomes, (2) determine the potential effect of covariates (e.g., child's diet, maternal health/habits during pregnancy, etc.) in the relationship between metabolites and child weight outcomes, and (3) explore the relationship between selected gut metabolites and gut microbiota abundance. METHODS: Using ¹H-NMR, we quantified 30 metabolites from stool samples of 170 two-year-old children. To identify metabolites and covariates associated with children's weight outcomes (BMI [weight/height²], BMI z-score [BMI adjusted for age and sex], and growth index [weight/height]), we analysed the ¹H-NMR data, along with 20 covariates recorded on children and mothers, using LASSO and best subset selection regression techniques. Previously characterized microbiota community information from the same stool samples was used to determine associations between selected gut metabolites and gut microbiota. RESULTS: At age 2 years, stool butyrate concentration had a significant positive association with child BMI (p-value = 3.58 × 10⁻⁴), BMI z-score (p-value = 3.47 × 10⁻⁴), and growth index (p-value = 7.73 × 10⁻⁴). Covariates such as maternal smoking during pregnancy are important to consider. Butyrate concentration was positively associated with the abundance of the bacterial genus Faecalibacterium (p-value = 9.61 × 10⁻³). CONCLUSIONS: Stool butyrate concentration is positively associated with increased child weight outcomes and should be investigated further as a factor affecting childhood obesity.
Subjects
Gastrointestinal Microbiome, Pediatric Obesity, Body Mass Index, Butyrates, Child, Preschool Child, Feces, Female, Humans, Mothers, Pediatric Obesity/epidemiology, Pregnancy
ABSTRACT
BACKGROUND: Since the onset of the SARS-CoV-2 pandemic, most clinical testing has focused on RT-PCR [1]. Host epigenome manipulation post coronavirus infection [2-4] suggests that DNA methylation signatures may differentiate patients with SARS-CoV-2 infection from uninfected individuals, and help predict COVID-19 disease severity, even at initial presentation. METHODS: We customized Illumina's Infinium MethylationEPIC array to enhance immune response detection and profiled peripheral blood samples from 164 COVID-19 patients with longitudinal measurements of disease severity and 296 patient controls. RESULTS: Epigenome-wide association analysis revealed 13,033 genome-wide significant methylation sites for case-vs-control status. Genes and pathways involved in interferon signaling and viral response were significantly enriched among differentially methylated sites. We observe highly significant associations at genes previously reported in genetic association studies (e.g. IRF7, OAS1). Using machine learning techniques, models built using sparse regression yielded highly predictive findings: cross-validated best fit AUC was 93.6% for case-vs-control status, and 79.1%, 80.8%, and 84.4% for hospitalization, ICU admission, and progression to death, respectively. CONCLUSIONS: In summary, the strong COVID-19-specific epigenetic signature in peripheral blood driven by key immune-related pathways related to infection status, disease severity, and clinical deterioration provides insights useful for diagnosis and prognosis of patients with viral infections.
Viral infections affect the body in many ways, including via changes to the epigenome, the sum of chemical modifications to an individual's collection of genes that affect gene activity. Here, we analyzed the epigenome in blood samples from people with and without COVID-19 to determine whether we could find changes consistent with SARS-CoV-2 infection. Using a combination of statistical and machine learning techniques, we identified markers of SARS-CoV-2 infection as well as of severity and progression of COVID-19 disease. These signals of disease progression were already present in the initial blood draw, taken when patients first arrived at the hospital. Together, these approaches demonstrate the potential of measuring the epigenome for monitoring SARS-CoV-2 status and severity.
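For readers unfamiliar with the AUC figures quoted in the abstract above, the sketch below shows one standard way to compute the area under the ROC curve from predicted scores and binary labels, using its equivalence to the probability that a randomly chosen case is scored higher than a randomly chosen control (ties count as one half). The function name `auc` and the toy scores are assumptions made for illustration; this is not the study's modeling or cross-validation pipeline.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    scores: predicted risk scores; labels: 1 for case, 0 for control.
    """
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0  # case ranked above control
            elif c == k:
                wins += 0.5  # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy scores (illustrative only): two cases followed by two controls.
example_auc = auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a reported value of 93.6% means a case outranks a control in roughly 94 of 100 random pairs.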