1.
Respir Med ; 234: 107802, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39260678

ABSTRACT

BACKGROUND: The underlying population of patients selected for each respiratory monoclonal antibody might change as other biologics are approved. OBJECTIVE: To evaluate effect modification by calendar time of the effectiveness of each respiratory biologic in asthma. METHODS: The Effectiveness of Respiratory biologics in Asthma (ERA) is a retrospective cohort of severe asthma patients from the Mass General Brigham clinics between January 2013 and September 2023. Periods were pre-specified as the anti-IgE (2013-2015), anti-IL5 (2016-2018), anti-IL4/13 (2019-2021) or anti-alarmin (2022-2023) era. We evaluated each biologic's effect on asthma-related exacerbations comparing the one-year period before and after therapy initiation using Poisson regression and Cox regression for time-to-first exacerbation. RESULTS: Of 647 biologic-naïve patients, 165 initiated omalizumab, 235 anti-IL5, 227 dupilumab, and 20 tezepelumab. Omalizumab's effectiveness improved as more biologics were approved: incidence rate ratio (IRR) 1.16 [0.94-1.43] anti-IgE era vs. 0.54 [0.37-0.80] anti-IL4/13-alarmin era. Omalizumab patients in the anti-IL4/13-alarmin era had lower blood eosinophil counts and less chronic rhinosinusitis with nasal polyps (CRSwNP). For anti-IL5s, effectiveness peaked in the anti-IL4/13 era (IRR 0.52 [0.42-0.64]) when patients had higher BMI and less concomitant CRSwNP. Dupilumab was most effective in the anti-IL4/13 era (IRR 0.60 [0.50-0.72]). There were fewer current smokers in dupilumab patients in the anti-IL4/13 era. Results were similar in time-to-event analyses and in sensitivity analyses accounting for the COVID-19 pandemic. CONCLUSION: There are temporal variations in the effectiveness of biologics partly explained by the shift in the underlying population, particularly for omalizumab. Though having more choices was associated with better patient selection for omalizumab, this was inconsistent for other biologics.
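
As a rough illustration of the before/after rate comparison this abstract describes, the sketch below computes an incidence rate ratio with a Wald confidence interval from two aggregate event counts. The function name, the counts, and the person-years are hypothetical. The sketch treats the two periods as independent Poisson counts, which ignores the within-patient pairing that the study's regression models would account for.

```python
import math


def incidence_rate_ratio(events_after, time_after, events_before, time_before, z=1.96):
    """Return (IRR, lower, upper) using a Wald interval on the log scale.

    Assumes both event counts are positive and come from independent
    Poisson processes (a simplification, see the note above).
    """
    irr = (events_after / time_after) / (events_before / time_before)
    # Standard error of log(IRR) for two independent Poisson counts
    se_log = math.sqrt(1.0 / events_after + 1.0 / events_before)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical numbers, not taken from the study
irr, lower, upper = incidence_rate_ratio(60, 50.0, 100, 50.0)
print(f"IRR = {irr:.2f} [{lower:.2f}-{upper:.2f}]")
```

An interval that excludes 1 would suggest a change in the exacerbation rate under these simplifying assumptions.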

2.
Genet Epidemiol ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982682

ABSTRACT

The prediction of the susceptibility of an individual to a certain disease is an important and timely research area. An established technique is to estimate the risk of an individual with the help of an integrated risk model, that is, a polygenic risk score with added epidemiological covariates. However, integrated risk models do not capture any time dependence, and may provide a point estimate of the relative risk with respect to a reference population. The aim of this work is twofold. First, we explore and advocate the idea of predicting the time-dependent hazard and survival (defined as disease-free time) of an individual for the onset of a disease. This provides a practitioner with a much more differentiated view of absolute survival as a function of time. Second, to compute the time-dependent risk of an individual, we use published methodology to fit a Cox's proportional hazard model to data from a genetic SNP study of time to Alzheimer's disease (AD) onset, using the lasso to incorporate further epidemiological variables such as sex, APOE (apolipoprotein E, a genetic risk factor for AD) status, 10 leading principal components, and selected genomic loci. We apply the lasso for Cox's proportional hazards to a data set of 6792 AD patients (composed of 4102 cases and 2690 controls) and 87 covariates. We demonstrate that fitting a lasso model for Cox's proportional hazards allows one to obtain more accurate survival curves than with state-of-the-art (likelihood-based) methods. Moreover, the methodology allows one to obtain personalized survival curves for a patient, thus giving a much more differentiated view of the expected progression of a disease than the view offered by integrated risk models. The runtime to compute personalized survival curves is under a minute for the entire data set of AD patients, thus enabling it to handle datasets with 60,000-100,000 subjects in less than 1 h.

3.
J Cardiovasc Dev Dis ; 11(7)2024 Jun 27.
Article in English | MEDLINE | ID: mdl-39057616

ABSTRACT

Background: Coronary artery calcium (CAC) is a marker of subclinical atherosclerosis and is a complex heritable trait with both genetic and environmental risk factors, including sex and smoking. Methods: We performed genome-wide association (GWA) analyses for CAC among all participants and stratified by sex in the COPDGene study (n = 6144 participants of European ancestry and n = 2589 participants of African ancestry) with replication in the Diabetes Heart Study (DHS). We adjusted for age, sex, current smoking status, BMI, diabetes, self-reported high blood pressure, self-reported high cholesterol, and genetic ancestry (as summarized by principal components computed within each racial group). For the significant signals from the GWA analyses, we examined the single nucleotide polymorphism (SNP) by sex interactions, stratified by smoking status (current vs. former), and tested for a SNP by smoking status interaction on CAC. Results: We identified genome-wide significant associations for CAC in the chromosome 9p21 region [CDKN2B-AS1] among all COPDGene participants (p = 7.1 × 10^-14) and among males (p = 1.0 × 10^-9), but the signal was not genome-wide significant among females (p = 6.4 × 10^-6). For the sex stratified GWA analyses among females, the chromosome 6p24 region [PHACTR1] had a genome-wide significant association (p = 4.4 × 10^-8) with CAC, but this signal was not genome-wide significant among all COPDGene participants (p = 1.7 × 10^-7) or males (p = 0.03). There was a significant interaction for the SNP rs9349379 in PHACTR1 with sex (p = 0.02), but the interaction was not significant for the SNP rs10757272 in CDKN2B-AS1 with sex (p = 0.21). In addition, PHACTR1 had a stronger association with CAC among current smokers (p = 6.2 × 10^-7) than former smokers (p = 7.5 × 10^-3) and the SNP by smoking status interaction was marginally significant (p = 0.03). CDKN2B-AS1 had a strong association with CAC among both former (p = 7.7 × 10^-8) and current smokers (p = 1.7 × 10^-7) and the SNP by smoking status interaction was not significant (p = 0.40). Conclusions: Among current and former smokers of European ancestry in the COPDGene study, we identified a genome-wide significant association in the chromosome 6p24 region [PHACTR1] with CAC among females, but not among males. This region had a significant SNP by sex and SNP by smoking interaction on CAC.

4.
Clin Pharmacol Ther ; 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39044346

ABSTRACT

The priority review voucher was established to incentivize research and development of treatments for traditionally underfunded diseases and was extended to medical countermeasures from 2016 to 2023, despite limited evidence of an association between the voucher program and increased product development. To determine whether the voucher program has incentivized initiation of new medical countermeasures in clinical trials, we created three cohorts of material threats: (i) COVID-19, (ii) opioid pharmaceutical-based agents, and (iii) all others. Using the Citeline Trialtrove database, we determined the number of medical countermeasures initiated in clinical trials from 2009-2016 and 2017-2023. Eligibility of COVID-19 products for the voucher was confirmed with the issuance of a voucher for remdesivir in October 2020, so we compared January 2020-October 2020 to November 2020-July 2023. We fit two Poisson models, one before and one after voucher creation, within each cohort. Among COVID-19 medical countermeasures, there was a decrease in the proportion of drugs initiated before (4.5%; 95% CI, 1.0 to 8.3%) vs. after voucher eligibility (-5.1%; 95% CI, -6.1 to -4.0%) (P = 0.01). Among medical countermeasures for opioid pharmaceutical-based agents, the rate of new drugs initiated did not change from 2009-2016 (8.1%; 95% CI, -4.4 to 22.6%) to 2017-2023 (5.6%; 95% CI, -3.2 to 15.2%) (P = 0.82). Among all other medical countermeasures, the rate of new drugs initiated also did not change from 2009-2016 (6.6%; 95% CI, -5.6 to 20.8%) to 2017-2023 (-14.8%; 95% CI, -29.2 to 2.0%) (P = 0.15). The priority review voucher program was not associated with stimulating new clinical testing of investigational medical countermeasures.

5.
Ecol Evol ; 14(6): e11530, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38895566

ABSTRACT

The capacity of forests to sequester carbon in both above- and belowground compartments is a crucial tool to mitigate rising atmospheric carbon concentrations. Belowground carbon storage in forests is strongly linked to soil microbial communities that are the key drivers of soil heterotrophic respiration, organic matter decomposition and thus nutrient cycling. However, the relationships between tree diversity and soil microbial properties such as biomass and respiration remain unclear with inconsistent findings among studies. It is unknown so far how the spatial configuration and soil depth affect the relationship between tree richness and microbial properties. Here, we studied the spatial distribution of soil microbial properties in the context of a tree diversity experiment by measuring soil microbial biomass and respiration in subtropical forests (BEF-China experiment). We sampled soil cores at two depths at five locations along a spatial transect between the trees in mono- and hetero-specific tree pairs of the native deciduous species Liquidambar formosana and Sapindus saponaria. Our analyses showed decreasing soil microbial biomass and respiration with increasing soil depth and distance from the tree in mono-specific tree pairs. We calculated belowground overyielding of soil microbial biomass and respiration (that is, higher microbial biomass or respiration than expected from the monocultures) and analysed the distribution patterns along the transect. We found no general overyielding across all sampling positions and depths. Yet, we encountered a spatial pattern of microbial overyielding with a significant microbial overyielding close to L. formosana trees and microbial underyielding close to S. saponaria trees. We found similar spatial patterns across microbial properties and depths that only differed in the strength of their effects. Our results highlight the importance of small-scale variations of tree-tree interaction effects on soil microbial communities and functions and call for better integration of within-plot variability to understand biodiversity-ecosystem functioning relationships.

6.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38836403

ABSTRACT

In precision medicine, both predicting the disease susceptibility of an individual and forecasting that individual's disease-free survival are areas of key research. Besides the classical epidemiological predictor variables, data from multiple (omic) platforms are increasingly available. To integrate this wealth of information, we propose new methodology to combine both cooperative learning, a recent approach to leverage the predictive power of several datasets, and polygenic hazard score models. Polygenic hazard score models provide a practitioner with a more differentiated view of the predicted disease-free survival than the one given by merely a point estimate, for instance computed with a polygenic risk score. Our aim is to leverage the advantages of cooperative learning for the computation of polygenic hazard score models via Cox's proportional hazard model, thereby improving the prediction of the disease-free survival. In our experimental study, we apply our methodology to forecast the disease-free survival for Alzheimer's disease (AD) using three layers of data. One layer contains epidemiological variables such as sex, APOE (apolipoprotein E, a genetic risk factor for AD) status and 10 leading principal components. Another layer contains selected genomic loci, and the last layer contains methylation data for selected CpG sites. We demonstrate that the survival curves computed via cooperative learning yield an AUC of around 0.7, above the state-of-the-art performance of its competitors. Importantly, the proposed methodology returns (1) a linear score that can be easily interpreted (in contrast to machine learning approaches), and (2) a weighting of the predictive power of the involved data layers, allowing for an assessment of the importance of each omic (or other) platform. Similarly to polygenic hazard score models, our methodology also allows one to compute individual survival curves for each patient.


Subjects
Alzheimer Disease; Precision Medicine; Humans; Precision Medicine/methods; Alzheimer Disease/genetics; Alzheimer Disease/mortality; Disease-Free Survival; Machine Learning; Proportional Hazards Models; Multifactorial Inheritance; Male; Female; Multiomics

7.
Genes (Basel) ; 15(5)2024 04 27.
Article in English | MEDLINE | ID: mdl-38790194

ABSTRACT

Depression is heritable, differs by sex, and has environmental risk factors such as cigarette smoking. However, the effect of single nucleotide polymorphisms (SNPs) on depression through cigarette smoking and the role of sex is unclear. In order to examine the association of SNPs with depression and smoking in the UK Biobank with replication in the COPDGene study, we used counterfactual-based mediation analysis to test the indirect or mediated effect of SNPs on broad depression through the log of pack-years of cigarette smoking, adjusting for age, sex, current smoking status, and genetic ancestry (via principal components). In secondary analyses, we adjusted for age, sex, current smoking status, genetic ancestry (via principal components), income, education, and living status (urban vs. rural). In addition, we examined sex-stratified mediation models and sex-moderated mediation models. For both analyses, we adjusted for age, current smoking status, and genetic ancestry (via principal components). In the UK Biobank, rs6424532 [LOC105378800] had a statistically significant indirect effect on broad depression through the log of pack-years of cigarette smoking (p = 4.0 × 10^-4) among all participants and a marginally significant indirect effect among females (p = 0.02) and males (p = 4.0 × 10^-3). Moreover, rs10501696 [GRM5] had a marginally significant indirect effect on broad depression through the log of pack-years of cigarette smoking (p = 0.01) among all participants and a significant indirect effect among females (p = 2.2 × 10^-3). In the secondary analyses, the sex-moderated indirect effect was marginally significant for rs10501696 [GRM5] on broad depression through the log of pack-years of cigarette smoking (p = 0.01). In the COPDGene study, the effect of an SNP (rs10501696) in GRM5 on depressive symptoms and medication was mediated by log of pack-years (p = 0.02); however, no SNPs had a sex-moderated mediated effect on depressive symptoms. In the UK Biobank, we found SNPs in two genes [LOC105378800, GRM5] with an indirect effect on broad depression through the log of pack-years of cigarette smoking. In addition, the indirect effect for GRM5 on broad depression through smoking may be moderated by sex. These results suggest that genetic regions associated with broad depression may be mediated by cigarette smoking and this relationship may be moderated by sex.


Subjects
Depression; Polymorphism, Single Nucleotide; Humans; Male; Female; Depression/genetics; Depression/epidemiology; Middle Aged; Aged; Smoking/genetics; Sex Factors; Genetic Predisposition to Disease; United Kingdom/epidemiology; Cigarette Smoking/genetics; Cigarette Smoking/adverse effects; Risk Factors

8.
Alzheimers Dement ; 20(5): 3397-3405, 2024 05.
Article in English | MEDLINE | ID: mdl-38563508

ABSTRACT

INTRODUCTION: Genome-wide association studies have identified numerous disease susceptibility loci (DSLs) for Alzheimer's disease (AD). However, only a limited number of studies have investigated the dependence of the genetic effect size of established DSLs on genetic ancestry. METHODS: We utilized the whole genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) including 35,569 participants. A total of 25,459 subjects in four distinct populations (African ancestry, non-Hispanic White, admixed Hispanic, and Asian) were analyzed. RESULTS: We found that nine DSLs showed significant heterogeneity across populations. Single nucleotide polymorphism (SNP) rs2075650 in translocase of outer mitochondrial membrane 40 (TOMM40) showed the largest heterogeneity (Cochran's Q = 0.00, I^2 = 90.08), followed by other SNPs in apolipoprotein C1 (APOC1) and apolipoprotein E (APOE). Two additional loci, signal-induced proliferation-associated 1 like 2 (SIPA1L2) and solute carrier 24 member 4 (SLC24A4), showed significant heterogeneity across populations. DISCUSSION: We observed substantial heterogeneity for the APOE-harboring 19q13.32 region with TOMM40/APOE/APOC1 genes. The largest risk effect was seen among African Americans, while Asians showed a surprisingly small risk effect.


Subjects
Alzheimer Disease; Genetic Predisposition to Disease; Genome-Wide Association Study; Mitochondrial Precursor Protein Import Complex Proteins; Polymorphism, Single Nucleotide; Humans; Alzheimer Disease/genetics; Genetic Predisposition to Disease/genetics; Polymorphism, Single Nucleotide/genetics; Apolipoproteins E/genetics; Female; Male; Apolipoprotein C-I/genetics; Aged; Membrane Transport Proteins/genetics; Genetic Loci/genetics

9.
BMC Bioinformatics ; 25(1): 43, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273228

ABSTRACT

The computation of a similarity measure for genomic data is a standard tool in computational genetics. The principal components of such matrices are routinely used to correct for biases due to confounding by population stratification, for instance in linear regressions. However, the calculation of both a similarity matrix and its singular value decomposition (SVD) are computationally intensive. The contribution of this article is threefold. First, we demonstrate that the calculation of three matrices (called the covariance matrix, the weighted Jaccard matrix, and the genomic relationship matrix) can be reformulated in a unified way which allows for the application of a randomized SVD algorithm, which is faster than the traditional computation. The fast SVD algorithm we present is adapted from an existing randomized SVD algorithm and ensures that all computations are carried out in sparse matrix algebra. The algorithm only assumes that row-wise and column-wise subtraction and multiplication of a vector with a sparse matrix is available, an operation that is efficiently implemented in common sparse matrix packages. An exception is the so-called Jaccard matrix, which does not have a structure applicable for the fast SVD algorithm. Second, an approximate Jaccard matrix is introduced to which the fast SVD computation is applicable. Third, we establish guaranteed theoretical bounds on the accuracy (in [Formula: see text] norm and angle) between the principal components of the Jaccard matrix and the ones of our proposed approximation, thus putting the proposed Jaccard approximation on a solid mathematical foundation, and derive the theoretical runtime of our algorithm. We illustrate that the approximation error is low in practice and empirically verify the theoretical runtime scalings on both simulated data and data of the 1000 Genome Project.


Subjects
Genome; Genomics; Algorithms; Linear Models

10.
ArXiv ; 2023 Sep 25.
Article in English | MEDLINE | ID: mdl-37808094

ABSTRACT

Principal components computed via PCA (principal component analysis) are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. In this paper, we explore the penalized eigenvalue problem (PEP) which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty constraint. The contribution of our article is threefold. First, we extend PEP by applying Nesterov smoothing to the original LASSO-type L1 penalty. This allows one to compute analytical gradients which enable faster and more efficient minimization of the objective function associated with the optimization problem. Second, we demonstrate how higher order eigenvectors can be calculated with PEP using established results from singular value decomposition (SVD). Third, using data from the 1000 Genome Project dataset, we empirically demonstrate that our proposed smoothed PEP allows one to increase numerical stability and obtain meaningful eigenvectors. We further investigate the utility of the penalized eigenvector approach over traditional PCA.

11.
Epigenetics ; 18(1): 2257437, 2023 12.
Article in English | MEDLINE | ID: mdl-37731367

ABSTRACT

Background: Recent studies have identified thousands of associations between DNA methylation CpGs and complex diseases/traits, emphasizing the critical role of epigenetics in understanding disease aetiology and identifying biomarkers. However, association analyses based on methylation array data are susceptible to batch/slide effects, which can lead to inflated false positive rates or reduced statistical power. Results: We use multiple DNA methylation datasets based on the popular Illumina Infinium MethylationEPIC BeadChip array to describe consistent patterns and the joint distribution of slide effects across CpGs, confirming and extending previous results. The susceptible CpGs overlap with the Illumina Infinium HumanMethylation450 BeadChip array content. Conclusions: Our findings reveal systematic patterns in slide effects. The observations provide further insights into the characteristics of these effects and can improve existing adjustment approaches.


Subjects
DNA Methylation; Epigenesis, Genetic; Epigenomics; Multifactorial Inheritance

12.
Front Immunol ; 14: 1220028, 2023.
Article in English | MEDLINE | ID: mdl-37533854

ABSTRACT

Background: Influenza virus is responsible for a large global burden of disease, especially in children. Multiple Organ Dysfunction Syndrome (MODS) is a life-threatening and fatal complication of severe influenza infection. Methods: We measured RNA expression of 469 biologically plausible candidate genes in children admitted to North American pediatric intensive care units with severe influenza virus infection with and without MODS. Whole blood samples from 191 influenza-infected children (median age 6.4 years, IQR: 2.2, 11) were collected a median of 27 hours following admission; for 45 children a second blood sample was collected approximately seven days later. Extracted RNA was hybridized to NanoString mRNA probes, counts normalized, and analyzed using linear models controlling for age and bacterial co-infections (FDR q<0.05). Results: Comparing pediatric samples collected near admission, children with Prolonged MODS for ≥7 days (n=38; 9 deaths) had significant upregulation of nine mRNA transcripts associated with neutrophil degranulation (RETN, TCN1, OLFM4, MMP8, LCN2, BPI, LTF, S100A12, GUSB) compared to those who recovered more rapidly from MODS (n=27). These neutrophil transcripts present in early samples predicted Prolonged MODS or death when compared to patients who recovered; however, in paired longitudinal samples, they were not differentially expressed over time. Instead, five genes involved in protein metabolism and/or adaptive immunity signaling pathways (RPL3, MRPL3, HLA-DMB, EEF1G, CD8A) were associated with MODS recovery within a week. Conclusion: Thus, early increased expression of neutrophil degranulation genes indicated worse clinical outcomes in children with influenza infection, consistent with reports in adult cohorts with influenza, sepsis, and acute respiratory distress syndrome.


Subjects
Bacterial Infections; Influenza, Human; Humans; Multiple Organ Failure/genetics; Influenza, Human/genetics; Influenza, Human/complications; Transcriptome; Phenotype; Hospitalization; Bacterial Infections/complications

13.
Genes (Basel) ; 14(6)2023 05 24.
Article in English | MEDLINE | ID: mdl-37372314

ABSTRACT

We are interested in detecting a departure from the baseline in a longitudinal analysis in the context of multiple organ dysfunction syndrome (MODS). In particular, we are given gene expression reads at two time points for a fixed number of genes and individuals. The individuals can be subdivided into two groups, denoted as groups A and B. Using the two time points, we compute a contrast of gene expression reads per individual and gene. The age of each individual is known and it is used to compute, for each gene separately, a linear regression of the gene expression contrasts on the individual's age. Looking at the intercept of the linear regression to detect a departure from the baseline, we aim to reliably single out those genes for which there is a difference in the intercept among those individuals in group A and not in group B. In this work, we develop testing methodology for this setting based on two hypothesis tests, one under the null and one under an appropriately formulated alternative. We demonstrate the validity of our approach using a dataset created by bootstrapping from a real data application in the context of MODS.


Subjects
Multiple Organ Failure; Humans; Multiple Organ Failure/genetics; Multiple Organ Failure/diagnosis; Linear Models; Gene Expression

14.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36585781

ABSTRACT

Genetic similarity matrices are commonly used to assess population substructure (PS) in genetic studies. Through simulation studies and by the application to whole-genome sequencing (WGS) data, we evaluate the performance of three genetic similarity matrices: the unweighted and weighted Jaccard similarity matrices and the genetic relationship matrix. We describe different scenarios that can create numerical pitfalls and lead to incorrect conclusions in some instances. We consider scenarios in which PS is assessed based on loci that are located across the genome ('globally') and based on loci from a specific genomic region ('locally'). We also compare scenarios in which PS is evaluated based on loci from different minor allele frequency bins: common (>5%), low-frequency (5-0.5%) and rare (<0.5%) single-nucleotide variations (SNVs). Overall, we observe that all approaches provide the best clustering performance when computed based on rare SNVs. The performance of the similarity matrices is very similar for common and low-frequency variants, but for rare variants, the unweighted Jaccard matrix provides preferable clustering features. Based on visual inspection and in terms of standard clustering metrics, its clusters are the densest and the best separated in the principal component analysis of variants with rare SNVs compared with the other methods and different allele frequency cutoffs. In an application, we assessed the role of rare variants on local and global PS, using WGS data from multiethnic Alzheimer's disease data sets and European or East Asian populations from the 1000 Genome Project.
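
For readers unfamiliar with the unweighted Jaccard similarity matrix mentioned in this abstract, here is a minimal sketch under stated assumptions: each individual is represented by the set of variant positions where a minor allele is carried, and similarity is the size of the intersection divided by the size of the union. The function names, the toy genotype rows, and the convention for two empty carrier sets are illustrative choices, not the authors' implementation.

```python
from itertools import combinations


def jaccard(a, b):
    """Unweighted Jaccard index between two sets of variant indices."""
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty carrier sets count as identical
    return len(a & b) / len(union)


def jaccard_matrix(genotypes):
    """Pairwise Jaccard similarity for rows of minor-allele counts (0, 1 or 2)."""
    carriers = [{j for j, g in enumerate(row) if g > 0} for row in genotypes]
    n = len(carriers)
    sim = [[1.0] * n for _ in range(n)]
    for i, k in combinations(range(n), 2):
        sim[i][k] = sim[k][i] = jaccard(carriers[i], carriers[k])
    return sim


# Toy data: three individuals, four variant positions (hypothetical)
toy = [[1, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1]]
S = jaccard_matrix(toy)
for row in S:
    print(row)
```

The principal components of such a matrix are what a population-substructure analysis would then inspect or cluster.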


Subjects
Genome; Genomics; Principal Component Analysis; Gene Frequency; Computer Simulation; Genome-Wide Association Study; Polymorphism, Single Nucleotide

15.
Article in English | MEDLINE | ID: mdl-38179578

ABSTRACT

Quantum annealing is a specialized type of quantum computation that aims to use quantum fluctuations in order to obtain global minimum solutions of combinatorial optimization problems. Programmable D-Wave quantum annealers are available as cloud computing resources, which allow users low-level access to quantum annealing control features. In this article, we are interested in improving the quality of the solutions returned by a quantum annealer by encoding an initial state into the annealing process. We explore two D-Wave features that allow one to encode such an initial state: the reverse annealing (RA) and the h-gain (HG) features. RA aims to refine a known solution following an anneal path starting with a classical state representing a good solution, going backward to a point where a transverse field is present, and then finishing the annealing process with a forward anneal. The HG feature allows one to put a time-dependent weighting scheme on linear (h) biases of the Hamiltonian, and we demonstrate that this feature likewise can be used to bias the annealing to start from an initial state. We also consider a hybrid method consisting of a backward phase resembling RA and a forward phase using the HG initial state encoding. Importantly, we investigate the idea of iteratively applying RA and HG to a problem, with the goal of monotonically improving on an initial state that is not optimal. The HG encoding technique is evaluated on a variety of input problems including the edge-weighted maximum cut problem and the vertex-weighted maximum clique problem, demonstrating that the HG technique is a viable alternative to RA for some problems. We also investigate how the iterative procedures perform for both RA and HG initial state encodings on random whole-chip spin glasses with the native hardware connectivity of the D-Wave Chimera and Pegasus chips.

16.
BMC Bioinformatics ; 23(1): 547, 2022 Dec 19.
Article in English | MEDLINE | ID: mdl-36536276

ABSTRACT

As of June 2022, the GISAID database contains more than 11 million SARS-CoV-2 genomes, including several thousand nucleotide sequences for the most common variants such as delta or omicron. These SARS-CoV-2 strains have been collected from patients around the world since the beginning of the pandemic. We start by assessing the similarity of all pairs of nucleotide sequences using the Jaccard index and principal component analysis. As shown previously in the literature, an unsupervised cluster analysis applied to the SARS-CoV-2 genomes results in clusters of sequences according to certain characteristics such as their strain or their clade. Importantly, we observe that nucleotide sequences of common variants are often outliers in clusters of sequences stemming from variants identified earlier on during the pandemic. Motivated by this finding, we are interested in applying outlier detection to nucleotide sequences. We demonstrate that nucleotide sequences of common variants (such as alpha, delta, or omicron) can be identified solely based on a statistical outlier criterion. We argue that outlier detection might be a useful surveillance tool to identify emerging variants in real time as the pandemic progresses.


Subjects
COVID-19; Humans; Base Sequence; SARS-CoV-2; Cluster Analysis; Databases, Factual

17.
PLoS One ; 17(5): e0266752, 2022.
Article in English | MEDLINE | ID: mdl-35544468

ABSTRACT

To increase power and minimize bias in statistical analyses, quantitative outcomes are often adjusted for precision and confounding variables using standard regression approaches. The outcome is modeled as a linear function of the precision variables and confounders; however, for many complex phenotypes, the assumptions of the linear regression models are not always met. As an alternative, we used neural networks for the modeling of complex phenotypes and covariate adjustments. We compared the prediction accuracy of the neural network models to that of classical approaches based on linear regression. Using data from the UK Biobank, COPDGene study, and Childhood Asthma Management Program (CAMP), we examined the features of neural networks in this context and compared them with traditional regression approaches for prediction of three outcomes: forced expiratory volume in one second (FEV1), age at smoking cessation, and log transformation of age at smoking cessation (due to age at smoking cessation being right-skewed). We used mean squared error to compare neural network and regression models, and found the models performed similarly unless the observed distribution of the phenotype was skewed, in which case the neural network had smaller mean squared error. Our results suggest neural network models have an advantage over standard regression approaches when the phenotypic distribution is skewed. However, when the distribution is not skewed, the approaches performed similarly. Our findings are relevant to studies that analyze phenotypes that are skewed by nature or where the phenotype of interest is skewed as a result of the ascertainment condition.


Subjects
Neural Networks, Computer , Smoking , Forced Expiratory Volume/genetics , Phenotype , Spirometry
18.
Sci Rep ; 12(1): 8539, 2022 May 20.
Article in English | MEDLINE | ID: mdl-35595786

ABSTRACT

Quantum annealers manufactured by D-Wave Systems, Inc., are computational devices capable of finding high-quality heuristic solutions to NP-hard problems. In this contribution, we explore the potential and effectiveness of such quantum annealers for computing Boolean tensor networks. Tensors offer a natural way to model high-dimensional data commonplace in many scientific fields, and representing a binary tensor as a Boolean tensor network is the task of expressing a tensor containing categorical (i.e., [Formula: see text]) values as a product of low-dimensional binary tensors. A Boolean tensor network is computed by Boolean tensor decomposition, and it is usually not exact. The aim of such a decomposition is to minimize a given distance measure between the high-dimensional input tensor and the product of lower-dimensional (usually three-dimensional) tensors and matrices representing the tensor network. In this paper, we introduce and analyze three general algorithms for Boolean tensor networks: Tucker, Tensor Train, and Hierarchical Tucker networks. The computation of a Boolean tensor network is reduced to a sequence of Boolean matrix factorizations, which we show can be expressed as a quadratic unconstrained binary optimization problem suitable for solving on a quantum annealer. Using a novel method that we introduce, called parallel quantum annealing, we demonstrate that Boolean tensors with up to millions of elements can be decomposed efficiently using a D-Wave 2000Q quantum annealer.
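The building block named above, Boolean matrix factorization, can be illustrated with a minimal sketch of the Boolean matrix product and the distance a factorization tries to minimize. The Hamming distance is assumed here as the distance measure, and the function names are hypothetical; the reduction to a quadratic unconstrained binary optimization problem described in the abstract is not reproduced.

```python
def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is OR over k of (A[i][k] AND B[k][j])."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [
        [int(any(A[i][k] and B[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(rows)
    ]


def hamming_distance(X, Y):
    """Number of entries in which two equally sized binary matrices differ
    (assumed distance measure for this sketch)."""
    return sum(x != y for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))


# A 3x3 binary matrix and a candidate rank-2 Boolean factorization of it.
X = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 1, 0],
     [0, 1, 1]]

reconstruction = boolean_product(A, B)
error = hamming_distance(X, reconstruction)
```

A factorization is exact when the error is zero; in general a solver searches over binary `A` and `B` to make this error as small as possible.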

19.
Sci Rep ; 12(1): 4499, 2022 Mar 16.
Article in English | MEDLINE | ID: mdl-35296721

ABSTRACT

Quantum annealers of D-Wave Systems, Inc., offer an efficient way to compute high-quality solutions to NP-hard problems. This is done by mapping a problem onto the physical qubits of the quantum chip, from which a solution is obtained after quantum annealing. However, since the connectivity of the physical qubits on the chip is limited, a minor embedding of the problem structure onto the chip is required. In this process, and especially for smaller problems, many qubits will stay unused. We propose a novel method, called parallel quantum annealing, to make better use of available qubits, wherein either the same or several independent problems are solved in the same annealing cycle of a quantum annealer, assuming enough physical qubits are available to embed more than one problem. Although the individual solution quality may be slightly decreased when solving several problems in parallel (as opposed to solving each problem separately), we demonstrate that our method may give dramatic speed-ups in terms of the Time-To-Solution (TTS) metric for solving instances of the Maximum Clique problem when compared to solving each problem sequentially on the quantum annealer. Additionally, we show that solving a single Maximum Clique problem using parallel quantum annealing reduces the TTS significantly.
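At the logical level, solving several independent problems in one cycle amounts to merging their objective functions over disjoint sets of variables. The sketch below shows that idea under stated assumptions: each problem is a QUBO stored as a dictionary of coefficients, a brute-force enumerator stands in for the quantum annealer, and the minor embedding onto physical qubits is ignored. All function names and the two toy problems are hypothetical.

```python
from itertools import product


def combine_qubos(qubos, sizes):
    """Merge independent QUBOs into one by shifting variable indices so that
    the problems occupy disjoint blocks of variables."""
    combined, offsets, offset = {}, [], 0
    for qubo, n in zip(qubos, sizes):
        offsets.append(offset)
        for (i, j), coeff in qubo.items():
            combined[(i + offset, j + offset)] = coeff
        offset += n
    return combined, offsets, offset


def energy(qubo, x):
    """QUBO objective: sum of coeff * x_i * x_j over all stored terms."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in qubo.items())


def brute_force_minimum(qubo, n):
    """Exhaustive minimizer; a classical stand-in for one annealing run."""
    return min(product((0, 1), repeat=n), key=lambda x: energy(qubo, x))


def split_solution(x, offsets, sizes):
    """Cut the joint solution back into one solution per original problem."""
    return [x[o:o + n] for o, n in zip(offsets, sizes)]


# Two toy problems: pick at most one of two variables, and a maximum
# independent set on the path 0-1-2 (reward -1 per vertex, penalty 2 per edge).
q1 = {(0, 0): -1, (1, 1): -1, (0, 1): 2}
q2 = {(0, 0): -1, (1, 1): -1, (2, 2): -1, (0, 1): 2, (1, 2): 2}
sizes = [2, 3]

combined, offsets, total = combine_qubos([q1, q2], sizes)
best = brute_force_minimum(combined, total)
parts = split_solution(best, offsets, sizes)
```

Because the blocks share no variables, the minimum of the combined objective equals the sum of the individual minima, so one solve returns a solution for every problem.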

20.
Genes (Basel) ; 13(1), 2022 Jan 06.
Article in English | MEDLINE | ID: mdl-35052450

ABSTRACT

Polygenic risk scores are a popular means to predict the disease risk or disease susceptibility of an individual based on their genotype information. When other important epidemiological covariates such as age or sex are added, we speak of an integrated risk model. Methodological advances for fitting more accurate integrated risk models are of immediate importance to improve the precision of risk prediction, thereby potentially identifying patients at high risk early on, when they are still able to benefit from preventive steps or interventions targeted at increasing their odds of survival, or at reducing their chance of getting a disease in the first place. This article proposes a smoothed version of the "Lassosum" penalty used to fit polygenic risk scores and integrated risk models using either summary statistics or raw data. The smoothing allows one to obtain explicit gradients everywhere for efficient minimization of the Lassosum objective function while guaranteeing bounds on the accuracy of the fit. An experimental section on both Alzheimer's disease and COPD (chronic obstructive pulmonary disease) demonstrates the increased accuracy of the proposed smoothed Lassosum penalty compared to the original Lassosum algorithm (for the datasets under consideration), bringing it on par with state-of-the-art methodology such as LDpred2 when evaluated via the AUC (area under the ROC curve) metric.
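Why smoothing yields gradients everywhere together with an accuracy bound can be seen on the absolute value, the non-differentiable part of an L1-type penalty. The sketch below uses a Huber-type smoothing with parameter `mu`; this is one common choice and an assumption here, since the abstract does not specify which smoothing the article applies. The smoothed function differs from |x| by at most mu/2.

```python
def smoothed_abs(x, mu):
    """Huber-type smoothing of |x|: quadratic for |x| <= mu, linear outside.
    Satisfies 0 <= |x| - smoothed_abs(x, mu) <= mu / 2 for all x."""
    if abs(x) <= mu:
        return x * x / (2.0 * mu)
    return abs(x) - mu / 2.0


def smoothed_abs_grad(x, mu):
    """Gradient of the smoothed absolute value; defined everywhere, including x = 0."""
    if abs(x) <= mu:
        return x / mu
    return 1.0 if x > 0 else -1.0


def smoothed_l1_penalty(beta, lam, mu):
    """Smoothed surrogate for the L1 penalty lam * sum(|beta_j|)."""
    return lam * sum(smoothed_abs(b, mu) for b in beta)


# Largest gap between |x| and its smoothed version on a grid of points in [-2, 2].
mu = 0.5
grid = [i / 100.0 for i in range(-200, 201)]
max_gap = max(abs(x) - smoothed_abs(x, mu) for x in grid)
```

A smaller `mu` tightens the approximation but makes the gradient change faster near zero, which is the usual trade-off when choosing the smoothing parameter.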


Subjects
Algorithms , Alzheimer Disease/genetics , Genetic Predisposition to Disease , Models, Genetic , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/genetics , Aged , Aged, 80 and over , Alzheimer Disease/pathology , Case-Control Studies , Female , Genome-Wide Association Study , Humans , Middle Aged , Pulmonary Disease, Chronic Obstructive/pathology