1.
PLoS Genet ; 19(9): e1010902, 2023 09.
Article in English | MEDLINE | ID: mdl-37738239

ABSTRACT

Common genetic variants associated with lung cancer have been well studied in the past decade. However, only 12.3% heritability has been explained by these variants. In this study, we investigate the contribution of rare variants (RVs) (minor allele frequency <0.01) to lung cancer through two large whole exome sequencing case-control studies. We first performed gene-based association tests using a novel Bayes Factor statistic in the International Lung Cancer Consortium, the discovery study (European, 1042 cases vs. 881 controls). The top genes identified are further assessed in the UK Biobank (European, 630 cases vs. 172,864 controls), the replication study. After controlling for the false discovery rate, we found two genes, CTSL and APOE, significantly associated with lung cancer in both studies. Single variant tests in UK Biobank identified 4 RVs (3 missense variants) in CTSL and 2 RVs (1 missense variant) in APOE strongly associated with lung cancer (OR between 2.0 and 139.0). The role of these genetic variants in the regulation of CTSL or APOE expression remains unclear. If such a role is established, this could have important therapeutic implications for lung cancer patients.


Subject(s)
Lung Neoplasms, Humans, Bayes Theorem, Exome Sequencing, Lung Neoplasms/genetics, Case-Control Studies, Apolipoproteins E/genetics
2.
Proc Natl Acad Sci U S A ; 119(49): e2207824119, 2022 12 06.
Article in English | MEDLINE | ID: mdl-36454756

ABSTRACT

Revealing the molecular events associated with reprogramming different somatic cell types to pluripotency is critical for understanding the characteristics of induced pluripotent stem cell (iPSC) therapeutic derivatives. Inducible reprogramming factor transgenic cells or animals, designated as secondary (2°) reprogramming systems, not only provide excellent experimental tools for such studies but also offer a strategy to study the variances in cellular reprogramming outcomes due to different in vitro and in vivo environments. To make such studies less cumbersome, it is desirable to have a variety of efficient reprogrammable mouse systems to induce successful mass reprogramming in somatic cell types. Here, we report the development of two transgenic mouse lines from which 2° cells reprogram with unprecedented efficiency. These systems were derived by exposing primary reprogramming cells containing doxycycline-inducible Yamanaka factor expression to a transient interruption in transgene expression, resulting in selection for a subset of clones with robust transgene response. These systems also include reporter genes enabling easy readout of endogenous Oct4 activation (GFP), indicative of pluripotency, and reprogramming transgene expression (mCherry). Notably, somatic cells derived from various fetal and adult tissues from these 2° mouse lines gave rise to highly efficient and rapid reprogramming, with transgene-independent iPSC colonies emerging as early as 1 week after induction. These mouse lines serve as a powerful tool to explore sources of variability in reprogramming and the mechanistic underpinnings of efficient reprogramming systems.


Subject(s)
Cellular Reprogramming, Doxycycline, Animals, Mice, Transgenic Mice, Cellular Reprogramming/genetics, Transgenes, Clone Cells, Doxycycline/pharmacology
3.
Stat Med ; 43(5): 1048-1082, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38118464

ABSTRACT

State-of-the-art biostatistics methods allow for the simultaneous modeling of several correlated non-fatal disease processes over time, but there is no clear guidance on the optimal analysis in most settings. An example occurs in diabetes, where it is not known with certainty how microvascular complications of the eyes, kidneys, and nerves co-develop over time. In this article, we propose and contrast two general model frameworks for studying complications (sequential state and parallel trajectory frameworks) and review multivariate methods for their analysis, focusing on multistate and joint modeling. We illustrate these methods in a tutorial format using the long-term follow-up from the Diabetes Control and Complications Trial and Epidemiology of Diabetes Interventions and Complications study public data repository. A formal comparison of prediction error and discrimination is included. Multistate models are particularly advantageous for determining the order and timing of complications, but require discretization of the longitudinal outcomes and possibly a very complex state space process. Intermittent observation of the states must be accounted for, and discretization is a probable disadvantage in this setting. In contrast, joint models can account for variations of continuous biomarkers over time and are particularly designed for modeling complex association structures between the complications and for performing dynamic predictions of an outcome of interest to inform clinical decisions (e.g., a late-stage complication). We found that both models have helpful features that can better inform our understanding of the complex trajectories that complications may take and can therefore help with decision making for patients presenting with diabetes complications.


Subject(s)
Diabetes Complications, Diabetes Mellitus, Humans, Diabetes Complications/epidemiology, Diabetes Mellitus/epidemiology, Probability, Clinical Trials as Topic
4.
Pharm Stat ; 23(1): 60-80, 2024.
Article in English | MEDLINE | ID: mdl-37717945

ABSTRACT

The sum of the longest diameter (SLD) of the target lesions is a longitudinal biomarker used to assess tumor response in cancer clinical trials, which can inform about early treatment effect. This biomarker is semicontinuous, often characterized by an excess of zeros and right skewness. Conditional two-part joint models were introduced to account for the excess of zeros in the longitudinal biomarker distribution and link it to a time-to-event outcome. A limitation of the conditional two-part model is that it only provides an effect of covariates, such as treatment, on the conditional mean of positive biomarker values, and not an overall effect on the biomarker, which is often of clinical relevance. As an alternative, we propose in this article a marginalized two-part joint model (M-TPJM) for the repeated measurements of the SLD and a terminal event, where the covariates affect the overall mean of the biomarker. Our simulation studies confirmed the good performance of the marginalized model in terms of estimation and coverage rates. Our application of the M-TPJM to a randomized clinical trial of advanced head and neck cancer shows that panitumumab in combination with chemotherapy increases the odds of observing a disappearance of all target lesions compared to chemotherapy alone, leading to a possible indirect effect of the combined treatment on time to death.


Subject(s)
Head and Neck Neoplasms, Statistical Models, Humans, Computer Simulation, Head and Neck Neoplasms/drug therapy, Biomarkers, Longitudinal Studies
5.
Biostatistics ; 23(1): 50-68, 2022 01 13.
Article in English | MEDLINE | ID: mdl-32282877

ABSTRACT

Joint models for a longitudinal biomarker and a terminal event have gained interest for evaluating cancer clinical trials because the tumor evolution directly reflects the state of the disease. A biomarker characterizing the tumor size evolution over time can be highly informative for assessing treatment options and could be taken into account in addition to the survival time. The biomarker often has a semicontinuous distribution, i.e., it is zero inflated and right skewed. An appropriate model is needed for the longitudinal biomarker as well as an association structure with the survival outcome. In this article, we propose a joint model for a longitudinal semicontinuous biomarker and a survival time. The semicontinuous nature of the longitudinal biomarker is specified by a two-part model, which splits its distribution into a binary outcome (first part) represented by the positive versus zero values and a continuous outcome (second part) with the positive values only. Survival times are modeled with a proportional hazards model for which we propose three association structures with the biomarker. Our simulation studies show some bias can arise in the parameter estimates when the semicontinuous nature of the biomarker is ignored, assuming the true model is a two-part model. An application to advanced metastatic colorectal cancer data from the GERCOR study is performed where our two-part model is compared to one-part joint models. Our results show that treatment arm B (FOLFOX6/FOLFIRI) is associated with higher SLD values over time and its positive association with the terminal event leads to an increased risk of death compared to treatment arm A (FOLFIRI/FOLFOX6).


Subject(s)
Colorectal Neoplasms, Statistical Models, Biomarkers, Colorectal Neoplasms/drug therapy, Computer Simulation, Humans, Longitudinal Studies
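
The two-part split described in this abstract can be illustrated on a toy sample. The sketch below is only a descriptive summary, not the authors' joint model: it ignores covariates, random effects, and the survival outcome, and the function name `two_part_summary` and the example values are hypothetical. It shows that the overall mean of a semicontinuous biomarker equals the probability of a positive value times the mean among positive values.

```python
from statistics import mean


def two_part_summary(values):
    """Summarise a semicontinuous sample by its binary and continuous parts.

    Returns (probability of a positive value,
             mean among positive values,
             overall mean recovered as the product of the two).
    """
    if not values:
        raise ValueError("need at least one measurement")
    positives = [v for v in values if v > 0]
    # First part: binary outcome, positive versus zero.
    prob_positive = len(positives) / len(values)
    # Second part: continuous outcome, positive values only.
    cond_mean = mean(positives) if positives else 0.0
    # Overall (marginal) mean implied by the two parts.
    overall_mean = prob_positive * cond_mean
    return prob_positive, cond_mean, overall_mean


# Hypothetical tumour-size measurements (mm) with an excess of zeros.
sld_mm = [0.0, 0.0, 12.0, 30.0, 0.0, 18.0]
prob_positive, cond_mean, overall_mean = two_part_summary(sld_mm)
print(prob_positive, cond_mean, overall_mean)
```

A covariate can move the two parts in different directions, which is why a model fitted to one part alone does not directly describe the overall mean.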
6.
PLoS Genet ; 16(6): e1008790, 2020 06.
Article in English | MEDLINE | ID: mdl-32525877

ABSTRACT

Recent discoveries from large-scale genome-wide association studies (GWASs) explain a larger proportion of the genetic variability in BMI and obesity. The genetic risk associated with BMI and obesity can be assessed by an obesity-specific genetic risk score (GRS) constructed from genome-wide significant genetic variants. The aim of our study is to examine whether the duration and exclusivity of breastfeeding can attenuate BMI increase during childhood and adolescence due to genetic risks. A total sample of 5,266 children (2,690 boys and 2,576 girls) from the Avon Longitudinal Study of Parents and Children (ALSPAC) was used for the analysis. We evaluated the role of breastfeeding (exclusivity and duration) in modulating BMI increase attributed to the GRS from birth to 18 years of age. The GRS was composed of 69 variants associated with adult BMI and 25 non-overlapping SNPs associated with pediatric BMI. In the high genetic susceptibility group (upper GRS quartile), exclusive breastfeeding (EBF) to 5 months reduces BMI by 1.14 kg/m² (95% CI, 0.37 to 1.91, p = 0.0037) in 18-year-old boys, which compensates for a 3.9-decile GRS increase. In 18-year-old girls, EBF to 5 months decreases BMI by 1.53 kg/m² (95% CI, 0.76 to 2.29, p < 0.0001), which compensates for a 7.0-decile GRS increase. EBF acts early in life by delaying the age at adiposity peak and at adiposity rebound. EBF to 3 months or non-exclusive breastfeeding was associated with a significantly diminished impact on reducing BMI growth during childhood. EBF influences early life growth and development and thus may play a critical role in preventing overweight and obesity among children at high risk due to genetic factors.


Subject(s)
Body Mass Index, Breast Feeding/statistics & numerical data, Obesity/genetics, Adiposity/genetics, Adolescent, Breast Feeding/methods, Female, Genetic Predisposition to Disease, Humans, Infant, Male, Obesity/epidemiology, Obesity/prevention & control, Single Nucleotide Polymorphism, Young Adult
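
A genetic risk score of the kind mentioned in this abstract is commonly computed as a weighted sum of risk-allele counts. The sketch below assumes that common construction; the abstract does not say how the study weighted its variants, and the SNP labels, weights, and function name are hypothetical placeholders.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts over the variants in `weights`.

    dosages: mapping of variant label to the number of risk alleles (0, 1 or 2).
    weights: mapping of variant label to a per-allele effect size.
    Both mappings here are illustrative; a missing variant raises KeyError.
    """
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical per-allele weights and one child's genotype.
example_weights = {"snp_a": 0.10, "snp_b": 0.20, "snp_c": 0.05}
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(genetic_risk_score(example_dosages, example_weights))
```

Setting every weight to 1 gives the unweighted variant of the score, a plain count of risk alleles.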
7.
Biom J ; 65(4): e2100322, 2023 04.
Article in English | MEDLINE | ID: mdl-36846925

ABSTRACT

Two-part joint models for a longitudinal semicontinuous biomarker and a terminal event have been recently introduced based on frequentist estimation. The biomarker distribution is decomposed into a probability of positive value and the expected value among positive values. Shared random effects can represent the association structure between the biomarker and the terminal event. The computational burden increases compared to standard joint models with a single regression model for the biomarker. In this context, the frequentist estimation implemented in the R package frailtypack can be challenging for complex models (i.e., a large number of parameters and dimension of the random effects). As an alternative, we propose a Bayesian estimation of two-part joint models based on the Integrated Nested Laplace Approximation (INLA) algorithm to alleviate the computational burden and fit more complex models. Our simulation studies confirm that INLA provides accurate approximation of posterior estimates and leads to reduced computation time and variability of estimates compared to frailtypack in the situations considered. We contrast the Bayesian and frequentist approaches in the analysis of two randomized cancer clinical trials (GERCOR and PRIME studies), where INLA has a reduced variability for the association between the biomarker and the risk of event. Moreover, the Bayesian approach was able to characterize subgroups of patients associated with different responses to treatment in the PRIME study. Our study suggests that the Bayesian approach using the INLA algorithm makes it possible to fit complex joint models that might be of interest in a wide range of clinical applications.


Subject(s)
Statistical Models, Neoplasms, Humans, Bayes Theorem, Computer Simulation, Algorithms
8.
Biometrics ; 77(1): 316-328, 2021 03.
Article in English | MEDLINE | ID: mdl-32277476

ABSTRACT

The discovery of rare genetic variants through next generation sequencing is a very challenging issue in the field of human genetics. We propose a novel region-based statistical approach based on a Bayes Factor (BF) to assess evidence of association between a set of rare variants (RVs) located on the same genomic region and a disease outcome in the context of a case-control design. Marginal likelihoods are computed under the null and alternative hypotheses assuming a binomial distribution for the RV count in the region and a beta or mixture of Dirac and beta prior distribution for the probability of RV. We derive the theoretical null distribution of the BF under our prior setting and show that a Bayesian control of the false discovery rate can be obtained for genome-wide inference. Informative priors are introduced using prior evidence of association from a Kolmogorov-Smirnov test statistic. We use our simulation program, sim1000G, to generate RV data similar to the 1000 genomes sequencing project. Our simulation studies showed that the new BF statistic outperforms standard methods (SKAT, SKAT-O, Burden test) in case-control studies with moderate sample sizes and is equivalent to them under large sample size scenarios. Our real data application to a lung cancer case-control study found enrichment for RVs in known and novel cancer genes. It also suggests that using the BF with informative prior improves the overall gene discovery compared to the BF with noninformative prior.


Subject(s)
High-Throughput Nucleotide Sequencing, Bayes Theorem, Case-Control Studies, Computer Simulation, Humans, Sample Size
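
To make the idea of a Bayes factor built from binomial counts and a beta prior concrete, here is a minimal sketch. It is a generic beta-binomial comparison under assumptions of my own, not the statistic proposed in the paper. The null hypothesis gives cases and controls one shared RV probability, the alternative gives each group its own, and all probabilities receive the same Beta(a, b) prior. The Dirac-beta mixture prior and the informative priors from the abstract are not implemented, and every name and count is hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_bayes_factor(k_cases, n_cases, k_controls, n_controls, a=1.0, b=1.0):
    """Log Bayes factor of 'different RV probabilities' versus 'one shared probability'.

    k_*: rare-variant counts observed in the region; n_*: number of binomial trials.
    The binomial coefficients appear in both marginal likelihoods and cancel.
    """
    k = k_cases + k_controls
    n = n_cases + n_controls
    # Alternative: independent Beta(a, b) priors for cases and controls.
    log_m1 = (log_beta(k_cases + a, n_cases - k_cases + b)
              + log_beta(k_controls + a, n_controls - k_controls + b)
              - 2.0 * log_beta(a, b))
    # Null: a single probability with a Beta(a, b) prior for the pooled data.
    log_m0 = log_beta(k + a, n - k + b) - log_beta(a, b)
    return log_m1 - log_m0


# Hypothetical counts: similar rates favour the null, different rates the alternative.
print(log_bayes_factor(10, 100, 10, 100))
print(log_bayes_factor(40, 100, 5, 100))
```

A positive value is evidence for a difference between cases and controls, and a negative value is evidence for a shared probability.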
9.
J Stat Softw ; 97(7)2021 Mar.
Article in English | MEDLINE | ID: mdl-34512212

ABSTRACT

FamEvent is a comprehensive R package for simulating and modelling age-at-disease onset in families carrying a rare gene mutation. The package can simulate complex family data for variable time-to-event outcomes under three common family study designs (population, high-risk clinic and multi-stage) with various levels of missing genetic information among family members. Residual familial correlation can be induced through the inclusion of a frailty term or a second gene. Disease-gene carrier probabilities are evaluated assuming Mendelian transmission or empirically from the data. When genetic information on the disease gene is missing, an Expectation-Maximization algorithm is employed to calculate the carrier probabilities. Penetrance model functions with ascertainment correction adapted to the sampling design provide age-specific cumulative disease risks by sex, mutation status, and other covariates for simulated data as well as real data analysis. Robust standard errors and 95% confidence intervals are available for these estimates. Plots of pedigrees and penetrance functions based on the fitted model provide graphical displays to evaluate and summarize the models.

10.
BMC Bioinformatics ; 20(1): 26, 2019 Jan 15.
Article in English | MEDLINE | ID: mdl-30646839

ABSTRACT

BACKGROUND: Simulation of genetic variants data is frequently required for the evaluation of statistical methods in the fields of human and animal genetics. Although a number of high-quality genetic simulators have been developed, many of them require advanced knowledge in population genetics or in computation to be used effectively. In addition, generating simulated data in the context of family-based studies demands sophisticated methods and advanced computer programming. RESULTS: To address these issues, we propose a new user-friendly and integrated R package, sim1000G, which simulates variants in genomic regions among unrelated individuals or among families. The only input needed is a raw phased Variant Call Format (VCF) file. Haplotypes are extracted to compute linkage disequilibrium (LD) in the simulated genomic regions and for the generation of new genotype data among unrelated individuals. The covariance across variants is used to preserve the LD structure of the original population. Pedigrees of arbitrary sizes are generated by modeling recombination events with sim1000G. To illustrate the application of sim1000G, various scenarios are presented assuming unrelated individuals from a single population or two distinct populations, or alternatively for three-generation pedigree data. Sim1000G can capture allele frequency diversity, short and long-range linkage disequilibrium (LD) patterns and subtle population differences in LD structure without the need of any tuning parameters. CONCLUSION: Sim1000G fills a gap in the vast area of genetic variants simulators by its simplicity and independence from external tools. Currently, it is one of the few simulation packages completely integrated into R and able to simulate multiple genetic variants among unrelated individuals and within families. Its implementation will facilitate the application and development of computational methods for association studies with both rare and common variants.


Subject(s)
Computational Biology/methods, Genetic Linkage, Genetic Markers, Population Genetics, Genetic Models, Single Nucleotide Polymorphism, Software, Female, Humans, Linkage Disequilibrium, Male, Pedigree
11.
J Cell Mol Med ; 21(10): 2386-2402, 2017 10.
Article in English | MEDLINE | ID: mdl-28429508

ABSTRACT

The onset of labour in rodents and in humans is associated with physiological inflammation which is manifested by infiltration of activated maternal peripheral leukocytes (mPLs) into uterine tissues. Here, we used flow cytometry to immunophenotype mPLs throughout gestation and labour, both term and preterm. Peripheral blood was collected from non-pregnant women and pregnant women in the 1st, 2nd and 3rd trimesters. Samples were also collected from women in active labour at term (TL) or preterm (PTL) and compared with women at term not-in-labour (TNIL) and preterm not-in-labour (PTNIL). Different leukocyte populations were identified by surface markers such as CD45, CD14, CD15, CD3, CD4, CD8, CD19 and CD56. Their activation status was measured by the expression levels of CD11b, CD44, CD55, CD181 and CD192 proteins. Of all circulating CD45+ leukocytes, we detected significant increases in CD15+ granulocytes (i) in pregnant women versus non-pregnant; (ii) in TL women versus TNIL and versus pregnant women in the 1st/2nd/3rd trimester; (iii) in PTL women versus PTNIL. TL was characterized by (iv) increased expression of CD11b, CD55 and CD192 on granulocytes; (v) increased mean fluorescence intensity (MFI) of CD55 and CD192 on monocytes; (vi) increased CD44 MFI on CD3+ lymphocytes as compared to late gestation. In summary, we have identified sub-populations of mPLs that are specifically activated in association with gestation (granulocytes) or with the onset of labour (granulocytes, monocytes and lymphocytes). Additionally, beta regression analysis created a set of reference values to rank this association between immune markers of pregnancy and to identify activation status with potential prognostic and diagnostic capability.


Subject(s)
Immunophenotyping/methods, Obstetric Labor/immunology, Leukocytes/immunology, Premature Obstetric Labor/immunology, Term Birth/immunology, Adult, CD Antigens/immunology, CD Antigens/metabolism, Female, Flow Cytometry, Granulocytes/immunology, Granulocytes/metabolism, Humans, Obstetric Labor/blood, Leukocyte Count, Leukocytes/metabolism, Lymphocytes/immunology, Lymphocytes/metabolism, Monocytes/immunology, Monocytes/metabolism, Neutrophils/immunology, Neutrophils/metabolism, Premature Obstetric Labor/blood, Pregnancy, Term Birth/blood, Young Adult
12.
Biometrics ; 73(1): 271-282, 2017 03.
Article in English | MEDLINE | ID: mdl-27378229

ABSTRACT

In this article, we propose an association model to estimate the penetrance (risk) of successive cancers in the presence of competing risks. The association between the successive events is modeled via a copula and a proportional hazards model is specified for each competing event. This work is motivated by the analysis of successive cancers for people with Lynch Syndrome in the presence of competing risks. The proposed inference procedure is adapted to handle missing genetic covariates and selection bias, induced by the data collection protocol of the data at hand. The performance of the proposed estimation procedure is evaluated by simulations and its use is illustrated with data from the Colon Cancer Family Registry (Colon CFR).


Subject(s)
Hereditary Nonpolyposis Colorectal Neoplasms/pathology, Statistical Data Interpretation, Proportional Hazards Models, Analysis of Variance, Bias, Colonic Neoplasms, Computer Simulation, Genetics, Humans, Registries, Risk
13.
Gastroenterology ; 148(3): 556-64, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25479140

ABSTRACT

BACKGROUND & AIMS: We investigated the prevalence of germline mutations in APC, ATM, BRCA1, BRCA2, CDKN2A, MLH1, MSH2, MSH6, PALB2, PMS2, PRSS1, STK11, and TP53 in patients with pancreatic cancer. METHODS: The Ontario Pancreas Cancer Study enrolls consenting participants with pancreatic cancer from a province-wide electronic pathology database; 708 probands were enrolled from April 2003 through August 2012. To improve the precision of BRCA2 prevalence estimates, 290 probands were selected from 3 strata, based on family history of breast and/or ovarian cancer, pancreatic cancer, or neither. Germline DNA was analyzed by next-generation sequencing using a custom multiple-gene panel. Mutation prevalence estimates were calculated from the sample for the entire cohort. RESULTS: Eleven pathogenic mutations were identified: 3 in ATM, 1 in BRCA1, 2 in BRCA2, 1 in MLH1, 2 in MSH2, 1 in MSH6, and 1 in TP53. The prevalence of mutations in all 13 genes was 3.8% (95% confidence interval, 2.1%-5.6%). Carrier status was associated significantly with breast cancer in the proband or first-degree relative (P < .01), and with colorectal cancer in the proband or first-degree relative (P < .01), but not family history of pancreatic cancer, age at diagnosis, or stage at diagnosis. Of patients with a personal or family history of breast and colorectal cancer, 10.7% (95% confidence interval, 4.4%-17.0%) and 11.1% (95% confidence interval, 3.0%-19.1%) carried pathogenic mutations, respectively. CONCLUSIONS: A small but clinically important proportion of pancreatic cancer is associated with mutations in known predisposition genes. The heterogeneity of mutations identified in this study shows the value of using a multiple-gene panel in pancreatic cancer.


Subject(s)
Esophageal Achalasia/genetics, Neoplasm Genes/genetics, Alcoholic Hepatitis/immunology, Liver Transplantation/trends, Nitric Oxide Synthase Type I/genetics, Non-alcoholic Fatty Liver Disease/epidemiology, Pancreatic Neoplasms/genetics, Humans
14.
Stat Appl Genet Mol Biol ; 13(5): 567-87, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25153607

ABSTRACT

Genome-wide association studies have been successful in uncovering novel genetic variants that are associated with disease status or cross-sectional phenotypic traits. Researchers are beginning to investigate how genes play a role in the development of a trait over time. Linear mixed effects models (LMM) are commonly used to model longitudinal data; however, it is unclear if the failure to meet the model's distributional assumptions will affect the conclusions when conducting a genome-wide association study. In an extensive simulation study, we compare coverage probabilities, bias, type 1 error rates and statistical power when the error of the LMM is either heteroscedastic or has a non-Gaussian distribution. We conclude that the model is robust to misspecification if the same function of age is included in the fixed and random effects. However, type 1 error of the genetic effect over time is inflated, regardless of the model misspecification, if the polynomial function for age in the fixed and random effects differs. In situations where the model will not converge with a high order polynomial function in the random effects, a reduced function can be used but a robust standard error needs to be calculated to avoid inflation of the type 1 error. As an illustration, an LMM was applied to longitudinal body mass index (BMI) data over childhood in the ALSPAC cohort; the results emphasised the need for the robust standard error to ensure correct inference of associations of longitudinal BMI with chromosome 16 single nucleotide polymorphisms.


Subject(s)
Genome-Wide Association Study, Theoretical Models, Body Mass Index, Human Chromosome Pair 6, Humans, Longitudinal Studies, Single Nucleotide Polymorphism, Probability
15.
Hum Hered ; 78(3-4): 140-52, 2014.
Article in English | MEDLINE | ID: mdl-25342289

ABSTRACT

BACKGROUND/AIMS: Gene network analysis can be a very valuable approach for elucidating complex dependence between functional SNPs in a candidate genetic pathway and for assessing their association with a disease of interest. Even when the number of SNPs evaluated is relatively small (<20), the number of potential gene networks induced by the SNPs can be very large and the contingency tables representing their joint distribution very sparse. METHODS: In this paper, we propose a Bayesian model determination for gene network analysis using decomposable discrete graphical models combined with Reversible Jump Markov chain Monte Carlo. We show the application of this approach in a study of 13 SNPs in the DNA repair pathway and their association with breast cancer from a case-control study conducted in Ontario, Canada. RESULTS: The strength of associations among the SNPs and between the SNPs and the disease status is evaluated by computing the posterior probability of any pair of variables. The corresponding gene network is reconstructed by retaining pair-wise associations with the highest posterior probabilities. In our real data analysis, we found evidence for a particular association between one SNP in the gene POLL and the disease status and also several interesting patterns of association between the SNPs themselves. CONCLUSION: This general statistical framework could serve as a basis for prioritizing genes and SNPs that play a major role in breast cancer etiology and to better understand their complex interactions in a specific genetic pathway.


Subject(s)
Bayes Theorem, Breast Neoplasms/genetics, Gene Regulatory Networks, Statistical Models, Breast Neoplasms/epidemiology, Case-Control Studies, DNA Repair/genetics, Humans, Ontario/epidemiology, Single Nucleotide Polymorphism
16.
Hum Genet ; 133(8): 951-66, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24770874

ABSTRACT

This paper provides a review of recent applications of quantile regression to the fields of genetics and emerging -omic studies. It begins with a general background about this statistical approach following the seminal paper of Koenker and Bassett (Econometrica 46:33-50, 1978). Applications are described, as diverse as genetic association studies, penetrance estimation, gene expression, CGH array experiments, RNAseq experiments, methylation data and proteomics. This paper also introduces recent extensions of quantile regression with a particular focus on the Copula-quantile regression, an approach we recently proposed for sib-pair analysis. A real data example from eQTL analysis is then presented and the [Formula: see text] codes that run the analyses are provided. Finally, we conclude with a presentation of statistical software and some general statements about the potential and interest of quantile regression in modern biological experiments.


Subject(s)
Genetics, Genomics, Regression Analysis, Animals, Computer Simulation, Gene Expression, Humans, Metabolomics, Oligonucleotide Array Sequence Analysis, Proteomics
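
Quantile regression in the Koenker and Bassett formulation minimises the asymmetric "check" loss rather than squared error. The sketch below covers only the intercept-only case, where minimising the total check loss over a constant recovers a sample quantile. It is an illustration, not the code distributed with the review, and the function names and data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_by_check_loss(values, tau):
    """Return the observed value that minimises the total check loss.

    For a constant-only model a minimiser always exists among the data points,
    so searching the sorted observations is sufficient.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if not values:
        raise ValueError("need at least one observation")

    def total_loss(candidate):
        return sum(check_loss(y - candidate, tau) for y in values)

    return min(sorted(values), key=total_loss)


# Hypothetical expression values with one extreme observation.
expression = [1.0, 2.0, 3.0, 4.0, 100.0]
print(quantile_by_check_loss(expression, 0.5))   # median
print(quantile_by_check_loss(expression, 0.25))  # lower quartile
```

Because the loss grows linearly rather than quadratically, the extreme observation has little pull on the fitted quantile, which is one reason the approach suits skewed -omic measurements.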
17.
Stat Med ; 33(4): 618-38, 2014 Feb 20.
Article in English | MEDLINE | ID: mdl-23946183

ABSTRACT

Lynch Syndrome (LS) families harbor mutated mismatch repair genes, which predispose them to specific types of cancer. Because individuals within LS families can experience multiple cancers over their lifetime, we developed a progressive three-state model to estimate the disease risk from a healthy (state 0) to a first cancer (state 1) and then to a second cancer (state 2). Ascertainment correction of the likelihood was made to adjust for complex sampling designs with carrier probabilities for family members with missing genotype information estimated using their family's observed genotype and phenotype information in a one-step expectation-maximization algorithm. A sandwich variance estimator was employed to overcome possible model misspecification. The main objective of this paper is to estimate the disease risk (penetrance) for age at a second cancer after someone has experienced a first cancer that is also associated with a mutated gene. Simulation study results indicate that our approach generally provides unbiased risk estimates and low root mean squared errors across different family study designs, proportions of missing genotypes, and risk heterogeneities. An application to 12 large LS families from Newfoundland demonstrates that the risk for a second cancer was substantial and that the age at a first colorectal cancer significantly impacted the age at any subsequent LS cancer. This study provides new insights for developing more effective management of mutation carriers in LS families by providing more accurate multiple cancer risk estimates.


Subject(s)
Hereditary Nonpolyposis Colorectal Neoplasms/genetics, Heterozygote, Genetic Models, Mutation/genetics, Penetrance, Proportional Hazards Models, Age Factors, Algorithms, Computer Simulation, Female, Genetic Predisposition to Disease/genetics, Genotype, Humans, Male, Newfoundland and Labrador, Risk Assessment/methods
18.
PLoS Genet ; 7(2): e1001307, 2011 Feb.
Artículo en Inglés | MEDLINE | ID: mdl-21379325

RESUMEN

An age-dependent association between variation at the FTO locus and BMI in children has been suggested. We meta-analyzed associations between the FTO locus (rs9939609) and BMI in samples, aged from early infancy to 13 years, from 8 cohorts of European ancestry. We found a positive association between additional minor (A) alleles and BMI from 5.5 years onwards, but an inverse association below age 2.5 years. Modelling median BMI curves for each genotype using the LMS method, we found that carriers of minor alleles showed lower BMI in infancy, earlier adiposity rebound (AR), and higher BMI later in childhood. Differences by allele were consistent with two independent processes: earlier AR equivalent to accelerating developmental age by 2.37% (95% CI 1.87, 2.87, p = 10(-20)) per A allele and a positive age by genotype interaction such that BMI increased faster with age (p = 10(-23)). We also fitted a linear mixed effects model to relate genotype to the BMI curve inflection points adiposity peak (AP) in infancy and AR. Carriage of two minor alleles at rs9939609 was associated with lower BMI at AP (-0.40% (95% CI: -0.74, -0.06), p = 0.02), higher BMI at AR (0.93% (95% CI: 0.22, 1.64), p = 0.01), and earlier AR (-4.72% (-5.81, -3.63), p = 10(-17)), supporting cross-sectional results. Overall, we confirm the expected association between variation at rs9939609 and BMI in childhood, but only after an inverse association between the same variant and BMI in infancy. Patterns are consistent with a shift on the developmental scale, which is reflected in association with the timing of AR rather than just a global increase in BMI. Results provide important information about longitudinal gene effects and about the role of FTO in adiposity. The associated shifts in developmental timing have clinical importance with respect to known relationships between AR and both later-life BMI and metabolic disease risk.


Subject(s)
Body Mass Index , Genetic Association Studies , Genetic Loci/genetics , Genetic Variation , Growth and Development/genetics , Proteins/genetics , Adiposity/genetics , Adolescent , Alleles , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Body Height/genetics , Body Weight/genetics , Child , Child, Preschool , Cross-Sectional Studies , Female , Genotype , Humans , Infant , Infant, Newborn , Longitudinal Studies , Male , Meta-Analysis as Topic , Polymorphism, Single Nucleotide/genetics
19.
Vaccine ; 42(11): 2733-2739, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38521677

ABSTRACT

BACKGROUND: GENCOV is a prospective, observational cohort study of COVID-19-positive adults. Here, we characterize and compare side effects between COVID-19 vaccines and determine whether reactogenicity is exacerbated by prior SARS-CoV-2 infection. METHODS: Participants were recruited across Ontario, Canada. Participant-reported demographic and COVID-19 vaccination data were collected using a questionnaire. Multivariable logistic regression was performed to assess whether vaccine manufacturer, type, and previous SARS-CoV-2 infection are associated with reactogenicity. RESULTS: Responses were obtained from n = 554 participants. Tiredness and localized side effects were the most common reactions across vaccine doses. For most participants, side effects occurred and subsided within 1-2 days. Recipients of Moderna mRNA and AstraZeneca vector vaccines reported reactions more frequently compared to recipients of a Pfizer-BioNTech mRNA vaccine. Previous SARS-CoV-2 infection was independently associated with developing side effects. CONCLUSIONS: We provide evidence of relatively mild and short-lived reactions reported by participants who have received approved COVID-19 vaccines.


Subject(s)
COVID-19 Vaccines , COVID-19 , Adult , Humans , COVID-19 Vaccines/adverse effects , COVID-19/prevention & control , Prospective Studies , SARS-CoV-2 , Ontario/epidemiology
20.
Ann Appl Stat ; 17(3): 1958-1983, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37830084

ABSTRACT

Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific types of tissue or cell are available. In cancer research, the challenge is now to perform integrative analyses of high-dimensional multi-omic data with the goal of better understanding genomic processes that correlate with cancer outcomes, e.g., elucidating gene networks that discriminate specific cancer subgroups (cancer sub-typing) or discovering gene networks that overlap across different cancer types (pan-cancer studies). In this paper, we propose a novel mixed graphical model approach to analyze multi-omic data of different types (continuous, discrete and count) and perform model selection by extending the Birth-Death MCMC (BDMCMC) algorithm initially proposed by Stephens (2000) and later developed by Mohammadi and Wit (2015). We compare the performance of our method to the LASSO method and the standard BDMCMC method using simulations and find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results. Finally, an application to the TCGA breast cancer data shows that integrating genomic information at different levels (mutation and expression data) leads to better subtyping of breast cancers.
