1.
Am J Hum Genet ; 105(6): 1213-1221, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31761295

ABSTRACT

Polygenic prediction has the potential to contribute to precision medicine. Clumping and thresholding (C+T) is a widely used method to derive polygenic scores. When using C+T, several p value thresholds are tested to maximize predictive ability of the derived polygenic scores. Along with this p value threshold, we propose to tune three other hyper-parameters for C+T. We implement an efficient way to derive thousands of different C+T scores corresponding to a grid over four hyper-parameters. For example, it takes a few hours to derive 123K different C+T scores for 300K individuals and 1M variants using 16 physical cores. We find that optimizing over these four hyper-parameters improves the predictive performance of C+T in both simulations and real data applications as compared to tuning only the p value threshold. A particularly large increase can be noted when predicting depression status, from an AUC of 0.557 (95% CI: [0.544-0.569]) when tuning only the p value threshold to an AUC of 0.592 (95% CI: [0.580-0.604]) when tuning all four hyper-parameters we propose for C+T. We further propose stacked clumping and thresholding (SCT), a polygenic score that results from stacking all derived C+T scores. Instead of choosing one set of hyper-parameters that maximizes prediction in some training set, SCT learns an optimal linear combination of all C+T scores by using an efficient penalized regression. We apply SCT to eight different case-control diseases in the UK Biobank data and find that SCT substantially improves prediction accuracy, with an average AUC increase of 0.035 over standard C+T.
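
The clumping and thresholding idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name and the parameters `r2_max` and `p_max` are hypothetical, clumping is done over all variants at once (no window), and only two knobs are exposed because the abstract does not name the other hyper-parameters.

```python
import numpy as np

def clump_and_threshold(genotypes, betas, pvalues, r2_max=0.2, p_max=0.05):
    """Return one polygenic score per individual and the retained variant indices.

    Assumptions (illustrative only): genotypes is an individuals x variants
    array with no constant column; clumping ignores physical distance.
    """
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2  # squared variant correlations
    removed = np.zeros(len(pvalues), dtype=bool)
    clumped = []
    # Clumping: visit variants from most to least significant.
    for j in np.argsort(pvalues):
        if removed[j]:
            continue
        clumped.append(j)
        # Discard every variant too correlated with the index variant j.
        removed |= r2[j] > r2_max
    # Thresholding: keep only the clumped variants that pass the p value cutoff.
    kept = np.array([j for j in clumped if pvalues[j] <= p_max], dtype=int)
    score = genotypes[:, kept] @ betas[kept]
    return score, kept
```

Calling the function over a grid of `(r2_max, p_max)` pairs would yield one score per grid point, which is the kind of collection of scores the abstract describes tuning over or stacking.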


Subject(s)
Algorithms, Disease/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism, Biological Specimen Banks, Case-Control Studies, Computer Simulation, Humans, Genetic Models, United Kingdom
2.
BMC Med Res Methodol ; 22(1): 335, 2022 12 28.
Article in English | MEDLINE | ID: mdl-36577946

ABSTRACT

BACKGROUND: An external control arm is a cohort of control patients drawn from data external to a single-arm trial. To provide an unbiased estimate of efficacy, the clinical profiles of patients in the single arm and the external arm should be aligned, typically using propensity score approaches. Alternative approaches infer efficacy by comparing outcomes of single-arm patients with machine-learning predictions of control patient outcomes. These methods include G-computation and Doubly Debiased Machine Learning (DDML), and their evaluation for External Control Arm (ECA) analysis is insufficient. METHODS: We consider both numerical simulations and a trial replication procedure to evaluate the different statistical approaches: propensity score matching, Inverse Probability of Treatment Weighting (IPTW), G-computation, and DDML. The replication study relies on five type 2 diabetes randomized clinical trials accessed through the Yale University Open Data Access (YODA) project. From the pool of five trials, observational experiments are artificially built by replacing a control arm from one trial with an arm originating from another trial and containing similarly treated patients. RESULTS: Among the different statistical approaches, numerical simulations show that DDML has the smallest bias, followed by G-computation. G-computation usually minimizes mean squared error. Compared to other methods, DDML has variable mean squared error performance, which improves with increasing sample size. For hypothesis testing, all methods control type I error and DDML is the most conservative. G-computation is the best method in terms of statistical power, and DDML has comparable power at [Formula: see text] but lower power for smaller sample sizes. The replication procedure also indicates that G-computation minimizes mean squared error, whereas DDML performs between G-computation and propensity score approaches. The confidence intervals of G-computation are the narrowest, whereas confidence intervals obtained with DDML are the widest for small sample sizes, which confirms its conservative nature. CONCLUSIONS: For external control arm analyses, methods based on outcome prediction models can reduce estimation error and increase statistical power compared to propensity score approaches.
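
To make the outcome-prediction idea in this abstract concrete, here is a minimal Python sketch of G-computation for a single-arm trial with an external control arm. It rests on assumptions not stated in the abstract: the outcome model is an ordinary least-squares linear regression (the study evaluates machine-learning predictors), and the function and argument names are hypothetical.

```python
import numpy as np

def g_computation_effect(x_control, y_control, x_treated, y_treated):
    """Estimate the average treatment effect among single-arm (treated) patients.

    Illustrative assumption: outcomes of untreated patients follow a linear
    model in the covariates, fitted on the external controls only.
    """
    # Fit the outcome model on external controls: y ~ intercept + covariates.
    design_control = np.column_stack([np.ones(len(x_control)), x_control])
    coef, *_ = np.linalg.lstsq(design_control, y_control, rcond=None)
    # Predict what each treated patient's outcome would have been without treatment.
    design_treated = np.column_stack([np.ones(len(x_treated)), x_treated])
    predicted_untreated = design_treated @ coef
    # Compare observed treated outcomes with the predicted control outcomes.
    return float(np.mean(y_treated - predicted_untreated))
```

Unlike propensity score matching or IPTW, this estimator never models treatment assignment; its validity depends entirely on the outcome model being adequate for the treated patients' covariate range.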


Subject(s)
Type 2 Diabetes Mellitus, Humans, Bias, Computer Simulation, Type 2 Diabetes Mellitus/therapy, Machine Learning, Propensity Score, Research Design, Randomized Controlled Trials as Topic
3.
Mol Biol Evol ; 37(7): 2153-2154, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32343802

ABSTRACT

pcadapt is a user-friendly R package for performing genome scans for local adaptation. Here, we present version 4 of pcadapt, which substantially improves computational efficiency while providing similar results. This improvement is made possible by using a different format for storing genotypes and a different algorithm for computing principal components of the genotype matrix, which is the most computationally demanding step of the pcadapt method. These changes are seamlessly integrated into the existing pcadapt package, and users will experience a large reduction in computation time (by a factor of 20-60 in our analyses) as compared with previous versions.


Subject(s)
Biological Adaptation, Genomics/methods, Software
4.
Bioinformatics ; 36(16): 4449-4457, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32415959

ABSTRACT

MOTIVATION: Principal component analysis (PCA) of genetic data is routinely used to infer ancestry and control for population structure in various genetic analyses. However, conducting PCA can be complicated and has several potential pitfalls. These pitfalls include (i) capturing linkage disequilibrium (LD) structure instead of population structure, (ii) projected PCs that suffer from shrinkage bias, (iii) detecting sample outliers and (iv) uneven population sizes. In this work, we explore these potential issues when using PCA, and present efficient solutions to them. Following applications to the UK Biobank and the 1000 Genomes project datasets, we make recommendations for best practices and provide efficient and user-friendly implementations of the proposed solutions in R packages bigsnpr and bigutilsr. RESULTS: For example, we find that PC19-PC40 in the UK Biobank capture complex LD structure rather than population structure. Using our automatic algorithm for removing long-range LD regions, we recover 16 PCs that capture population structure only. Therefore, we recommend using only 16-18 PCs from the UK Biobank to account for population structure confounding. We also show how to use PCA to restrict analyses to individuals of homogeneous ancestry. Finally, when projecting individual genotypes onto the PCA computed from the 1000 Genomes project data, we find a shrinkage bias that becomes large for PC5 and beyond. We then demonstrate how to obtain unbiased projections efficiently using bigsnpr. Overall, we believe this work would be of interest for anyone using PCA in their analyses of genetic data, as well as for other omics data. AVAILABILITY AND IMPLEMENTATION: R packages bigsnpr and bigutilsr can be installed from either CRAN or GitHub (see https://github.com/privefl/bigsnpr). A tutorial on the steps to perform PCA on 1000G data is available at https://privefl.github.io/bigsnpr/articles/bedpca.html. All code used for this paper is available at https://github.com/privefl/paper4-bedpca/tree/master/code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Population Genetics, Software, Algorithms, Humans, Linkage Disequilibrium, Principal Component Analysis
5.
BMC Bioinformatics ; 21(1): 16, 2020 Jan 13.
Article in English | MEDLINE | ID: mdl-31931698

ABSTRACT

BACKGROUND: Cell-type heterogeneity of tumors is a key factor in tumor progression and response to chemotherapy. Tumor cell-type heterogeneity, defined as the proportion of the various cell-types in a tumor, can be inferred from DNA methylation of surgical specimens. However, confounding factors known to associate with methylation values, such as age and sex, complicate accurate inference of cell-type proportions. While reference-free algorithms have been developed to infer cell-type proportions from DNA methylation, a comparative evaluation of the performance of these methods is still lacking. RESULTS: Here we use simulations to evaluate several computational pipelines based on the software packages MeDeCom, EDec, and RefFreeEWAS. We identify that accounting for confounders, feature selection, and the choice of the number of estimated cell types are critical steps for inferring cell-type proportions. We find that removal of methylation probes which are correlated with confounder variables reduces the error of inference by 30-35%, and that selection of cell-type informative probes has similar effect. We show that Cattell's rule based on the scree plot is a powerful tool to determine the number of cell-types. Once the pre-processing steps are achieved, the three deconvolution methods provide comparable results. We observe that all the algorithms' performance improves when inter-sample variation of cell-type proportions is large or when the number of available samples is large. We find that under specific circumstances the methods are sensitive to the initialization method, suggesting that averaging different solutions or optimizing initialization is an avenue for future research. CONCLUSION: Based on the lessons learned, to facilitate pipeline validation and catalyze further pipeline improvement by the community, we develop a benchmark pipeline for inference of cell-type proportions and implement it in the R package medepir.


Subject(s)
Computational Biology/standards, DNA Methylation, Neoplasms/genetics, Algorithms, Computational Biology/methods, Computer Simulation, Humans, Software
6.
Mol Biol Evol ; 35(9): 2318-2326, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29931083

ABSTRACT

Admixture between populations provides an opportunity to study biological adaptation and phenotypic variation. Admixture studies rely on local ancestry inference for admixed individuals, which consists of computing at each locus the number of copies that originate from ancestral source populations. Existing software packages for local ancestry inference are tuned to provide accurate results on human data and recent admixture events. Here, we introduce Loter, an open-source software package that does not require any biological parameter besides haplotype data, in order to make local ancestry inference available for a wide range of species. Using simulations, we compare the performance of Loter to HAPMIX, LAMP-LD, and RFMix. HAPMIX is the only software severely impacted by imperfect haplotype reconstruction. Loter is the software least affected by increasing admixture time when considering simulated and admixed human genotypes. For simulations of admixed Populus genotypes, Loter and LAMP-LD are robust to increasing admixture times, in contrast to RFMix. When comparing the lengths of reconstructed and true ancestry tracts, Loter and LAMP-LD provide results whose accuracy is again more robust than that of RFMix to increasing admixture times. We apply Loter to individuals resulting from admixture between Populus trichocarpa and Populus balsamifera, and the lengths of ancestry tracts indicate that admixture took place ∼100 generations ago. We expect that providing rapid and parameter-free software for local ancestry inference will make genomic studies of admixture processes more accessible.


Subject(s)
Genetic Techniques, Software, Haplotypes, Humans, Populus/genetics
7.
Bioinformatics ; 34(16): 2781-2787, 2018 08 15.
Article in English | MEDLINE | ID: mdl-29617937

ABSTRACT

Motivation: Genome-wide datasets produced for association studies have dramatically increased in size over the past few years, with modern datasets commonly including millions of variants measured in tens of thousands of individuals. This increase in data size is a major challenge that severely slows down genomic analyses, leading to some software becoming obsolete and researchers having limited access to diverse analysis tools. Results: Here we present two R packages, bigstatsr and bigsnpr, allowing for the analysis of large scale genomic data to be performed within R. To address large data size, the packages use memory-mapping for accessing data matrices stored on disk instead of in RAM. To perform data pre-processing and data analysis, the packages integrate most of the tools that are commonly used, either through transparent system calls to existing software, or through updated or improved implementation of existing methods. In particular, the packages implement fast and accurate computations of principal component analysis and association studies, functions to remove single nucleotide polymorphisms in linkage disequilibrium and algorithms to learn polygenic risk scores on millions of single nucleotide polymorphisms. We illustrate applications of the two R packages by analyzing a case-control genomic dataset for celiac disease, performing an association study and computing polygenic risk scores. Finally, we demonstrate the scalability of the R packages by analyzing a simulated genome-wide dataset including 500 000 individuals and 1 million markers on a single desktop computer. Availability and implementation: https://privefl.github.io/bigstatsr/ and https://privefl.github.io/bigsnpr/. Supplementary information: Supplementary data are available at Bioinformatics online.
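
The memory-mapping strategy mentioned in this abstract can be illustrated with a short Python sketch. This is an analogy only, not the packages' R API: `numpy.memmap` stands in for their file-backed matrix format, and the function name, file name and block size are hypothetical.

```python
import os
import tempfile
import numpy as np

def column_means_on_disk(path, n_rows, n_cols, block_size=2):
    """Compute column means of an on-disk matrix, one block of columns at a time."""
    # The file is mapped, not loaded: data are paged in only when sliced.
    matrix = np.memmap(path, dtype=np.float64, mode="r", shape=(n_rows, n_cols))
    means = np.empty(n_cols)
    for start in range(0, n_cols, block_size):
        stop = min(start + block_size, n_cols)
        # Only this block of columns is copied into RAM.
        means[start:stop] = np.asarray(matrix[:, start:stop]).mean(axis=0)
    return means

# Toy usage: write a small matrix to a temporary file, then process it by blocks.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "genotypes.bin")
    data = np.arange(12, dtype=np.float64).reshape(4, 3)
    data.tofile(path)
    result = column_means_on_disk(path, n_rows=4, n_cols=3)
```

Processing by blocks keeps peak memory proportional to the block size rather than to the full matrix, which is the property that lets such analyses run on a single desktop computer.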


Subject(s)
Genomics, Algorithms, Human Genome, Humans, Multifactorial Inheritance, Single Nucleotide Polymorphism, Software
8.
Mol Ecol ; 28(9): 2360-2377, 2019 05.
Article in English | MEDLINE | ID: mdl-30849200

ABSTRACT

Multiple introductions are key features for the establishment and persistence of introduced species. However, little is known about the contribution of genetic admixture to the invasive potential of populations. To address this issue, we studied the recent invasion of the Asian tiger mosquito (Aedes albopictus) in Europe. Combining genome-wide single nucleotide polymorphisms and historical knowledge using an approximate Bayesian computation framework, we reconstruct the colonization routes and establish the demographic dynamics of invasion. The colonization of Europe involved at least three independent introductions in Albania, North Italy and Central Italy that subsequently acted as dispersal centres throughout Europe. We show that the topology of human transportation networks shaped demographic histories with North Italy and Central Italy being the main dispersal centres in Europe. Introduction modalities conditioned the levels of genetic diversity in invading populations, and genetically diverse and admixed populations promoted more secondary introductions and have spread farther than single-source invasions. This genomic study provides further crucial insights into a general understanding of the role of genetic diversity promoted by modern trade in driving biological invasions.


Subject(s)
Aedes/physiology, Genetic Variation, Introduced Species, Aedes/genetics, Animals, Bayes Theorem, Europe, Population Genetics, Italy, Single Nucleotide Polymorphism, Population Density
9.
Mol Biol Evol ; 33(4): 1082-93, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26715629

ABSTRACT

To characterize natural selection, various analytical methods for detecting candidate genomic regions have been developed. We propose to perform genome-wide scans of natural selection using principal component analysis (PCA). We show that the common FST index of genetic differentiation between populations can be viewed as the proportion of variance explained by the principal components. Considering the correlations between genetic variants and each principal component provides a conceptual framework to detect genetic variants involved in local adaptation without any prior definition of populations. To validate the PCA-based approach, we analyze the 1000 Genomes data (phase 1), comprising 850 individuals from Africa, Asia, and Europe. The number of genetic variants is on the order of 36 million, obtained at a low-coverage sequencing depth (3×). The correlations between genetic variation and each principal component provide well-known targets for positive selection (EDAR, SLC24A5, SLC45A2, DARC), and also new candidate genes (APPBPP2, TP1A1, RTTN, KCNMA, MYO5C) and noncoding RNAs. In addition to identifying genes involved in biological adaptation, we identify two biological pathways involved in polygenic adaptation that are related to the innate immune system (beta defensins) and to lipid metabolism (fatty acid omega oxidation). An additional analysis of European data shows that a genome scan based on PCA retrieves classical examples of local adaptation even when there are no well-defined populations. PCA-based statistics, implemented in the PCAdapt R package and the PCAdapt fast open-source software, retrieve well-known signals of human adaptation, which is encouraging for future whole-genome sequencing projects, especially when defining populations is difficult.
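
The core quantity in this abstract, the correlation between each genetic variant and each principal component, can be computed as in the following minimal Python sketch. It is not the PCAdapt implementation: the function name is hypothetical, genotypes are assumed to be a small in-memory individuals x variants array with no constant column, and no test statistic or p value is derived.

```python
import numpy as np

def variant_pc_correlations(genotypes, n_components=2):
    """Correlate each variant with each of the top principal components.

    Returns an array of shape (n_variants, n_components).
    """
    n = genotypes.shape[0]
    centered = genotypes - genotypes.mean(axis=0)
    scaled = centered / centered.std(axis=0)  # each variant: mean 0, variance 1
    # Columns of u are the principal component scores divided by their norm.
    u, _, _ = np.linalg.svd(scaled, full_matrices=False)
    # For unit-variance variants, corr(variant j, PC k) = <x_j, u_k> / sqrt(n).
    return scaled.T @ u[:, :n_components] / np.sqrt(n)
```

For a given variant, the squared correlations over all components sum to 1, so each squared correlation is the share of that variant's variance carried by one component; variants with unusually large values on a component are the candidates such a scan would flag.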


Subject(s)
Physiological Adaptation/genetics, Population Genetics, Principal Component Analysis/methods, Genetic Selection, Human Genome, Genomics, Humans, Tertiary Protein Structure, DNA Sequence Analysis, Software
10.
Mol Ecol ; 25(20): 5029-5042, 2016 10.
Article in English | MEDLINE | ID: mdl-27565448

ABSTRACT

Finding genetic signatures of local adaptation is of great interest for many population genetic studies. Common approaches to sorting selective loci from their genomic background focus on the extreme values of the fixation index, FST, across loci. However, the computation of the fixation index becomes challenging when the population is genetically continuous, when predefining subpopulations is a difficult task, and in the presence of admixed individuals in the sample. In this study, we present a new method to identify loci under selection based on an extension of the FST statistic to samples with admixed individuals. In our approach, FST values are computed from the ancestry coefficients obtained with ancestry estimation programs. More specifically, we used factor models to estimate FST, and we compared our neutrality tests with those derived from a principal component analysis approach. The performances of the tests were illustrated using simulated data and by re-analysing genomic data from European lines of the plant species Arabidopsis thaliana and human genomic data from the population reference sample, POPRES.


Subject(s)
Population Genetics/methods, Genomics/methods, Biological Adaptation/genetics, Arabidopsis/genetics, Computer Simulation, Gene Frequency, Genetic Loci, Human Genome, Humans, Genetic Models, Single Nucleotide Polymorphism, Genetic Selection
11.
BMC Bioinformatics ; 16: 242, 2015 Jul 31.
Article in English | MEDLINE | ID: mdl-26227424

ABSTRACT

BACKGROUND: In ecology and forensics, some population assignment techniques use molecular markers to assign individuals to known groups. However, assigning individuals to known populations can be difficult if the level of genetic differentiation among populations is small. Most assignment studies handle independent markers, often by pruning markers in Linkage Disequilibrium (LD), ignoring the information contained in the correlation among markers due to LD. RESULTS: To improve the accuracy of population assignment, we present an algorithm, implemented in the HaploPOP software, that combines markers into haplotypes, without requiring independence. The algorithm is based on the Gain of Informativeness for Assignment that provides a measure to decide if a pair of markers should be combined into haplotypes, or not, in order to improve assignment. Because complete exploration of all possible solutions for constructing haplotypes is computationally prohibitive, our approach uses a greedy algorithm based on windows of fixed sizes. We evaluate the performance of HaploPOP to assign individuals to populations using a split-validation approach. We investigate both simulated SNPs data and dense genotype data from individuals from Spain and Portugal. CONCLUSIONS: Our results show that constructing haplotypes with HaploPOP can substantially reduce assignment error. The HaploPOP software is freely available as a command-line software at www.ieg.uu.se/Jakobsson/software/HaploPOP/.


Subject(s)
Genomics, Software, Algorithms, Population Genetics, Genotype, Haplotypes, Humans, Internet, Linkage Disequilibrium, Single Nucleotide Polymorphism, Principal Component Analysis
12.
Mol Biol Evol ; 31(9): 2483-95, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24899666

ABSTRACT

There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as FST. However, there are important caveats with approaches related to FST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. In order to identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that it can achieve a 2-fold or more reduction of false discovery rate compared with the software BayeScan or with an FST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software.


Subject(s)
Genomics/methods, Single Nucleotide Polymorphism, Population/genetics, Software, Biological Adaptation, Bayes Theorem, Genetic Variation, Human Genome, Humans
13.
J Clin Microbiol ; 53(2): 389-97, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25411182

ABSTRACT

Despite the gain in sustained virological responses (SVR) provided by protease inhibitors (PIs), failures still occur. The aim of this study was to determine if a baseline analysis of the NS3 region using ultradeep pyrosequencing (UDPS) can help to predict an SVR. Serum samples from 40 patients with previously nonresponding genotype 1 chronic hepatitis C who were retreated with triple therapy, including a PI, were analyzed. Baseline UDPS of the NS3 gene was performed on plasma and peripheral blood mononuclear cells (PBMC). Mutations conferring resistance to PIs were sought. The overall diversity of the quasispecies was evaluated by calculating the Shannon entropy (SE). Resistance mutations were found in plasma and PBMC but were not discriminating enough to predict an SVR. NS3 quasispecies heterogeneity was significantly lower at baseline in patients achieving an SVR than in those not achieving an SVR (SE of 26.98 ± 16.64 × 10⁻³ versus 44.93 ± 19.58 × 10⁻³, P = 0.0047). With multivariate analysis, the independent predictors of an SVR were fibrosis of stage F ≤ 2 (odds ratio [OR], 13.3; 95% confidence interval [CI], 1.25 to 141.096; P < 0.03) and SE below the median (OR, 5.4; 95% CI, 1.22 to 23.87; P < 0.03). More than the presence of minor mutations at baseline in plasma or in PBMC, the NS3 viral heterogeneity determined by UDPS is an independent factor for an SVR in previously treated patients receiving triple therapy that includes a PI.
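
As an illustration of the diversity measure named in this abstract, the following Python sketch computes a Shannon entropy from read counts of distinct variants. The abstract does not state the logarithm base, any normalization, or whether entropy is computed per position or per region, so the natural logarithm, the lack of normalization and the function name are all assumptions.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of variant read counts.

    Assumption: counts are non-negative and at least one is positive.
    """
    total = sum(counts)
    entropy = 0.0
    for count in counts:
        if count > 0:  # a variant with zero reads contributes nothing
            p = count / total
            entropy -= p * math.log(p)
    return entropy
```

A population dominated by a single variant gives an entropy near 0, while an even mixture of variants gives the maximum for that number of variants, which matches the abstract's use of lower entropy as lower quasispecies heterogeneity.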


Subject(s)
Antiviral Agents/therapeutic use, Viral Drug Resistance, Chronic Hepatitis C/drug therapy, Chronic Hepatitis C/virology, High-Throughput Nucleotide Sequencing, Protease Inhibitors/therapeutic use, Viral Nonstructural Proteins/genetics, Adult, Aged, Aged 80 and over, Combination Drug Therapy/methods, Female, Genetic Variation, Chronic Hepatitis C/diagnosis, Humans, Male, Middle Aged, Missense Mutation, Prognosis, Salvage Therapy/methods, Young Adult
14.
Mol Biol Evol ; 30(3): 513-25, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23171862

ABSTRACT

Genetic differentiation among human populations is greatly influenced by geography due to the accumulation of local allele frequency differences. However, little is known about how the increase in genetic differentiation may differ along the different geographical axes (north-south, east-west, etc.). Here, we provide new methods to examine the asymmetrical patterns of genetic differentiation. We analyzed genome-wide polymorphism data from populations in Africa (n = 29), Asia (n = 26), America (n = 9), and Europe (n = 38), and we found that the major orientations of genetic differentiation are north-south in Europe and Africa, and east-west in Asia, but no preferential orientation was found in the Americas. Additionally, we showed that the localization of individual geographic origins based on single nucleotide polymorphism data was not equally precise along all orientations. Confirming our findings, we found that, in each continent, the orientation along which the precision is maximal corresponds to the orientation of maximum differentiation. Our results have implications for interpreting human genetic variation in terms of isolation by distance and spatial range expansion processes. In Europe, for instance, the precise north-northwest to south-southeast axis of main European differentiation cannot be explained by a simple Neolithic demic diffusion model without admixture with the local populations, because in that case the orientation of greatest differentiation should be perpendicular to the direction of expansion. In addition to humans, anisotropic analyses can guide the description of genetic differentiation for other organisms and provide information on expansions of invasive species or the processes of plant dispersal.


Subject(s)
Genetic Models, Africa, Algorithms, Americas, Asia, Computer Simulation, Europe, Gene Frequency, Human Genome, Humans, Phylogeography, Single Nucleotide Polymorphism, Principal Component Analysis
15.
Eur J Cancer ; 202: 113978, 2024 May.
Article in English | MEDLINE | ID: mdl-38471290

ABSTRACT

BACKGROUND: The PAOLA-1/ENGOT-ov25 trial showed that maintenance olaparib plus bevacizumab increases survival of advanced ovarian cancer patients with homologous recombination deficiency (HRD). However, decentralized solutions to test for HRD in clinical routine are scarce. The goal of this study was to retrospectively validate, on tumor samples from the PAOLA-1 trial, the decentralized SeqOne assay, which relies on shallow Whole Genome Sequencing (sWGS) to capture genomic instability and targeted sequencing to determine BRCA status. METHODS: The study comprised 368 patients from the PAOLA-1 trial. The SeqOne assay was compared to the Myriad MyChoice HRD test (Myriad Genetics), and results were analyzed with respect to Progression-Free Survival (PFS). RESULTS: We found a 95% concordance between the HRD status of the two tests (95% Confidence Interval (CI): 92%-97%). The Positive Percentage Agreement (PPA) of the sWGS test was 95% (95% CI: 91%-97%), as was its Negative Percentage Agreement (NPA) (95% CI: 89%-98%). In patients with HRD-positive tumors treated with olaparib plus bevacizumab, the PFS Hazard Ratio (HR) was 0.38 (95% CI: 0.26-0.54) with the SeqOne assay and 0.32 (95% CI: 0.22-0.45) with the Myriad assay. In patients with HRD-negative tumors, the HR was 0.99 (95% CI: 0.68-1.42) and 1.05 (95% CI: 0.70-1.57) with the SeqOne and Myriad assays, respectively. Among patients with BRCA-wildtype tumors, those with HRD-positive tumors benefited from olaparib plus bevacizumab maintenance, with an HR of 0.48 (95% CI: 0.29-0.79) and 0.38 (95% CI: 0.23-0.63) with the SeqOne and Myriad assays, respectively. CONCLUSION: The SeqOne assay offers a clinically validated approach to detect HRD.
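
The agreement measures reported in this abstract (concordance, PPA and NPA) follow from a 2x2 table comparing a test against a reference assay, as in the minimal Python sketch below. The function and argument names are hypothetical, and the counts in the usage lines are invented for illustration; they are not the trial's data.

```python
def agreement_metrics(both_pos, test_pos_ref_neg, test_neg_ref_pos, both_neg):
    """Concordance, positive and negative percentage agreement from 2x2 counts.

    The reference assay defines which samples count as positive or negative.
    """
    total = both_pos + test_pos_ref_neg + test_neg_ref_pos + both_neg
    return {
        # Share of all samples on which the two assays agree.
        "concordance": (both_pos + both_neg) / total,
        # Share of reference-positive samples that the test also calls positive.
        "ppa": both_pos / (both_pos + test_neg_ref_pos),
        # Share of reference-negative samples that the test also calls negative.
        "npa": both_neg / (both_neg + test_pos_ref_neg),
    }

# Hypothetical counts, for illustration only.
metrics = agreement_metrics(both_pos=90, test_pos_ref_neg=4,
                            test_neg_ref_pos=10, both_neg=96)
```

PPA and NPA are computed like sensitivity and specificity, but the wording "agreement" signals that the comparator is another assay rather than a ground truth.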


Subject(s)
Ovarian Neoplasms, Humans, Female, Bevacizumab/therapeutic use, Retrospective Studies, Ovarian Neoplasms/drug therapy, Ovarian Neoplasms/genetics, Ovarian Epithelial Carcinoma, Homologous Recombination
16.
Mol Biol Evol ; 29(7): 1851-60, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22319141

ABSTRACT

Based on the accumulation of genetic, climatic, and fossil evidence, a central theory in paleoanthropology stipulates that a demographic bottleneck coincided with the origin of our species, Homo sapiens. This theory proposes that anatomically modern humans, which were only present in Africa at the time, experienced a drastic bottleneck during the penultimate glacial age (130-190 kya) when a cold and dry climate prevailed. Two scenarios have been proposed to describe the bottleneck, which involve either a fragmentation of the range occupied by humans or the survival of one small group of humans. Here, we analyze DNA sequence data from 61 nuclear loci sequenced in three African populations using Approximate Bayesian Computation and numerical simulations. In contrast to the bottleneck theory, we show that a simple model without any bottleneck during the penultimate ice age has the greatest statistical support compared with bottleneck models. Although the proposed bottleneck is ancient, occurring at least 130 kya, we can rule out the possibility that it left no detectable footprint in the DNA sequence data, unless the bottleneck involved a less than 3-fold reduction in population size. Finally, we confirm that a simple model without a bottleneck is able to reproduce the main features of the observed patterns of genetic variation. We conclude that models of Pleistocene refugium for modern human origins now require substantial revision.


Subject(s)
Founder Effect, Genetic Models, Africa, Anthropology, Bayes Theorem, Climate, Medical Genetics, Humans, Paleontology, Population Density, DNA Sequence Analysis
17.
Trials ; 24(1): 380, 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37280655

ABSTRACT

Adjustment for prognostic covariates increases the statistical power of randomized trials. The factors influencing the increase of power are well-known for trials with continuous outcomes. Here, we study which factors influence power and sample size requirements in time-to-event trials. We consider both parametric simulations and simulations derived from the Cancer Genome Atlas (TCGA) cohort of hepatocellular carcinoma (HCC) patients to assess how sample size requirements are reduced with covariate adjustment. Simulations demonstrate that the benefit of covariate adjustment increases with the prognostic performance of the adjustment covariate (C-index) and with the cumulative incidence of the event in the trial. For a covariate that has an intermediate prognostic performance (C-index = 0.65), the reduction in sample size varies from 3.1% when the cumulative incidence is 10% to 29.1% when the cumulative incidence is 90%. Broadening eligibility criteria usually reduces statistical power, but our simulations show that power can be maintained with adequate covariate adjustment. In a simulation of adjuvant trials in HCC, we find that the number of patients screened for eligibility can be divided by 2.4 when broadening eligibility criteria. Last, we find that the Cox-Snell [Formula: see text] is a conservative estimate of the reduction in sample size requirements provided by covariate adjustment. Overall, more systematic adjustment for prognostic covariates leads to more efficient and inclusive clinical trials, especially when cumulative incidence is large, as in metastatic and advanced cancers. Code and results are available at https://github.com/owkin/CovadjustSim.


Subject(s)
Hepatocellular Carcinoma , Liver Neoplasms , Humans , Hepatocellular Carcinoma/genetics , Computer Simulation , Liver Neoplasms/therapy , Prognosis , Sample Size , Clinical Trials as Topic
18.
Mol Biol Evol ; 28(2): 889-98, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20930054

ABSTRACT

Two competing hypotheses are at the forefront of the debate on modern human origins. In the first scenario, known as the recent Out-of-Africa hypothesis, modern humans arose in Africa about 100,000-200,000 years ago and spread throughout the world by replacing the local archaic human populations. By contrast, the second hypothesis posits substantial gene flow between archaic and emerging modern humans. In the last two decades, the young time estimates (between 100,000 and 200,000 years) of the most recent common ancestors for the mitochondrion and the Y chromosome provided evidence in favor of a recent African origin of modern humans. However, the presence of very old lineages for autosomal and X-linked genes has often been claimed to be incompatible with a simple, single origin of modern humans. Through the analysis of a public DNA sequence database, we find, in line with previous estimates, that the common ancestors of autosomal and X-linked genes are indeed very old, living on average 1,500,000 and 1,000,000 years ago, respectively. However, contrary to previous conclusions, we find that these deep gene genealogies are consistent with the Out-of-Africa scenario provided that the ancestral effective population size was approximately 14,000 individuals. We show that an ancient bottleneck in the Middle Pleistocene, possibly arising from an ancestral structured population, can reconcile the contradictory findings from the mitochondrion on the one hand with those from the autosomes and the X chromosome on the other.
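
A back-of-the-envelope check on these figures can be made with the standard neutral coalescent, under which the expected time to the most recent common ancestor of n sampled sequences is 4Ne(1 - 1/n) generations for autosomes and, assuming an equal sex ratio, 3Ne(1 - 1/n) generations for the X chromosome. The sketch below plugs in the effective size of about 14,000 from the abstract; the 25-year generation time and the sample size of 100 are assumptions made here, not values from the abstract, and the constant-size model ignores the bottleneck the authors discuss.

```python
def expected_tmrca_years(ne, coalescent_scale, n_samples, generation_years):
    """Expected TMRCA in years under a constant-size neutral coalescent.

    coalescent_scale is 4 for autosomes and 3 for the X chromosome
    (the X value assumes an equal sex ratio).
    """
    generations = coalescent_scale * ne * (1 - 1 / n_samples)
    return generations * generation_years


NE = 14_000            # ancestral effective size quoted in the abstract
GENERATION_YEARS = 25  # assumption, not from the abstract
N_SAMPLES = 100        # assumption, not from the abstract

autosomal = expected_tmrca_years(NE, 4, N_SAMPLES, GENERATION_YEARS)
x_linked = expected_tmrca_years(NE, 3, N_SAMPLES, GENERATION_YEARS)
print(round(autosomal), round(x_linked))
```

Under these assumptions the expectations come out near 1.4 million and 1.0 million years, the same order of magnitude as the averages reported for autosomal and X-linked genes.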


Subject(s)
Biological Evolution , Population Genetics , Hominidae/genetics , Animals , Human X Chromosomes , Human Genome , Humans
19.
Hum Reprod ; 27(11): 3337-46, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22888167

ABSTRACT

STUDY QUESTION: Can we identify new sequence variants in the aurora kinase C gene (AURKC) of patients with macrozoospermia and establish a genotype-phenotype correlation?
SUMMARY ANSWER: We identified a new non-sense mutation, p.Y248*, that represents 13% of all mutant alleles. There was no difference in phenotype between individuals carrying this new mutation and those carrying the initially described main mutation, c.144delC.
WHAT IS KNOWN ALREADY: The absence of a functional AURKC gene causes primary infertility in men by blocking the first meiotic division and leading to the production of tetraploid large-headed spermatozoa. We previously demonstrated that most affected men were of North African origin and carried a homozygous truncating mutation (c.144delC).
STUDY DESIGN, SIZE, DURATION: This is a retrospective study carried out on patients consulting for infertility and described as having >5% large-headed spermatozoa. A total of 87 patients are presented here: 43 were published previously and 44 are new patients recruited between January 2008 and December 2011.
PARTICIPANTS/MATERIALS, SETTING, METHODS: All patients consulted for primary infertility in fertility clinics in France (n = 44), Tunisia (n = 30), Morocco (n = 9) or Algeria (n = 4). Sperm analysis was carried out in the recruiting fertility clinics and all molecular analyses were performed at Grenoble teaching hospital. DNA was extracted from blood or saliva and the seven AURKC exons were sequenced. RT-PCR was carried out on transcripts extracted from leukocytes from one patient homozygous for p.Y248*. Microsatellite analysis was performed on all p.Y248* patients to evaluate the age of this new mutation.
MAIN RESULTS AND THE ROLE OF CHANCE: We identified a new non-sense mutation, p.Y248*, in 10 unrelated individuals of European (n = 4) and North African origin (n = 6). We show that this new variant represents 13% of all mutant alleles and that the initially described c.144delC variant accounts for almost all of the remaining mutated alleles (85.5%). No mutated transcripts could be detected by RT-PCR, suggesting specific degradation of the mutant transcripts by non-sense mediated mRNA decay. A rare variant located in the 3' untranslated region was found to strictly co-segregate with p.Y248*, demonstrating a founder effect. Microsatellite analysis confirmed this linkage and allowed us to estimate a mutational age of between 925 and 1325 years, predating the c.144delC variant, which the same method predicts to have arisen 250-650 years ago. Patients with no identified AURKC mutation (n = 15) have significantly improved parameters in terms of vitality and concentration of normal spermatozoa, and a decreased rate of spermatozoa with a large head and multiple flagella (P < 0.001).
LIMITATIONS, REASONS FOR CAUTION: Despite adherence to the World Health Organization guidelines, large variations in the most characteristic sperm parameters were observed, even for patients with the same homozygous mutation. We believe this is mainly related to inter-laboratory variability in sperm parameter scoring. This prevented us from establishing clear-cut values to indicate a need for molecular analysis of patients with macrozoospermia.
WIDER IMPLICATIONS OF THE FINDINGS: This study confirms yet again the importance of AURKC mutations in the aetiology of macrozoospermia. Although a large majority of patients are of North African origin, we have now identified European patients carrying a new non-sense mutation, indicating that a diagnosis of absence of a functional AURKC gene should not be ruled out for non-Maghrebian individuals. Indirect evidence indicates that AURKC might play a role in the spindle assembly checkpoint (SAC) during meiosis. We postulate that heterozygous men might have a more relaxed SAC, leading to more abundant sperm production and a reproductive advantage. This could be the reason for the rapid accumulation of the two AURKC mutations we observe in North African individuals.
STUDY FUNDING/COMPETING INTEREST(S): None of the authors have any competing interest. This work is part of the project 'Identification and Characterization of Genes Involved in Infertility (ICG2I)' funded by the programme GENOPAT 2009 from the French Research Agency (ANR).


Subject(s)
Male Infertility/genetics , Mutation , Protein Serine-Threonine Kinases/genetics , Spermatozoa/abnormalities , Adult , Algeria , Aurora Kinase C , Aurora Kinases , Nonsense Codon , Cohort Studies , Genetic Crossing Over , Exons , Founder Effect , France , Genetic Association Studies , Humans , Male Infertility/metabolism , Male , Morocco , Pedigree , Protein Serine-Threonine Kinases/metabolism , Retrospective Studies , Sperm Head/pathology , Spermatozoa/pathology , Tunisia
20.
Biostatistics ; 11(4): 644-60, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20457785

ABSTRACT

Missing data is a recurrent issue in epidemiology, where the infection process may be only partially observed. Approximate Bayesian computation (ABC), an alternative to data imputation methods such as Markov chain Monte Carlo (MCMC) integration, is proposed for making inference in epidemiological models. It is a likelihood-free method that relies exclusively on numerical simulations. ABC consists of computing a distance between simulated and observed summary statistics and weighting the simulations according to this distance. We propose an original extension of ABC to path-valued summary statistics, corresponding to the cumulative number of detections as a function of time. For a standard compartmental model with Susceptible, Infectious and Recovered individuals (SIR), we show that the posterior distributions obtained with ABC and MCMC are similar. In a refined SIR model well suited to the HIV contact-tracing data in Cuba, we perform a comparison between ABC with full and binned detection times. For the Cuban data, we evaluate the efficiency of the detection system and predict the evolution of the HIV-AIDS disease. In particular, the percentage of undetected infectious individuals is found to be of the order of 40%.
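
The ABC recipe described above (simulate, measure a distance to the observed summary, weight by that distance) can be sketched on a toy epidemic. The example below is a minimal illustration under stated assumptions, not the authors' method: it uses a discrete-time chain-binomial SIR model rather than their continuous-time models, takes cumulative removals over time as a stand-in for the path of cumulative detections, assumes a Uniform(0, 2) prior on the transmission parameter, and weights accepted simulations with an Epanechnikov kernel. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_cumulative_removals(beta, removal_prob, population, initial_infected, steps, rng):
    """Discrete-time stochastic SIR; returns cumulative removals after each step."""
    s, i, removed = population - initial_infected, initial_infected, 0
    path = []
    for _ in range(steps):
        p_infect = 1 - math.exp(-beta * i / population)
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        new_removals = sum(rng.random() < removal_prob for _ in range(i))
        s -= new_infections
        i += new_infections - new_removals
        removed += new_removals
        path.append(removed)
    return path


def path_distance(path_a, path_b):
    """L1 distance between two cumulative-count paths of equal length."""
    return sum(abs(a - b) for a, b in zip(path_a, path_b))


def abc_posterior_mean(observed, n_sims, keep_fraction, rng, **model):
    """Kernel-weighted ABC estimate of beta under a Uniform(0, 2) prior."""
    draws = []
    for _ in range(n_sims):
        beta = rng.uniform(0.0, 2.0)
        simulated = simulate_cumulative_removals(beta=beta, rng=rng, **model)
        draws.append((path_distance(simulated, observed), beta))
    draws.sort()
    kept = draws[: max(1, int(keep_fraction * n_sims))]
    tolerance = kept[-1][0]  # largest distance among the kept simulations
    # Epanechnikov kernel: closer simulations receive larger weights.
    weights = [1 - (d / tolerance) ** 2 if tolerance > 0 else 1.0 for d, _ in kept]
    if sum(weights) == 0:  # degenerate case: every kept distance equals the tolerance
        weights = [1.0] * len(kept)
    return sum(w * b for w, (_, b) in zip(weights, kept)) / sum(weights)


rng = random.Random(0)
model = dict(removal_prob=0.3, population=200, initial_infected=5, steps=20)
# Synthetic "observed" path generated with a known beta, purely for illustration.
observed = simulate_cumulative_removals(beta=0.8, rng=rng, **model)
estimate = abc_posterior_mean(observed, n_sims=500, keep_fraction=0.1, rng=rng, **model)
print(estimate)
```

Setting the tolerance to a quantile of the simulated distances, rather than a fixed value, guarantees that some simulations are always retained; a fixed tolerance is an equally valid choice when the scale of the distance is known in advance.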


Subject(s)
Contact Tracing , HIV Infections/epidemiology , HIV Infections/transmission , Biological Models , Algorithms , Bayes Theorem , Computer Simulation , Cuba/epidemiology , Humans , Linear Models , Markov Chains , Monte Carlo Method , Nonlinear Dynamics , Stochastic Processes