RESUMO
The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identity by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related-e.g., paternal half-siblings-using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5%-100% of grandparent-grandchild (GP) pairs, 80.0%-97.5% of avuncular (AV) pairs, and 75.5%-98.5% of half-siblings (HS) pairs compared to PADRE's rates of 38.5%-76.0% of GP, 60.5%-92.0% of AV, 73.0%-95.0% of HS pairs. Turning to the real 20,032 sample Generation Scotland (GS) dataset, CREST identified seven pedigrees with incorrect relationship types or maternal/paternal parent sexes, five of which we confirmed as mistakes, and two with uncertain relationships. After correcting these, CREST correctly determines relationship types for 93.5% of GP, 97.7% of AV, and 92.2% of HS pairs that have sufficient mutual relative data; the parent sex in 100% of HS and 99.6% of GP pairs; and it completes this analysis in 2.8 h including IBD detection in eight threads.
Assuntos
Genoma Humano/genética , Feminino , Ligação Genética/genética , Genótipo , Humanos , Masculino , Modelos Genéticos , Linhagem , EscóciaRESUMO
Simulations of close relatives and identical by descent (IBD) segments are common in genetic studies, yet most past efforts have utilized sex averaged genetic maps and ignored crossover interference, thus omitting features known to affect the breakpoints of IBD segments. We developed Ped-sim, a method for simulating relatives that can utilize either sex-specific or sex averaged genetic maps and also either a model of crossover interference or the traditional Poisson model for inter-crossover distances. To characterize the impact of previously ignored mechanisms, we simulated data for all four combinations of these factors. We found that modeling crossover interference decreases the standard deviation of pairwise IBD proportions by 10.4% on average in full siblings through second cousins. By contrast, sex-specific maps increase this standard deviation by 4.2% on average, and also impact the number of segments relatives share. Most notably, using sex-specific maps, the number of segments half-siblings share is bimodal; and when combined with interference modeling, the probability that sixth cousins have non-zero IBD sharing ranges from 9.0 to 13.1%, depending on the sexes of the individuals through which they are related. We present new analytical results for the distributions of IBD segments under these models and show they match results from simulations. Finally, we compared IBD sharing rates between simulated and real relatives and find that the combination of sex-specific maps and interference modeling most accurately captures IBD rates in real data. Ped-sim is open source and available from https://github.com/williamslab/ped-sim.
Assuntos
Mapeamento Cromossômico/métodos , Simulação por Computador , Caracteres Sexuais , Feminino , Variação Genética , Genética Populacional , Genoma Humano , Humanos , Masculino , Modelos Genéticos , Linhagem , Distribuição de PoissonRESUMO
Multiple-objective optimization is common in biological systems. In the mammalian olfactory system, each sensory neuron stochastically expresses only one out of up to thousands of olfactory receptor (OR) gene alleles; at the organism level, the types of expressed ORs need to be maximized. Existing models focus only on monoallele activation, and cannot explain recent observations in mutants, especially the reduced global diversity of expressed ORs in G9a/GLP knockouts. In this work we integrated existing information on OR expression, and constructed a comprehensive model that has all its components based on physical interactions. Analyzing the model reveals an evolutionarily optimized three-layer regulation mechanism, which includes zonal segregation, epigenetic barrier crossing coupled to a negative feedback loop that mechanistically differs from previous theoretical proposals, and a previously unidentified enhancer competition step. This model not only recapitulates monoallelic OR expression, but also elucidates how the olfactory system maximizes and maintains the diversity of OR expression, and has multiple predictions validated by existing experimental results. Through making an analogy to a physical system with thermally activated barrier crossing and comparative reverse engineering analyses, the study reveals that the olfactory receptor selection system is optimally designed, and particularly underscores cooperativity and synergy as a general design principle for multiobjective optimization in biology.