ABSTRACT
The control of transcription is crucial for homeostasis in mammals. A previous selective sweep analysis of horse racing performance revealed a 19.6 kb candidate regulatory region 50 kb downstream of the Endothelin3 (EDN3) gene. Here, the region was narrowed to a 5.5 kb span of 14 SNVs, with elite and sub-elite haplotypes analyzed for association to racing performance, blood pressure and plasma levels of EDN3 in Coldblooded trotters and Standardbreds. Comparative analysis of human HiCap data identified the span as an enhancer cluster active in endothelial cells, interacting with genes relevant to blood pressure regulation. Coldblooded trotters with the sub-elite haplotype had significantly higher blood pressure compared to horses with the elite performing haplotype during exercise. Alleles within the elite haplotype were part of the standing variation in pre-domestication horses, and have risen in frequency during the era of breed development and selection. These results advance our understanding of the molecular genetics of athletic performance and vascular traits in both horses and humans.
Subjects
Athletic Performance , Blood Pressure , Haplotypes , Horses/genetics , Animals , Humans , Blood Pressure/genetics , Athletic Performance/physiology , Haplotypes/genetics , Endothelin-3/genetics , Polymorphism, Single Nucleotide , Alleles , Male , Endothelial Cells/metabolism
ABSTRACT
Chronic kidney disease (CKD) affects 10% of the human population, with only a small fraction genetically defined. CKD is also common in dogs and has been diagnosed in nearly all breeds, but its genetic basis remains unclear. Here, we performed a Bayesian mixed model genome-wide association analysis for canine CKD in a boxer population of 117 canine cases and 137 controls, and identified 21 genetic regions associated with the disease. At the top markers from each CKD region, the cases carried an average of 20.2 risk alleles, significantly more than controls (15.6 risk alleles). An ANOVA test showed that the 21 CKD regions together explained 57% of CKD phenotypic variation in the population. Based on whole genome sequencing data of 20 boxers, we identified 5,206 variants in LD with the top 50 BayesR markers. Following comparative analysis with human regulatory data, 17 putative regulatory variants were identified and tested with electrophoretic mobility shift assays. In total, four variants (three intronic variants from the MAGI2 and GALNT18 genes and one variant in an intergenic region on chr28) showed different binding ability for the risk and protective alleles in kidney cell lines. Many genes from the 21 CKD regions, including RELN, MAGI2 and FGFR2, have been implicated in human kidney development or disease. The results from this study provide new information that may shed light on the etiology of CKD in both dogs and humans.
Subjects
Genome-Wide Association Study , Renal Insufficiency, Chronic , Dogs , Humans , Animals , Bayes Theorem , Renal Insufficiency, Chronic/genetics , Renal Insufficiency, Chronic/veterinary , Renal Insufficiency, Chronic/epidemiology , Kidney , Alleles , Polymorphism, Single Nucleotide
ABSTRACT
With the generation of more than 100 sequenced vertebrate genomes in less than 25 years, the key question arises of how these resources can be used to inform new or ongoing projects. In the past, this diverse collection of sequences from human as well as model and non-model organisms has been used to annotate the human genome and to increase the understanding of human disease. In the future, comparative vertebrate genomics in conjunction with additional genomic resources will yield insights into the processes of genome function, evolution, speciation, selection and adaptation, as well as the quantification of species diversity. In this Review, we discuss how the genomics of non-human organisms can provide insights into vertebrate biology and how this can contribute to the understanding of human physiology and health.
Subjects
Genomics/methods , Vertebrates/genetics , Animals , Humans , Sequence Analysis, DNA
ABSTRACT
OBJECTIVE: To identify and characterize genetic loci associated with the risk of developing ANCA-associated vasculitides (AAV). METHODS: Genetic association analyses were performed after Illumina sequencing of 1853 genes and subsequent replication with genotyping of selected single nucleotide polymorphisms in a total cohort of 1110 Scandinavian cases with granulomatosis with polyangiitis or microscopic polyangiitis, and 1589 controls. A novel AAV-associated single nucleotide polymorphism was analysed for allele-specific effects on gene expression using a luciferase reporter assay. RESULTS: PR3-ANCA+ AAV was significantly associated with two independent loci in the HLA-DPB1/HLA-DPA1 region [rs1042335, P = 6.3 × 10⁻⁶¹, odds ratio (OR) 0.10; rs9277341, P = 1.5 × 10⁻⁴⁴, OR 0.22] and with rs28929474 in the SERPINA1 gene (P = 2.7 × 10⁻¹⁰, OR 2.9). MPO-ANCA+ AAV was significantly associated with the HLA-DQB1/HLA-DQA2 locus (rs9274619, P = 5.4 × 10⁻²⁵, OR 3.7) and with a rare variant in the BACH2 gene (rs78275221, P = 7.9 × 10⁻⁷, OR 3.0), the latter a novel susceptibility locus for MPO-ANCA+ granulomatosis with polyangiitis/microscopic polyangiitis. The rs78275221-A risk allele reduced luciferase gene expression specifically in endothelial cells, as compared with the non-risk allele. CONCLUSION: We identified a novel susceptibility locus for MPO-ANCA+ AAV and propose that the associated variant is of mechanistic importance, exerting a regulatory function on gene expression in specific cell types.
Subjects
Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis , Granulomatosis with Polyangiitis , Microscopic Polyangiitis , Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis/complications , Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis/genetics , Antibodies, Antineutrophil Cytoplasmic , Endothelial Cells , Granulomatosis with Polyangiitis/complications , Granulomatosis with Polyangiitis/genetics , Humans , Microscopic Polyangiitis/complications , Microscopic Polyangiitis/genetics , Myeloblastin/genetics , Peroxidase
ABSTRACT
OBJECTIVES: Clinical presentation of primary Sjögren's syndrome (pSS) varies considerably. A shortage of evidence-based objective markers hinders efficient drug development and most clinical trials have failed to reach primary endpoints. METHODS: We performed a multicentre study to identify patient subgroups based on clinical, immunological and genetic features. Targeted DNA sequencing of 1853 autoimmune-related loci was performed. After quality control, 918 patients with pSS, 1264 controls and 107 045 single nucleotide variants remained for analysis. Replication was performed in 177 patients with pSS and 7672 controls. RESULTS: We found strong signals of association with pSS in the HLA region. Principal component analysis of clinical data distinguished two patient subgroups defined by the presence of SSA/SSB antibodies. We observed an unprecedentedly high risk of pSS for an association at the HLA-DQA1 locus, with an odds ratio of 6.10 (95% CI: 4.93, 7.54; P = 2.2 × 10⁻⁶²) in the SSA/SSB-positive subgroup, whereas the association was absent in the antibody-negative group. Three independent signals within the MHC were observed. The two most significant variants, in MHC class I and II respectively, identified patients with a higher risk of hypergammaglobulinaemia, leukopenia, anaemia, purpura, major salivary gland swelling and lymphadenopathy. Replication confirmed the association with both MHC class I and II signals confined to SSA/SSB antibody-positive pSS. CONCLUSION: Two subgroups of patients with pSS with distinct clinical manifestations can be defined by the presence or absence of SSA/SSB antibodies and genetic markers in the HLA locus. These subgroups should be considered in clinical follow-up, drug development and trial outcomes, for the benefit of both subgroups.
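As a reading aid, the odds ratio, confidence interval and P value quoted in the abstract above can be checked against each other. The sketch below assumes a Wald-type normal approximation on the log odds ratio scale; the abstract does not state which test the authors used, so this is only a consistency check, and the function name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), the Wald z statistic and a two-sided p-value
    from an odds ratio and its 95% confidence interval.

    Assumption: the interval is symmetric on the log scale (Wald interval).
    """
    # Half the CI width on the log scale equals z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Figures quoted in the abstract: OR 6.10 (95% CI: 4.93, 7.54).
se, z, p = wald_from_ci(6.10, 4.93, 7.54)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Because the published interval is rounded and the study's actual test statistic may differ, the result should only be expected to fall near the order of magnitude of the reported P value, not to match it exactly.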
Subjects
Autoantibodies/blood , HLA-DQ alpha-Chains/genetics , Sjogren's Syndrome , Age of Onset , Autoimmunity/genetics , Correlation of Data , Female , Genetic Markers/genetics , Genetic Predisposition to Disease , Genetic Variation , Humans , Male , Middle Aged , Norway/epidemiology , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Sjogren's Syndrome/classification , Sjogren's Syndrome/genetics , Sjogren's Syndrome/immunology , Sjogren's Syndrome/physiopathology , Sweden/epidemiology
ABSTRACT
High-throughput genotyping and sequencing technologies facilitate studies of complex genetic traits and provide new research opportunities. The increasing popularity of genome-wide association studies (GWAS) leads to the discovery of new associated loci and a better understanding of the genetic architecture underlying not only diseases, but also other monogenic and complex phenotypes. Several software packages are available for performing GWAS analyses, the R environment being one of them. RESULTS: We present cgmisc, an R package that enables enhanced data analysis and visualization of results from GWAS. The package contains several utilities and modules that complement and enhance the functionality of the existing software. It also provides several tools for advanced visualization of genomic data and utilizes the power of the R language to aid in preparation of publication-quality figures. Some of the package functions are specific to domestic dog (Canis familiaris) data. AVAILABILITY AND IMPLEMENTATION: The package is operating system-independent and is available from: https://github.com/cgmisc-team/cgmisc. CONTACT: marcin.kierczak@imbim.uu.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Computer Graphics , Genome-Wide Association Study/methods , Genomics/methods , Software , Animals , Dogs , Genotype , Humans , Loss of Heterozygosity , Phenotype
ABSTRACT
Domestic animals are excellent models for genetic studies of phenotypic evolution. They have evolved genetic adaptations to a new environment, the farm, and have been subjected to strong human-driven selection leading to remarkable phenotypic changes in morphology, physiology and behaviour. Identifying the genetic changes underlying these developments provides new insight into general mechanisms by which genetic variation shapes phenotypic diversity. Here we describe the use of massively parallel sequencing to identify selective sweeps of favourable alleles and candidate mutations that have had a prominent role in the domestication of chickens (Gallus gallus domesticus) and their subsequent specialization into broiler (meat-producing) and layer (egg-producing) chickens. We have generated 44.5-fold coverage of the chicken genome using pools of genomic DNA representing eight different populations of domestic chickens as well as red jungle fowl (Gallus gallus), the major wild ancestor. We report more than 7,000,000 single nucleotide polymorphisms, almost 1,300 deletions and a number of putative selective sweeps. One of the most striking selective sweeps found in all domestic chickens occurred at the locus for thyroid stimulating hormone receptor (TSHR), which has a pivotal role in metabolic regulation and photoperiod control of reproduction in vertebrates. Several of the selective sweeps detected in broilers overlapped genes associated with growth, appetite and metabolic regulation. We found little evidence that selection for loss-of-function mutations had a prominent role in chicken domestication, but we detected two deletions in coding sequences that we suggest are functionally important. This study has direct application to animal breeding and enhances the importance of the domestic chicken as a model organism for biomedical research.
Subjects
Chickens/genetics , Genetic Loci/genetics , Genome/genetics , Selection, Genetic/genetics , Amino Acid Sequence , Animals , Biological Evolution , Female , Male , Molecular Sequence Data , Polymorphism, Single Nucleotide , Sequence Alignment , Sequence Analysis, DNA , Sequence Deletion
ABSTRACT
BACKGROUND: Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. The orthology relationship between copies across multiple genomes can be resolved unambiguously by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, whose number grows as N² with the number of available genomes, N. RESULTS: Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across species. CONCLUSIONS: Kraken is a computational genome coordinate translator that facilitates cross-species comparisons, distinguishes orthologs from paralogs, and does not require costly all-to-all whole genome mappings. Kraken is freely available under LGPL from http://github.com/nedaz/kraken.
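The scaling argument in the abstract above can be illustrated with a toy example. The sketch below is not Kraken's implementation: it only counts how many pairwise alignments an all-to-all design needs compared with a connected chain of alignments, and then translates a coordinate between two genomes that are not directly aligned by walking a graph of pairwise maps. The genome names, the fixed-offset "maps" and all function names are hypothetical simplifications chosen for illustration.

```python
from collections import deque

def all_to_all_alignments(n):
    """Unordered genome pairs that an all-to-all design must align."""
    return n * (n - 1) // 2

def chain_alignments(n):
    """Minimum pairwise alignments that still connect n genomes."""
    return n - 1

# Hypothetical toy synteny maps: each aligned pair shifts a coordinate
# by a fixed offset. Real maps are block-wise and far more complex.
TOY_MAPS = {
    ("human", "mouse"): 100,
    ("mouse", "rat"): -30,
}

def build_graph(maps):
    """Undirected graph in which each edge carries a forward or reverse offset."""
    graph = {}
    for (a, b), offset in maps.items():
        graph.setdefault(a, []).append((b, offset))
        graph.setdefault(b, []).append((a, -offset))
    return graph

def translate(graph, source, target, position):
    """Breadth-first walk that composes pairwise maps from source to target."""
    queue = deque([(source, position)])
    seen = {source}
    while queue:
        genome, pos = queue.popleft()
        if genome == target:
            return pos
        for neighbour, offset in graph.get(genome, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, pos + offset))
    return None  # no chain of alignments links the two genomes

graph = build_graph(TOY_MAPS)
print(all_to_all_alignments(12), chain_alignments(12))
print(translate(graph, "human", "rat", 1000))
```

For the 12 Drosophila genomes mentioned in the abstract, the all-to-all design needs 66 pairwise alignments whereas a connected chain needs 11, which is the quadratic-versus-linear contrast the authors describe. Here human and rat are linked only through mouse, so the translation is composed from two pairwise maps.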
Subjects
Genomics/methods , Software , Animals , Chromosome Mapping , Drosophila melanogaster/genetics , Evolution, Molecular , Genome/genetics , Humans , Mice , Molecular Sequence Annotation , Rats , Synteny/genetics , Transcription, Genetic
ABSTRACT
Hereditary periodic fever syndromes are characterized by recurrent episodes of fever and inflammation with no known pathogenic or autoimmune cause. In humans, several genes have been implicated in this group of diseases, but the majority of cases remain unexplained. A similar periodic fever syndrome is relatively frequent in the Chinese Shar-Pei breed of dogs. In the western world, Shar-Pei have been strongly selected for a distinctive thick and heavily folded skin. In this study, a mutation affecting both these traits was identified. Using genome-wide SNP analysis of Shar-Pei and other breeds, the strongest signal of a breed-specific selective sweep was located on chromosome 13. The same region also harbored the strongest genome-wide association (GWA) signal for susceptibility to the periodic fever syndrome (p(raw) = 2.3 × 10⁻⁶, p(genome) = 0.01). Dense targeted resequencing revealed two partially overlapping duplications, 14.3 Kb and 16.1 Kb in size, unique to Shar-Pei and upstream of the Hyaluronic Acid Synthase 2 (HAS2) gene. HAS2 encodes the rate-limiting enzyme synthesizing hyaluronan (HA), a major component of the skin. HA is up-regulated and accumulates in the thickened skin of Shar-Pei. A high copy number of the 16.1 Kb duplication was associated with an increased expression of HAS2 as well as the periodic fever syndrome (p < 0.0001). When fragmented, HA can act as a trigger of the innate immune system and stimulate sterile fever and inflammation. The strong selection for the skin phenotype therefore appears to enrich for a pleiotropic mutation predisposing these dogs to a periodic fever syndrome. The identification of HA as a major risk factor for this canine disease raises the potential of this glycosaminoglycan as a risk factor for human periodic fevers and as an important driver of chronic inflammation.
Subjects
Dog Diseases/genetics , Dogs/genetics , Fever/veterinary , Gene Duplication/genetics , Glucuronosyltransferase/genetics , Phenotype , Skin , Animals , Breeding , Dog Diseases/pathology , Fever/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Glucuronosyltransferase/metabolism , Hyaluronic Acid/genetics , Hyaluronic Acid/metabolism , Polymorphism, Single Nucleotide , Risk Factors , Skin/enzymology , Skin/pathology , Syndrome
ABSTRACT
BACKGROUND: The presence of mitochondrial sequences in the nuclear genome (Numts) confounds analyses of mitochondrial sequence variation, and is a potential source of false positives in disease studies. To improve the analysis of mitochondrial variation in canines, we completed a systematic assessment of Numt content across genome assemblies, canine populations and the carnivore lineage. RESULTS: Centering our analysis on the UU_Cfam_GSD_1.0/canFam4/Mischka assembly, a commonly used reference in dog genetic variation studies, we found a total of 321 Numts located throughout the nuclear genome and encompassing the entire sequence of the mitochondria. A comparison with 14 canine genome assemblies identified 63 Numts with presence-absence dimorphism among dogs, wolves, and a coyote. Furthermore, a subset of Numts were maintained across carnivore evolutionary time (arctic fox, polar bear, cat), with eight sequences likely more than 10 million years old, and shared with the domestic cat. On a population level, using structural variant data from the Dog10K Consortium for 1879 dogs and wolves, we identified 11 Numts that are absent in at least one sample, as well as 53 Numts that are absent from the Mischka assembly. CONCLUSIONS: We highlight scenarios where the presence of Numts is a potentially confounding factor and provide an annotation of these sequences in canine genome assemblies. This resource will aid the identification and interpretation of polymorphisms in both somatic and germline mitochondrial studies in canines.
Subjects
Cell Nucleus , Animals , Dogs/genetics , Cell Nucleus/genetics , Genome/genetics , Genome, Mitochondrial , DNA, Mitochondrial/genetics , Wolves/genetics , Mitochondria/genetics , Genetic Variation
ABSTRACT
Elite exercise performance requires intricate modulation of blood pressure to supply the working muscles with oxygen. We have previously identified a genomic regulatory module that associates with differences in blood pressure of importance for elite performance in racehorses. This study aimed to determine the effect of the regulatory module on the protein repertoire. We sampled plasma from 12 Coldblooded trotters divided into two endothelial regulatory module haplotype groups, a sub-elite performing haplotype (SPH) and an elite performing haplotype (EPH), each at rest and exercise. The haplotype groups and their interaction were interrogated in two analyses: i) individual paired ratio analysis for identifying differentially abundant proteins of exercise (DAPE) and interaction (DAPI) between haplotype and exercise, and ii) unpaired ratio analysis for identifying differentially abundant proteins of haplotype (DAPH). The proteomics analyses revealed a widespread change in plasma protein content during exercise, with a tendency towards decreased abundance of proteins mainly related to lung function, tissue fluids, metabolism, calcium ion pathway and cellular energy metabolism. Furthermore, we provide the first investigation of the proteome variation due to the interaction between exercise and related blood pressure haplotypes; this difference was related to a faster switch to lipoprotein and lipid metabolism during exercise for EPH. The molecular signatures identified in the present study contribute to an improved understanding of exercise-related blood pressure regulation.
ABSTRACT
OBJECTIVE: The antineutrophil cytoplasmic antibody (ANCA)-associated vasculitides (AAV) are inflammatory disorders with ANCA autoantibodies recognising either proteinase 3 (PR3-AAV) or myeloperoxidase (MPO-AAV). PR3-AAV and MPO-AAV have been associated with distinct loci in the human leucocyte antigen (HLA) region. While the association between MPO-AAV and HLA has been well characterised in East Asian populations where MPO-AAV is more common, studies in populations of European descent are limited. The aim of this study was to thoroughly characterise associations to the HLA region in Scandinavian patients with PR3-AAV as well as MPO-AAV. METHODS: Genotypes of single-nucleotide polymorphisms (SNPs) located in the HLA region were extracted from a targeted exome-sequencing dataset comprising Scandinavian AAV cases and controls. Classical HLA alleles were called using xHLA. After quality control, association analyses were performed on a joint SNP/classical HLA allele dataset for cases with PR3-AAV (n=411) and MPO-AAV (n=162) versus controls (n=1595). Disease-associated genetic variants were analysed for association with organ involvement, age at diagnosis and relapse, respectively. RESULTS: PR3-AAV was significantly associated with both HLA-DPB1*04:01 and rs1042335 at the HLA-DPB1 locus, also after stepwise conditional analysis. MPO-AAV was significantly associated with HLA-DRB1*04:04. Neither carriage of HLA-DPB1*04:01 alleles in PR3-AAV nor of HLA-DRB1*04:04 alleles in MPO-AAV was associated with organ involvement, age at diagnosis or relapse. CONCLUSIONS: The association to the HLA region was distinct in Scandinavian cases with MPO-AAV compared with cases of East Asian descent. In PR3-AAV, the two separate signals of association to the HLA-DPB1 region mediate potentially different functional effects.
Subjects
Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis , Antibodies, Antineutrophil Cytoplasmic , Humans , Antibodies, Antineutrophil Cytoplasmic/genetics , Anti-Neutrophil Cytoplasmic Antibody-Associated Vasculitis/genetics , Myeloblastin/genetics , Genotype , Recurrence
ABSTRACT
OBJECTIVE: Autoantibodies against the adrenal enzyme 21-hydroxylase are a hallmark manifestation of autoimmune Addison's disease (AAD). Steroid 21-hydroxylase is encoded by CYP21A2, which is located in the human leucocyte antigen (HLA) region together with the highly similar pseudogene CYP21A1P. A high level of copy number variation is seen for the 2 genes, and therefore, we asked whether genetic variation of the CYP21 genes is associated with AAD. DESIGN: Case-control study on patients with AAD and healthy controls. METHODS: Using next-generation DNA sequencing, we estimated the copy number of CYP21A2 and CYP21A1P, together with HLA alleles, in 479 Swedish patients with AAD and autoantibodies against 21-hydroxylase and in 1393 healthy controls. RESULTS: With 95% of individuals carrying 2 functional 21-hydroxylase genes, no difference in CYP21A2 copy number was found when comparing patients and controls. In contrast, we discovered a lower copy number of the pseudogene CYP21A1P among AAD patients (P = 5 × 10⁻⁴⁴), together with associations of additional nucleotide variants, in the CYP21 region. However, the strongest association was found for HLA-DQB1*02:01 (P = 9 × 10⁻⁶³), which, in combination with the DRB1*04:04-DQB1*03:02 haplotype, imposed the greatest risk of AAD. CONCLUSIONS: We identified strong associations between copy number variants in the CYP21 region and risk of AAD, although these associations are most likely due to linkage disequilibrium with disease-associated HLA class II alleles.
Subjects
Addison Disease , Humans , Addison Disease/genetics , Steroid 21-Hydroxylase/genetics , DNA Copy Number Variations/genetics , Case-Control Studies , Sweden/epidemiology , Autoantibodies
ABSTRACT
Although thousands of genomic regions have been associated with heritable human diseases, attempts to elucidate biological mechanisms are impeded by a general inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function that is agnostic to cell type or disease mechanism. Here, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional. We compared these scores to large-scale genome annotation, genome-wide association studies (GWAS), copy number variation, clinical genetics findings, and cancer data sets. Evolutionarily constrained positions are enriched for variants explaining common disease heritability (more than any other functional annotation). Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
ABSTRACT
Thousands of genomic regions have been associated with heritable human diseases, but attempts to elucidate biological mechanisms are impeded by an inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function, agnostic to cell type or disease mechanism. Single-base phyloP scores from 240 mammals identified 3.3% of the human genome as significantly constrained and likely functional. We compared phyloP scores to genome annotation, association studies, copy-number variation, clinical genetics findings, and cancer data. Constrained positions are enriched for variants that explain common disease heritability more than other functional annotations. Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
Subjects
Disease , Genetic Variation , Animals , Humans , Biological Evolution , Genome, Human , Genome-Wide Association Study , Genomics , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Disease/genetics
ABSTRACT
Zoonomia is the largest comparative genomics resource for mammals produced to date. By aligning genomes for 240 species, we identify bases that, when mutated, are likely to affect fitness and alter disease risk. At least 332 million bases (~10.7%) in the human genome are unusually conserved across species (evolutionarily constrained) relative to neutrally evolving repeats, and 4552 ultraconserved elements are nearly perfectly conserved. Of 101 million significantly constrained single bases, 80% are outside protein-coding exons and half have no functional annotations in the Encyclopedia of DNA Elements (ENCODE) resource. Changes in genes and regulatory elements are associated with exceptional mammalian traits, such as hibernation, that could inform therapeutic development. Earth's vast and imperiled biodiversity offers distinctive power for identifying genetic variants that affect genome function and organismal phenotypes.
Subjects
Eutheria , Evolution, Molecular , Animals , Female , Humans , Conserved Sequence/genetics , Eutheria/genetics , Genome, Human
ABSTRACT
BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20 × data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. RESULTS: We report the analysis of > 48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. CONCLUSIONS: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.
Subjects
Wolves , Dogs , Animals , Wolves/genetics , Chromosome Mapping , Alleles , Polymorphism, Single Nucleotide , Nucleotides , Demography
ABSTRACT
Insect bite hypersensitivity (IBH) is a chronic allergic dermatitis common in horses. Affected horses mainly react against antigens present in the saliva of biting midges, Culicoides spp., and occasionally black flies, Simulium spp. Because of this insect dependency, the disease is clearly seasonal and prevalence varies between geographical locations. For two distinct horse breeds, we genotyped four microsatellite markers positioned within the MHC class II region and sequenced the highly polymorphic exon 2 of DRA and DRB3, respectively. Initially, 94 IBH-affected and 93 unaffected Swedish-born Icelandic horses were tested for genetic association. These horses had previously been genotyped on the Illumina Equine SNP50 BeadChip, which made it possible to ensure that our study did not suffer from the effects of stratification. The second population consisted of 106 unaffected and 80 IBH-affected Exmoor ponies. We show that variants in the MHC class II region are associated with disease susceptibility (p(raw) = 2.34 × 10⁻⁵), with the same allele (COR112:274) associated in two separate populations. In addition, we combined microsatellite and sequencing data in order to investigate the pattern of homozygosity and show that homozygosity across the entire MHC class II region is associated with a higher risk of developing IBH (p = 0.0013). To our knowledge, this is the first time that the same risk allele has been identified in two distinct populations in any species suffering from atopic dermatitis, including man.
Subjects
Ceratopogonidae/immunology , Dermatitis, Atopic/veterinary , Genes, MHC Class II , Horse Diseases/genetics , Insect Bites and Stings/veterinary , Animals , Dermatitis, Atopic/genetics , Dermatitis, Atopic/immunology , Genotype , Horse Diseases/immunology , Horses , Insect Bites and Stings/genetics , Insect Bites and Stings/immunology , Microsatellite Repeats , Polymorphism, Single Nucleotide , Risk Factors
ABSTRACT
Pea-comb is a dominant mutation in chickens that drastically reduces the size of the comb and wattles. It is an adaptive trait in cold climates as it reduces heat loss and makes the chicken less susceptible to frost lesions. Here we report that Pea-comb is caused by a massive amplification of a duplicated sequence located near evolutionarily conserved non-coding sequences in intron 1 of the gene encoding the SOX5 transcription factor. This must be the causative mutation since all other polymorphisms associated with the Pea-comb allele were excluded by genetic analysis. SOX5 controls cell fate and differentiation and is essential for skeletal development, chondrocyte differentiation, and extracellular matrix production. Immunostaining in early embryos demonstrated that Pea-comb is associated with ectopic expression of SOX5 in mesenchymal cells located just beneath the surface ectoderm where the comb and wattles will subsequently develop. The results imply that the duplication expansion interferes with the regulation of SOX5 expression during the differentiation of cells crucial for the development of comb and wattles. The study provides novel insight into the nature of mutations that contribute to phenotypic evolution and is the first description of a spontaneous and fully viable mutation in this developmentally important gene.
Subjects
Chickens/genetics , Comb and Wattles/growth & development , Gene Dosage , Introns , Mutation , SOXD Transcription Factors/genetics , Animals , Cell Differentiation , Chickens/growth & development , Chickens/metabolism , Chromosome Mapping , Comb and Wattles/metabolism , Female , Gene Expression Regulation, Developmental , Genetic Variation , Male , Molecular Sequence Data , Phenotype , SOXD Transcription Factors/metabolism
ABSTRACT
Transcriptomic analyses are commonly used to identify differentially expressed genes between patients and controls, or within individuals across disease courses. These methods, whilst effective, cannot capture the combinatorial effects of genes driving disease. We applied rule-based machine learning (RBML) models and rule networks (RN) to an existing paediatric Systemic Lupus Erythematosus (SLE) blood expression dataset, with the goal of developing gene networks to separate low and high disease activity (DA1 and DA3). The resultant model distinguished DA1 from DA3 with 81% accuracy, with unsupervised hierarchical clustering revealing additional subgroups indicative of the immune axis involved or state of disease flare. These subgroups correlated with clinical variables, suggesting that the gene sets identified may further the understanding of gene networks that act in concert to drive disease progression. This included roles for genes (i) induced by interferons (IFI35 and OTOF), (ii) key to SLE cell types (KLRB1 encoding CD161), or (iii) with roles in autophagy and NF-κB pathway responses (CKAP4). As demonstrated here, RBML approaches have the potential to reveal novel gene patterns from within a heterogeneous disease, facilitating patient clinical and therapeutic stratification.