ABSTRACT
We carried out whole-genome resequencing of 127 chickens, including red jungle fowl and multiple populations of commercial broilers and layers, to perform a systematic screening of adaptive changes in modern chicken (Gallus gallus domesticus). We uncovered >21 million high-quality SNPs, of which 34% are newly detected variants. This panel comprises >115,000 predicted amino-acid altering substitutions as well as 1,100 SNPs predicted to be stop-gain or stop-loss, several of which reach high frequencies. Signatures of selection were investigated through analyses of both fixation and differentiation to reveal selective sweeps that may have had prominent roles during domestication and breed development. Contrasting wild and domestic chickens, we confirmed selection at the BCO2 and TSHR loci and identified 34 putative sweeps co-localized with ALX1, KITLG, EPGR, IGF1, DLK1, JPT2, CRAMP1, and GLI3, among others. Analysis of enrichment between groups of wild vs. commercial chickens and broilers vs. layers revealed a further panel of candidate genes, including CORIN and SKIV2L2, implicated in pigmentation, and LEPR, MEGF10 and SPEF2, suggestive of production-oriented selection. SNPs with marked allele frequency differences between wild and domestic chickens showed a highly significant deficiency in the proportion of amino-acid altering mutations (P < 2.5×10⁻⁶). The results contribute to the understanding of major genetic changes that took place during the evolution of modern chickens and in poultry breeding.
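As an illustration of what "differentiation" between two groups can mean at a single SNP, the following minimal sketch computes the absolute allele-frequency difference and a simple two-population F_ST with equal group weights. The abstract does not state which statistics or software were used, so the function names and the choice of estimator here are assumptions made for illustration only.

```python
def delta_allele_frequency(p1: float, p2: float) -> float:
    """Absolute difference in allele frequency between two groups."""
    return abs(p1 - p2)


def fst_two_groups(p1: float, p2: float) -> float:
    """Simple per-SNP F_ST for a biallelic site in two equally weighted groups.

    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of
    the pooled groups and H_S is the mean within-group heterozygosity.
    This is an illustrative estimator, not the one used in the study.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # Site is monomorphic across both groups: no differentiation.
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies of one allele in "wild" and "domestic" groups.
    print(delta_allele_frequency(0.1, 0.9))
    print(fst_two_groups(0.1, 0.9))
```

A site fixed for different alleles in the two groups gives the maximum value of 1, while identical frequencies give 0; real scans typically average such values over windows of many SNPs.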
Subjects
Biological Adaptation , Chickens/genetics , Genome , Genomics , Alleles , Animals , Computational Biology/methods , Gene Frequency , Genetic Variation , Genomics/methods , Molecular Sequence Annotation , Single Nucleotide Polymorphism
ABSTRACT
Darwin's finches, inhabiting the Galápagos archipelago and Cocos Island, constitute an iconic model for studies of speciation and adaptive evolution. Here we report the results of whole-genome re-sequencing of 120 individuals representing all of the Darwin's finch species and two close relatives. Phylogenetic analysis reveals important discrepancies with the phenotype-based taxonomy. We find extensive evidence for interspecific gene flow throughout the radiation. Hybridization has given rise to species of mixed ancestry. A 240 kilobase haplotype encompassing the ALX1 gene that encodes a transcription factor affecting craniofacial development is strongly associated with beak shape diversity across Darwin's finch species as well as within the medium ground finch (Geospiza fortis), a species that has undergone rapid evolution of beak shape in response to environmental changes. The ALX1 haplotype has contributed to diversification of beak shapes among the Darwin's finches and, thereby, to an expanded utilization of food resources.
Subjects
Beak/anatomy & histology , Molecular Evolution , Finches/anatomy & histology , Finches/genetics , Animals , Avian Proteins/genetics , Avian Proteins/metabolism , Ecuador , Female , Finches/classification , Finches/embryology , Gene Flow , Genome/genetics , Haplotypes/genetics , Genetic Hybridization , Indian Ocean Islands , Male , Molecular Sequence Data , Phylogeny , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Copy number variation (CNV) plays an important role in human genetic diversity and has been associated with multiple complex disorders. Here we investigate a CNV on chromosome 10q11.22 that spans NPY4R, the gene for the appetite-regulating pancreatic polypeptide receptor Y4. This genomic region has been challenging to map due to multiple repeated elements, and its precise organization has not yet been resolved. Previous studies using microarrays were interpreted to show that the most common copy number was 2 per genome. RESULTS: We have investigated 18 individuals from the 1000 Genomes project using the well-established method of read depth analysis and the new droplet digital PCR (ddPCR) method. We find that the most common copy number for NPY4R is 4. The estimated number of copies ranged from three to seven based on read depth analyses with Control-FREEC and CNVnator, and from four to seven based on ddPCR. We suggest that the difference between our results and those published previously can be explained by methodological differences such as reference gene choice, data normalization and method reliability. Three high-quality archaic human genomes (two Neanderthal and one Denisovan) display four copies of the NPY4R gene, indicating that a duplication occurred prior to the human-Neanderthal/Denisovan split. CONCLUSIONS: We conclude that ddPCR is a sensitive and reliable method for CNV determination, that it can be used for read depth calibration in CNV studies based on already available whole-genome sequencing data, and that further investigation of NPY4R copy number variation and its consequences is necessary due to the role of the Y4 receptor in food intake regulation.
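The arithmetic behind both read depth analysis and ddPCR copy number estimates can be sketched as a ratio against a reference region assumed to be present in a known number of copies. The sketch below is a simplified illustration only: it assumes a two-copy (diploid) reference and ignores the GC-content and mappability corrections that tools such as Control-FREEC and CNVnator apply. The function name and the example values are hypothetical.

```python
def copy_number_from_ratio(target_signal: float,
                           reference_signal: float,
                           reference_copies: int = 2) -> float:
    """Estimate copy number per genome from a target/reference signal ratio.

    For read depth data the signals are mean coverages; for ddPCR they are
    measured concentrations of target and reference amplicons. The
    reference is assumed to be present in `reference_copies` copies.
    """
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    if target_signal < 0:
        raise ValueError("target signal must be non-negative")
    return reference_copies * target_signal / reference_signal


if __name__ == "__main__":
    # Hypothetical example: target region covered at 60x, reference at 30x.
    estimate = copy_number_from_ratio(60.0, 30.0)
    print(estimate, round(estimate))
```

The choice of reference matters: if the reference region itself varies in copy number, every estimate is scaled accordingly, which is consistent with the abstract's point that reference gene choice can explain differences between studies.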
Subjects
DNA Copy Number Variations/genetics , Gene Dosage , Polymerase Chain Reaction/methods , Neuropeptide Y Receptors/genetics , DNA Sequence Analysis/methods , Human Genome/genetics , Genomics/methods , Humans , Reproducibility of Results
ABSTRACT
The domestication of dogs was an important episode in the development of human civilization. The precise timing and location of this event are debated, and little is known about the genetic changes that accompanied the transformation of ancient wolves into domestic dogs. Here we conduct whole-genome resequencing of dogs and wolves to identify 3.8 million genetic variants, which we use to identify 36 genomic regions that probably represent targets for selection during dog domestication. Nineteen of these regions contain genes important in brain function, eight of which belong to nervous system development pathways and potentially underlie behavioural changes central to dog domestication. Ten genes with key roles in starch digestion and fat metabolism also show signals of selection. We identify candidate mutations in key genes and provide functional support for an increased starch digestion in dogs relative to wolves. Our results indicate that novel adaptations allowing the early ancestors of modern dogs to thrive on a diet rich in starch, relative to the carnivorous diet of wolves, constituted a crucial step in the early domestication of dogs.
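One way to look for regions under selection in pooled resequencing data is to scan windows for reduced heterozygosity. The sketch below computes pooled heterozygosity for one window from per-SNP allele read counts. The abstract does not say which statistic was used, so this is an assumed example of a sweep statistic rather than a description of the study's method, and the function name and counts are hypothetical.

```python
from typing import Iterable, Tuple


def pooled_heterozygosity(allele_counts: Iterable[Tuple[int, int]]) -> float:
    """Pooled heterozygosity for one genomic window.

    `allele_counts` holds one (count_a, count_b) pair of read counts per
    SNP. With n_maj and n_min the summed counts of the more and less
    frequent allele at each SNP:

        Hp = 2 * n_maj * n_min / (n_maj + n_min) ** 2

    Values near 0 indicate that one haplotype dominates the window.
    """
    sum_major = 0
    sum_minor = 0
    for count_a, count_b in allele_counts:
        if count_a < 0 or count_b < 0:
            raise ValueError("allele counts must be non-negative")
        sum_major += max(count_a, count_b)
        sum_minor += min(count_a, count_b)
    total = sum_major + sum_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * sum_major * sum_minor / total ** 2


if __name__ == "__main__":
    # Hypothetical window with two SNPs, both close to fixation.
    print(pooled_heterozygosity([(9, 1), (8, 2)]))
```

In practice such values are usually standardized across all windows in the genome, and only strongly negative outliers are treated as candidate sweeps.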
Subjects
Domestic Animals/genetics , Diet/veterinary , Dogs/genetics , Genome/genetics , Starch , Amylases/genetics , Animals , Glycogen Storage Disease Type II , Mutation , Wolves/genetics , alpha-Glucosidases/genetics
ABSTRACT
Domestication of wild boar (Sus scrofa) and subsequent selection have resulted in dramatic phenotypic changes in domestic pigs for a number of traits, including behavior, body composition, reproduction, and coat color. Here we have used whole-genome resequencing to reveal some of the loci that underlie phenotypic evolution in European domestic pigs. Selective sweep analyses revealed strong signatures of selection at three loci harboring quantitative trait loci that explain a considerable part of one of the most characteristic morphological changes in the domestic pig: the elongation of the back and an increased number of vertebrae. The three loci were associated with the NR6A1, PLAG1, and LCORL genes. The latter two have repeatedly been associated with loci controlling stature in other domestic animals and in humans. Most European domestic pigs are homozygous for the same haplotype at these three loci. We found an excess of derived nonsynonymous substitutions in domestic pigs, most likely reflecting both positive selection and relaxed purifying selection after domestication. Our analysis of structural variation revealed four duplications at the KIT locus that were exclusively present in white or white-spotted pigs, carrying the Dominant white, Patch, or Belt alleles. This discovery illustrates how structural changes have contributed to rapid phenotypic evolution in domestic animals and how alleles in domestic animals may evolve by the accumulation of multiple causative mutations as a response to strong directional selection.
Subjects
Domestic Animals/genetics , Genome , Genetic Selection , Swine/genetics , Amino Acid Sequence , Animals , DNA Copy Number Variations , Homozygote , Molecular Sequence Data , Quantitative Trait Loci , Amino Acid Sequence Homology
ABSTRACT
Introduction: The suitability of whole-genome sequencing (WGS) as the sole method to detect clinically relevant genomic aberrations in B-cell acute lymphoblastic leukemia (ALL) was investigated with the aim of replacing current diagnostic methods. Methods: For this purpose, we assessed the analytical performance of 150 bp paired-end WGS (90x leukemia/30x germline). A set of 88 retrospective B-cell ALL samples was selected to represent established ALL subgroups as well as ALL lacking stratifying markers by standard-of-care (SoC), so-called B-other ALL. Results: Analysis of both paired leukemia/germline (L/N; n=64) and leukemia-only (L-only; n=88) samples detected all types of aberrations mandatory in the current ALLTogether trial protocol, i.e., aneuploidies, structural variants, and focal copy-number aberrations. Moreover, comparison to SoC revealed 100% concordance and showed that all patients had been assigned to the correct genetic subgroup using both approaches. Notably, WGS could allocate 35 out of 39 B-other ALL samples to one of the emerging genetic subgroups considered in the most recent classifications of ALL. We further investigated the impact of high (90x; n=58) vs low (30x; n=30) coverage on the diagnostic yield and observed an equally perfect concordance with SoC; low coverage detected all relevant lesions. Discussion: Filtering the WGS findings with a short list of genes recurrently rearranged in ALL was instrumental in extracting the clinically relevant information efficiently. Nonetheless, the detection of DUX4 rearrangements required an additional customized analysis, due to multiple copies of this gene embedded in the highly repetitive D4Z4 region. We conclude that the diagnostic performance of WGS as the standalone method was remarkable and allowed detection of all clinically relevant genomic events in the diagnostic setting of B-cell ALL.
ABSTRACT
PURPOSE: Several studies have indicated that broad genomic characterization of childhood cancer provides diagnostically and/or therapeutically relevant information in selected high-risk cases. However, the extent to which such characterization offers clinically actionable data in a prospective broadly inclusive setting remains largely unexplored. METHODS: We implemented prospective whole-genome sequencing (WGS) of tumor and germline, complemented by whole-transcriptome sequencing (RNA-Seq) for all children diagnosed with a primary or relapsed solid malignancy in Sweden. Multidisciplinary molecular tumor boards were set up to integrate genomic data in the clinical decision process along with a medicolegal framework enabling secondary use of sequencing data for research purposes. RESULTS: During the study's first 14 months, 118 solid tumors from 117 patients were subjected to WGS, with complementary RNA-Seq for fusion gene detection in 52 tumors. There was no significant geographic bias in patient enrollment, and the included tumor types reflected the annual national incidence of pediatric solid tumor types. Of the 112 tumors with somatic mutations, 106 (95%) exhibited alterations with a clear clinical correlation. In 46 of 118 tumors (39%), sequencing only corroborated histopathological diagnoses, while in 59 cases (50%), it contributed to additional subclassification or detection of prognostic markers. Potential treatment targets were found in 31 patients (26%), most commonly ALK mutations/fusions (n = 4), RAS/RAF/MEK/ERK pathway mutations (n = 14), FGFR1 mutations/fusions (n = 5), IDH1 mutations (n = 2), and NTRK2 gene fusions (n = 2). In one patient, the tumor diagnosis was revised based on sequencing. Clinically relevant germline variants were detected in 8 of 94 patients (8.5%). 
CONCLUSION: Up-front, large-scale genomic characterization of pediatric solid malignancies provides diagnostically valuable data in the majority of patients, even in a largely unselected cohort.
Subjects
Carcinoma , Precision Medicine , Humans , Child , Local Neoplasm Recurrence , Gene Fusion , Genomics
ABSTRACT
Autism spectrum disorder (ASD) is a heterogeneous neuropsychiatric disorder with a complex genetic background. Analysis of altered molecular processes in ASD patients requires linear and nonlinear methods that provide interpretable solutions. Interpretable machine learning provides legible models that allow explaining biological mechanisms and support analysis of clinical subgroups. In this work, we investigated several case-control studies of gene expression measurements of ASD individuals. We constructed a rule-based learning model from three independent datasets that we further visualized as a nonlinear gene-gene co-predictive network. To find dissimilarities between ASD subtypes, we scrutinized a topological structure of the network and estimated a centrality distance. Our analysis revealed that autism is the most severe subtype of ASD, while pervasive developmental disorder-not otherwise specified and Asperger syndrome are closely related and milder ASD subtypes. Furthermore, we analyzed the most important ASD-related features that were described in terms of gene co-predictors. Among others, we found a strong co-predictive mechanism between EMC4 and TMEM30A, which may suggest a co-regulation between these genes. The present study demonstrates the potential of applying interpretable machine learning in bioinformatics analyses. Although the proposed methodology was designed for transcriptomics data, it can be applied to other omics disciplines.
ABSTRACT
BACKGROUND: Oesophageal atresia (OA) is a life-threatening developmental defect characterized by a loss of continuity between the upper and lower oesophagus. The most common form is a distal connection between the trachea and the oesophagus, i.e. a tracheoesophageal fistula (TEF). The condition may be part of a syndrome or occur as an isolated feature. The recurrence risk in affected families is increased compared to the population-based incidence, suggesting contributing genetic factors. METHODS: To gain insight into gene variants and genes associated with isolated OA, we conducted whole-genome sequencing on samples from three families with recurrent cases affected by congenital and isolated TEF. RESULTS: We identified a combination of single nucleotide variants (SNVs), splice site variants (SSVs) and structural variants (SVs) annotated to altogether 100 coding genes in the six affected individuals. CONCLUSION: This study highlights rare SVs among candidate gene variants in our individuals with OA and provides a gene framework for further investigations of genetic factors behind this malformation.