1.
BMC Cancer ; 23(1): 618, 2023 Jul 03.
Article in English | MEDLINE | ID: mdl-37400763

ABSTRACT

BACKGROUND: Gene fusions are important cancer drivers in pediatric cancer and their accurate detection is essential for diagnosis and treatment. Clinical decision-making requires high confidence and precision of detection. Recent developments show RNA sequencing (RNA-seq) is promising for genome-wide detection of fusion products but hindered by many false positives that require extensive manual curation and impede discovery of pathogenic fusions. METHODS: We developed Fusion-sq to overcome existing disadvantages of detecting gene fusions. Fusion-sq integrates and "fuses" evidence from RNA-seq and whole genome sequencing (WGS) using intron-exon gene structure to identify tumor-specific protein coding gene fusions. Fusion-sq was then applied to the data generated from a pediatric pan-cancer cohort of 128 patients by WGS and RNA sequencing. RESULTS: In a pediatric pan-cancer cohort of 128 patients, we identified 155 high confidence tumor-specific gene fusions and their underlying structural variants (SVs). This includes all clinically relevant fusions known to be present in this cohort (30 patients). Fusion-sq distinguishes healthy-occurring from tumor-specific fusions and resolves fusions in amplified regions and copy number unstable genomes. A high gene fusion burden is associated with copy number instability. We identified 27 potentially pathogenic fusions involving oncogenes or tumor-suppressor genes characterized by underlying SVs, in some cases leading to expression changes indicative of activating or disruptive effects. CONCLUSIONS: Our results indicate how clinically relevant and potentially pathogenic gene fusions can be identified and their functional effects investigated by combining WGS and RNA-seq. Integrating RNA fusion predictions with underlying SVs advances fusion detection beyond extensive manual filtering. Taken together, we developed a method for identifying candidate gene fusions that is suitable for precision oncology applications. Our method provides multi-omics evidence for assessing the pathogenicity of tumor-specific gene fusions for future clinical decision making.
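
The idea of backing an RNA fusion prediction with an underlying SV can be illustrated with a minimal sketch. It assumes a simplified data model that is not taken from Fusion-sq: each RNA prediction carries two genomic breakpoints, each SV carries two breakends, and an SV counts as support when both breakends fall within a hypothetical `max_dist` window of the RNA breakpoints, in either orientation. The abstract states that the actual method uses intron-exon gene structure, which this sketch does not model; all class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    chrom: str
    pos: int


@dataclass(frozen=True)
class RnaFusion:
    gene5: str       # 5' partner gene (placeholder names in the example)
    gene3: str       # 3' partner gene
    bp5: Breakpoint  # genomic position of the 5' breakpoint
    bp3: Breakpoint  # genomic position of the 3' breakpoint


@dataclass(frozen=True)
class StructuralVariant:
    sv_id: str
    end_a: Breakpoint
    end_b: Breakpoint


def near(a: Breakpoint, b: Breakpoint, max_dist: int) -> bool:
    """True when both positions are on the same chromosome and close together."""
    return a.chrom == b.chrom and abs(a.pos - b.pos) <= max_dist


def supporting_svs(fusion, svs, max_dist=10_000):
    """Return IDs of SVs whose two breakends match the two RNA breakpoints.

    max_dist is an assumed tolerance, not a published parameter.
    """
    hits = []
    for sv in svs:
        same = near(fusion.bp5, sv.end_a, max_dist) and near(fusion.bp3, sv.end_b, max_dist)
        swapped = near(fusion.bp5, sv.end_b, max_dist) and near(fusion.bp3, sv.end_a, max_dist)
        if same or swapped:
            hits.append(sv.sv_id)
    return hits
```

A prediction with an empty result would, under these assumptions, lack DNA-level support and be a candidate false positive for manual review.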


Subject(s)
Neoplasms , Child , Humans , Neoplasms/genetics , RNA-Seq , High-Throughput Nucleotide Sequencing/methods , Precision Medicine , Sequence Analysis, RNA/methods , Gene Fusion , Whole Genome Sequencing
2.
Cancers (Basel) ; 14(19)2022 Oct 05.
Article in English | MEDLINE | ID: mdl-36230794

ABSTRACT

Chromosomal alterations have recurrently been identified in Wilms tumors (WTs) and some are associated with poor prognosis. Gain of 1q (1q+) is of special interest given its high prevalence and is currently actively studied for its prognostic value. However, the underlying mutational mechanisms and functional effects remain unknown. In a national unbiased cohort of 30 primary WTs, we integrated somatic SNVs, CNs and SVs with expression data and distinguished four clusters characterized by affected biological processes: muscle differentiation, immune system, kidney development and proliferation. Combined genome-wide CN and SV profiles showed that tumors profoundly differ in both their types of 1q+ and genomic stability and can be grouped into WTs with co-occurring 1p-/1q+, multiple chromosomal gains or CN neutral tumors. We identified 1q+ in eight tumors that differ in mutational mechanisms, subsequent rearrangements and genomic contexts. Moreover, 1q+ tumors were present in all four expression clusters reflecting activation of various biological processes, and individual tumors overexpress different genes on 1q. In conclusion, by integrating CNs, SVs and gene expression, we identified subgroups of 1q+ tumors reflecting differences in the functional effect of 1q gain, indicating that expression data is likely needed for further risk stratification of 1q+ WTs.

3.
Sci Data ; 9(1): 169, 2022 04 13.
Article in English | MEDLINE | ID: mdl-35418585

ABSTRACT

The genomes of thousands of individuals are profiled within Dutch healthcare and research each year. However, this valuable genomic data, associated clinical data and consent are captured in different ways and stored across many systems and organizations. This makes it difficult to discover rare disease patients, reuse data for personalized medicine and establish research cohorts based on specific parameters. FAIR Genomes aims to enable NGS data reuse by developing metadata standards for the data descriptions needed to FAIRify genomic data while also addressing ELSI issues. We developed a semantic schema of essential data elements harmonized with international FAIR initiatives. The FAIR Genomes schema v1.1 contains 110 elements in 9 modules. It reuses common ontologies such as NCIT, DUO and EDAM, only introducing new terms when necessary. The schema is represented by a YAML file that can be transformed into templates for data entry software (EDC) and programmatic interfaces (JSON, RDF) to ease genomic data sharing in research and healthcare. The schema, documentation and MOLGENIS reference implementation are available at https://fairgenomes.org.
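
The step from a module/element schema to a JSON data-entry template can be sketched as follows. This is an illustration only: the nested structure, module names and element names are hypothetical placeholders rather than content of the FAIR Genomes schema, and the parsed YAML is represented directly as a Python dictionary so that only the standard library is needed.

```python
import json

# Hypothetical stand-in for a parsed YAML schema: modules containing elements.
# None of these names come from the FAIR Genomes schema itself.
schema = {
    "modules": [
        {
            "name": "example_module_a",
            "elements": [
                {"name": "example_identifier", "type": "string"},
                {"name": "example_date", "type": "date"},
            ],
        },
        {
            "name": "example_module_b",
            "elements": [{"name": "example_code", "type": "ontology"}],
        },
    ]
}


def json_template(parsed_schema):
    """Build an empty template: one object per module, one null field per element."""
    return {
        module["name"]: {element["name"]: None for element in module["elements"]}
        for module in parsed_schema["modules"]
    }


template = json_template(schema)
print(json.dumps(template, indent=2))
```

Keeping a single machine-readable schema and generating every template from it means a change to an element only has to be made in one place.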


Subject(s)
High-Throughput Nucleotide Sequencing , Metadata , Delivery of Health Care , Genomics , Humans , Software
4.
Nat Commun ; 11(1): 1310, 2020 03 11.
Article in English | MEDLINE | ID: mdl-32161258

ABSTRACT

Kidney tumours are among the most common solid tumours in children, comprising distinct subtypes differing in many aspects, including cell-of-origin, genetics, and pathology. Pre-clinical cell models capturing the disease heterogeneity are currently lacking. Here, we describe the first paediatric cancer organoid biobank. It contains tumour and matching normal kidney organoids from over 50 children with different subtypes of kidney cancer, including Wilms tumours, malignant rhabdoid tumours, renal cell carcinomas, and congenital mesoblastic nephromas. Paediatric kidney tumour organoids retain key properties of native tumours, useful for revealing patient-specific drug sensitivities. Using single cell RNA-sequencing and high resolution 3D imaging, we further demonstrate that organoid cultures derived from Wilms tumours consist of multiple different cell types, including epithelial, stromal and blastemal-like cells. Our organoid biobank captures the heterogeneity of paediatric kidney tumours, providing a representative collection of well-characterised models for basic cancer research, drug-screening and personalised medicine.


Subject(s)
Biological Specimen Banks , Kidney Neoplasms/genetics , Kidney/pathology , Organoids/pathology , Adolescent , Carcinoma, Renal Cell/drug therapy , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/pathology , Cell Culture Techniques/methods , Child , Child, Preschool , DNA Methylation , Drug Screening Assays, Antitumor/methods , Female , Gene Expression Regulation, Neoplastic , Genetic Heterogeneity , Genotyping Techniques , Humans , Infant , Kidney Neoplasms/drug therapy , Kidney Neoplasms/pathology , Male , Nephroma, Mesoblastic/drug therapy , Nephroma, Mesoblastic/genetics , Nephroma, Mesoblastic/pathology , Netherlands , Precision Medicine/methods , RNA-Seq , Rhabdoid Tumor/drug therapy , Rhabdoid Tumor/genetics , Rhabdoid Tumor/pathology , Single-Cell Analysis , Transfection , Tumor Cells, Cultured , Whole Genome Sequencing , Wilms Tumor/drug therapy , Wilms Tumor/genetics , Wilms Tumor/pathology , Young Adult
5.
Epigenetics Chromatin ; 12(1): 14, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30767785

ABSTRACT

BACKGROUND: Genomic imprinting, resulting in parent-of-origin specific gene expression, plays a critical role in mammalian development. Here, we apply allele-specific RNA-seq on isogenic B6D2F1 mice to assay imprinted genes in early embryonic tissues between E3.5 and E7.25 and in pluripotent cell lines to evaluate maintenance of imprinted gene expression. For the cell lines, we include embryonic stem cells (ESCs) and epiblast stem cells (EpiSCs) derived from fertilized embryos and from embryos obtained after nuclear transfer (NT) or parthenogenetic activation (PGA). RESULTS: As homozygous genomic regions of PGA-derived cells are not compatible with allele-specific RNA-seq, we developed an RNA-seq-based genotyping strategy allowing identification of informative heterozygous regions. Global analysis shows that proper imprinted gene expression as observed in embryonic tissues is largely lost in the ESC lines included in this study, which mainly consisted of female ESCs. Differentiation of ESC lines to embryoid bodies or NPCs does not restore monoallelic expression of imprinted genes, nor does reprogramming of the serum-cultured ESCs to the pluripotent ground state by the use of 2 kinase inhibitors. Fertilized EpiSC and EpiSC-NT lines largely maintain imprinted gene expression, as do EpiSC-PGA lines that show known paternally expressed genes being silent and known maternally expressed genes consistently showing doubled expression. Notably, two EpiSC-NT lines show aberrant silencing of Rian and Meg3, two critically imprinted genes in mouse iPSCs. With respect to female EpiSC, most of the lines displayed completely skewed X inactivation suggesting a (near) clonal origin. CONCLUSIONS: Altogether, our analysis provides a comprehensive overview of imprinted gene expression in pluripotency and provides a benchmark to allow identification of cell lines that faithfully maintain imprinted gene expression and therefore retain full developmental potential.


Subject(s)
Alleles , Genomic Imprinting , Mouse Embryonic Stem Cells/metabolism , RNA, Messenger/genetics , Animals , Cell Differentiation , Cell Line , Cells, Cultured , Female , Gene Expression Profiling , Gene Expression Regulation, Developmental , Gene Silencing , Germ Layers/cytology , Germ Layers/metabolism , Male , Mice , Mice, Inbred C57BL , Mice, Inbred DBA , Mouse Embryonic Stem Cells/cytology
6.
Biol Open ; 7(8)2018 Aug 17.
Article in English | MEDLINE | ID: mdl-30026265

ABSTRACT

During early mammalian development, transient pools of pluripotent cells emerge that can be immortalised upon stem cell derivation. The pluripotent state, 'naïve' or 'primed', depends on the embryonic stage and derivation conditions used. Here we analyse the temporal gene expression patterns of mouse, cattle and porcine embryos at stages that harbour different types of pluripotent cells. We document conserved and divergent traits in gene expression, and identify predictor genes shared across the species that are associated with pluripotent states in vivo and in vitro. Amongst these are the pluripotency-linked genes Klf4 and Lin28b. The novel genes discovered include naïve- (Spic, Scpep1 and Gjb5) and primed-associated (Sema6a and Jakmip2) genes as well as naïve to primed transition genes (Dusp6 and Trip6). Both Gjb5 and Dusp6 play a role in pluripotency since their knockdown results in differentiation and downregulation of key pluripotency genes. Our interspecies comparison revealed new insights into pluripotency, pluripotent stem cell identity and a new molecular criterion for distinguishing between pluripotent states in various species, including human.

8.
Genome Biol ; 16: 149, 2015 Aug 03.
Article in English | MEDLINE | ID: mdl-26235224

ABSTRACT

BACKGROUND: During early embryonic development, one of the two X chromosomes in mammalian female cells is inactivated to compensate for a potential imbalance in transcript levels with male cells, which contain a single X chromosome. Here, we use mouse female embryonic stem cells (ESCs) with non-random X chromosome inactivation (XCI) and polymorphic X chromosomes to study the dynamics of gene silencing over the inactive X chromosome by high-resolution allele-specific RNA-seq. RESULTS: Induction of XCI by differentiation of female ESCs shows that genes proximal to the X-inactivation center are silenced earlier than distal genes, while lowly expressed genes show faster XCI dynamics than highly expressed genes. The active X chromosome shows a minor but significant increase in gene activity during differentiation, resulting in complete dosage compensation in differentiated cell types. Genes escaping XCI show little or no silencing during early propagation of XCI. Allele-specific RNA-seq of neural progenitor cells generated from the female ESCs identifies three regions distal to the X-inactivation center that escape XCI. These regions, which stably escape during propagation and maintenance of XCI, coincide with topologically associating domains (TADs) as present in the female ESCs. Also, the previously characterized gene clusters escaping XCI in human fibroblasts correlate with TADs. CONCLUSIONS: The gene silencing observed during XCI provides further insight in the establishment of the repressive complex formed by the inactive X chromosome. The association of escape regions with TADs, in mouse and human, suggests that TADs are the primary targets during propagation of XCI over the X chromosome.


Subject(s)
Gene Silencing , X Chromosome Inactivation , Alleles , Animals , Chromatin/chemistry , Embryoid Bodies/metabolism , Embryonic Stem Cells/metabolism , Female , Humans , Mice , Neural Stem Cells/metabolism , Sequence Analysis, RNA
9.
Science ; 345(6204): 1251086, 2014 Sep 26.
Article in English | MEDLINE | ID: mdl-25258085

ABSTRACT

Monocyte differentiation into macrophages represents a cornerstone process for host defense. Concomitantly, immunological imprinting of either tolerance or trained immunity determines the functional fate of macrophages and susceptibility to secondary infections. We characterized the transcriptomes and epigenomes in four primary cell types: monocytes and in vitro-differentiated naïve, tolerized, and trained macrophages. Inflammatory and metabolic pathways were modulated in macrophages, including decreased inflammasome activation, and we identified pathways functionally implicated in trained immunity. β-glucan training elicits an exclusive epigenetic signature, revealing a complex network of enhancers and promoters. Analysis of transcription factor motifs in deoxyribonuclease I hypersensitive sites at cell-type-specific epigenetic loci unveiled differentiation and treatment-specific repertoires. Altogether, we provide a resource to understand the epigenetic changes that underlie innate immunity in humans.


Subject(s)
Cell Differentiation/genetics , Epigenesis, Genetic , Immunity, Innate/genetics , Macrophages/cytology , Monocytes/cytology , Animals , Binding Sites/genetics , Deoxyribonuclease I/chemistry , Genomic Imprinting , Humans , Immunologic Memory , Inflammasomes/genetics , Inflammasomes/immunology , Macrophages/immunology , Mice , Monocytes/immunology , Transcription Factors/metabolism , beta-Glucans/immunology
10.
Science ; 345(6204): 1251033, 2014 Sep 26.
Article in English | MEDLINE | ID: mdl-25258084

ABSTRACT

Blood cells derive from hematopoietic stem cells through stepwise fating events. To characterize gene expression programs driving lineage choice, we sequenced RNA from eight primary human hematopoietic progenitor populations representing the major myeloid commitment stages and the main lymphoid stage. We identified extensive cell type-specific expression changes: 6711 genes and 10,724 transcripts, enriched in non-protein-coding elements at early stages of differentiation. In addition, we found 7881 novel splice junctions and 2301 differentially used alternative splicing events, enriched in genes involved in regulatory processes. We experimentally demonstrated cell-specific isoform usage, identifying nuclear factor I/B (NFIB) as a regulator of maturation of the megakaryocyte, the platelet precursor. Our data highlight the complexity of fating events in closely related progenitor populations, the understanding of which is essential for the advancement of transplantation and regenerative medicine.


Subject(s)
Alternative Splicing , Cell Lineage/genetics , Hematopoiesis/genetics , Hematopoietic Stem Cells/cytology , Genetic Variation , Hematopoietic Stem Cells/metabolism , Humans , NFI Transcription Factors/genetics , NFI Transcription Factors/metabolism , RNA-Binding Proteins/metabolism , Thrombopoiesis/genetics , Transcriptome
11.
Cell Stem Cell ; 13(3): 360-9, 2013 Sep 05.
Article in English | MEDLINE | ID: mdl-23850244

ABSTRACT

The use of two kinase inhibitors (2i) enables derivation of mouse embryonic stem cells (ESCs) in the pluripotent ground state. Using whole-genome bisulfite sequencing (WGBS), we show that male 2i ESCs are globally hypomethylated compared to conventional ESCs maintained in serum. In serum, female ESCs are hypomethylated similarly to male ESCs in 2i, and DNA methylation is further reduced in 2i. Regions with elevated DNA methylation in 2i strongly correlate with the presence of H3K9me3 on endogenous retroviruses (ERVs) and imprinted loci. The methylome of male ESCs in serum parallels postimplantation blastocyst cells, while 2i stalls ESCs in a hypomethylated, ICM-like state. WGBS analysis during adaptation of 2i ESCs to serum suggests that deposition of DNA methylation is largely random, while loss of DNA methylation during reversion to 2i occurs passively, initiating at TET1 binding sites. Together, our analysis provides insight into DNA methylation dynamics in cultured ESCs paralleling early developmental processes.


Subject(s)
Blastocyst/physiology , DNA-Binding Proteins/metabolism , Embryonic Stem Cells/physiology , Histone Demethylases/metabolism , Pluripotent Stem Cells/physiology , Proto-Oncogene Proteins/metabolism , Animals , Cells, Cultured , DNA Methylation/drug effects , DNA-Binding Proteins/genetics , Embryonic Stem Cells/drug effects , Female , Fetal Development , Genome/genetics , Histones/metabolism , Leukemia Inhibitory Factor/metabolism , Male , Methylation , Mice , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/genetics , Sequence Analysis, DNA , Sulfites/chemistry
12.
BMC Evol Biol ; 12: 45, 2012 Apr 02.
Article in English | MEDLINE | ID: mdl-22462721

ABSTRACT

BACKGROUND: The study of speciation and maintenance of species barriers is at the core of evolutionary biology. During speciation the genome of one population becomes separated from other populations of the same species, which may lead to genomic incompatibility with time. This separation is complete when no fertile offspring is produced from inter-population matings, which is the basis of the biological species concept. Birds, in particular ducks, are recognised as a challenging and illustrative group of higher vertebrates for speciation studies. There are many sympatric and ecologically similar duck species, among which fertile hybrids occur relatively frequently in nature, yet these species remain distinct. RESULTS: We show that the degree of shared single nucleotide polymorphisms (SNPs) between five species of dabbling ducks (genus Anas) is an order of magnitude higher than that previously reported between any pair of eukaryotic species with comparable evolutionary distances. We demonstrate that hybridisation has led to sustained exchange of genetic material between duck species on an evolutionary time scale without disintegrating species boundaries. Even though behavioural, genetic and ecological factors uphold species boundaries in ducks, we detect opposing forces allowing for viable interspecific hybrids, with long-term evolutionary implications. Based on the superspecies concept we here introduce the novel term "supra-population" to explain the persistence of SNPs identical by descent within the studied ducks despite their history as distinct species dating back millions of years. CONCLUSIONS: By reviewing evidence from speciation theory, palaeogeography and palaeontology we propose a fundamentally new model of speciation to accommodate our genetic findings in dabbling ducks. This model, we argue, may also shed light on longstanding unresolved general speciation and hybridisation patterns in higher organisms, e.g. in other bird groups with unusually high hybridisation rates. Observed parallels to horizontal gene transfer in bacteria facilitate the understanding of why ducks have been such an evolutionarily successful group of animals. There is large evolutionary potential in the ability to exchange genes among species and the resulting dramatic increase of effective population size to counter selective constraints.


Subject(s)
Ducks/genetics , Genetic Speciation , Animals , Female , Gene Frequency , Gene Transfer, Horizontal , Genotyping Techniques , Linkage Disequilibrium , Male , Polymorphism, Single Nucleotide , Principal Component Analysis , Sequence Analysis, DNA
13.
BMC Genomics ; 12: 150, 2011 Mar 16.
Article in English | MEDLINE | ID: mdl-21410945

ABSTRACT

BACKGROUND: Next generation sequencing technologies make it possible to obtain, at low cost, the genomic sequence information that is currently lacking for most economically and ecologically important organisms. For the mallard duck, genomic data are limited. The mallard is, besides a species of large agricultural and societal importance, also the focal species when it comes to long distance dispersal of Avian Influenza. For large scale identification of SNPs we performed Illumina sequencing of wild mallard DNA and compared our data with ongoing genome and EST sequencing of domesticated conspecifics. This is the first study of its kind for waterfowl. RESULTS: More than one billion base pairs of sequence information were generated resulting in a 16× coverage of a reduced representation library of the mallard genome. Sequence reads were aligned to a draft domesticated duck reference genome and allowed for the detection of over 122,000 SNPs within our mallard sequence dataset. In addition, almost 62,000 nucleotide positions on the domesticated duck reference showed a different nucleotide compared to wild mallard. Approximately 20,000 SNPs identified within our data were shared with SNPs identified in the sequenced domestic duck or in EST sequencing projects. The shared SNPs were considered to be highly reliable and were used to benchmark non-shared SNPs for quality. Genotyping of a representative sample of 364 SNPs resulted in a SNP conversion rate of 99.7%. The correlation of the minor allele count and observed minor allele frequency in the SNP discovery pool was 0.72. CONCLUSION: We identified almost 150,000 SNPs in wild mallards that will likely yield good results in genotyping. Of these, ~101,000 SNPs were detected within our wild mallard sequences and ~49,000 were detected between wild and domesticated duck data. In the ~101,000 SNPs we found a subset of ~20,000 SNPs shared between wild mallards and the sequenced domesticated duck suggesting a low genetic divergence. Comparison of quality metrics between the total SNP set (122,000 + 62,000 = 184,000 SNPs) and the validated subset shows similar characteristics for both sets. This indicates that we have detected a large amount (~150,000) of accurately inferred mallard SNPs, which will benefit bird evolutionary studies, ecological studies (e.g. disentangling migratory connectivity) and industrial breeding programs.


Subject(s)
Ducks/genetics , Genome , Polymorphism, Single Nucleotide , Animals , Chromosome Mapping , Evolution, Molecular , Expressed Sequence Tags , Female , Gene Frequency , Genotype , Male , Sequence Analysis, DNA
14.
Mol Ecol ; 19 Suppl 1: 89-99, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20331773

ABSTRACT

Identifying genes that underlie ecological traits will open exciting possibilities to study gene-environment interactions in shaping phenotypes and in measuring natural selection on genes. Evolutionary ecology has been pursuing these objectives for decades, but they come into reach now that next generation sequencing technologies have dramatically lowered the costs to obtain the genomic sequence information that is currently lacking for most ecologically important species. Here we describe how we generated over 2 billion basepairs of novel sequence information for an ecological model species, the great tit Parus major. We used over 16 million short sequence reads for the de novo assembly of a reference sequence consisting of 550 000 contigs, covering 2.5% of the genome of the great tit. This reference sequence was used as the scaffold for mapping of the sequence reads, which allowed for the detection of over 20 000 novel single nucleotide polymorphisms. Contigs harbouring 4272 of the single nucleotide polymorphisms could be mapped to a unique location on the recently sequenced zebra finch genome. Of all the great tit contigs, significantly more were mapped to the microchromosomes than to the intermediate and the macrochromosomes of the zebra finch, indicating a higher overall level of sequence conservation on the microchromosomes than on the other types of chromosomes. The large number of great tit contigs that can be aligned to the zebra finch genome shows that this genome provides a valuable framework for large scale genetics, e.g. QTL mapping or whole genome association studies, in passerines.


Subject(s)
Passeriformes/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Animals , Comparative Genomic Hybridization , Contig Mapping , Finches/genetics , Gene Library , Genomics/methods , Sequence Alignment
15.
Vet Microbiol ; 141(1-2): 110-4, 2010 Feb 24.
Article in English | MEDLINE | ID: mdl-19716242

ABSTRACT

Acute secretory diarrhea is a major cause of morbidity and mortality in young animals and humans. Deaths result from excessive fluid and electrolyte losses. The disease is caused by non-invasive bacteria such as Vibrio cholerae and Escherichia coli which produce enterotoxins; however, much less is known about the role of individual host responses. Here we report the response of intact porcine small intestinal mucosa to infection with enterotoxigenic E. coli (ETEC). Jejunal segments in four piglets were infused with or without ETEC, and perfused for 8 h, and net absorption measured. Microarray analysis at 8 h post-infection showed significant differential regulation of on average fifteen transcripts in mucosa, with considerable individual variation. Differential net absorption varied between animals, and correlated negatively with the number of upregulated genes, and with one individual gene (THO complex 4). This shows that quantitative differences in gene regulation can be functionally linked to the physiological response in these four animals.


Subject(s)
Enterotoxigenic Escherichia coli/physiology , Escherichia coli Infections/physiopathology , Gene Expression Profiling , Gene Expression Regulation , Intestinal Diseases/microbiology , Swine Diseases/metabolism , Animals , Antigens, Neoplasm , Biomarkers, Tumor , Blotting, Northern , Host-Pathogen Interactions , Intestine, Small/metabolism , Intestine, Small/microbiology , Jejunum/metabolism , Jejunum/microbiology , Lectins, C-Type , Microarray Analysis , Pancreatitis-Associated Proteins , Swine , Swine Diseases/microbiology
16.
BMC Genet ; 10: 86, 2009 Dec 20.
Article in English | MEDLINE | ID: mdl-20021697

ABSTRACT

BACKGROUND: The chicken (Gallus gallus), like most avian species, has a very distinct karyotype consisting of many micro- and a few macrochromosomes. While it is known that recombination frequencies are much higher for micro- as compared to macrochromosomes, there is limited information on differences in linkage disequilibrium (LD) and haplotype diversity between these two classes of chromosomes. In this study, LD and haplotype diversity were systematically characterized in 371 birds from eight chicken populations (commercial lines, fancy breeds, and red jungle fowl) across macro- and microchromosomes. To this end we sampled four regions of approximately 1 cM each on macrochromosomes (GGA1 and GGA2), and four 1.5-2 cM regions on microchromosomes (GGA26 and GGA27) at a high density of 1 SNP every 2 kb (total of 889 SNPs). RESULTS: At a similar physical distance, LD, haplotype homozygosity, haploblock structure, and haplotype sharing were all lower for the micro- as compared to the macrochromosomes. These differences were consistent across populations. Heterozygosity, genetic differentiation, and derived allele frequencies were also higher for the microchromosomes. Differences in LD, haplotype variation, and haplotype sharing between populations were largely in line with known demographic history of the commercial chicken. Despite very low levels of LD, as measured by r² for most populations, some haploblock structure was observed, particularly in the macrochromosomes, but the haploblock sizes were typically less than 10 kb. CONCLUSION: Differences in LD between micro- and macrochromosomes were almost completely explained by differences in recombination rate. Differences in haplotype diversity and haplotype sharing between micro- and macrochromosomes were explained by differences in recombination rate and genotype variation. Haploblock structure was consistent with demography of the chicken populations, and differences in recombination rates between micro- and macrochromosomes. The limited haploblock structure and LD suggests that future whole-genome marker assays will need 100+K SNPs to exploit haplotype information. Interpretation and transferability of genetic parameters will need to take into account the size of chromosomes in chicken, and, since most birds have microchromosomes, in other avian species as well.
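
For readers unfamiliar with the squared-correlation measure of LD used above, its conventional definition for two biallelic loci is D = p_AB - p_A * p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below computes it from phased haplotypes; the 0/1 encoding and the function name are illustrative assumptions and do not describe the software used in the study.

```python
def r_squared(haplotypes):
    """Squared allelic correlation between two biallelic loci.

    haplotypes: list of (a, b) pairs, where a and b are 0 or 1 and mark
    whether the haplotype carries the reference allele at locus A and B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # allele frequency at locus A
    p_b = sum(b for _, b in haplotypes) / n    # allele frequency at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n  # joint frequency
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    return d * d / denom
```

Values range from 0 (alleles at the two loci are independent) to 1 (one locus fully predicts the other).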


Subject(s)
Chickens/genetics , Chromosome Mapping , Haplotypes , Linkage Disequilibrium , Animals , Female , Gene Frequency , Genetics, Population , Male , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
17.
BMC Genomics ; 10: 479, 2009 Oct 16.
Article in English | MEDLINE | ID: mdl-19835600

ABSTRACT

BACKGROUND: The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals. RESULTS: A total of 100 million 36 bp reads were generated, representing approximately 5-6% (approximately 62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69. CONCLUSION: We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even distribution of SNPs across the targeted genome.
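
The reported sequence depth can be checked with simple arithmetic: total sequenced bases divided by the size of the targeted genome fraction. The sketch below uses the rounded figures from the abstract (100 million reads of 36 bp, roughly 62 Mbp targeted); the function name is illustrative.

```python
def mean_depth(n_reads, read_length_bp, target_size_bp):
    """Average number of times each targeted base is covered by a read."""
    return n_reads * read_length_bp / target_size_bp


# 100 million reads x 36 bp = 3.6 Gbp, spread over about 62 Mbp
depth = mean_depth(100_000_000, 36, 62_000_000)
print(round(depth))  # 58
```

The result agrees with the estimated depth of 58 stated in the abstract.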


Subject(s)
Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Turkeys/genetics , Animals , Contig Mapping , Gene Frequency , Genomic Library , Genomics/methods , Genotype
18.
BMC Genomics ; 10: 374, 2009 Aug 12.
Article in English | MEDLINE | ID: mdl-19674453

ABSTRACT

BACKGROUND: Although the Illumina 1G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of true SNPs on a large scale. RESULTS: DNA pooled from five animals from a commercial boar line was digested with DraI; 150-250-bp fragments were isolated and end-sequenced using the Illumina 1G Genome Analyzer, yielding 70,348,064 sequences 36 bp long. Rules were developed to select sequences, which were then aligned to unique positions in a reference genome. Sequences were selected based on quality, and three thresholds of sequence quality (SQ) were compared. The highest threshold of SQ allowed identification of a larger number of SNPs (17,489), distributed widely across the pig genome. In total, 3,142 SNPs were validated with a success rate of 96%. The correlation between estimated minor allele frequency (MAF) and genotyped MAF was moderate, and SNPs were highly polymorphic in other pig breeds. Lowering the SQ threshold and maintaining the same criteria for SNP identification resulted in the discovery of fewer SNPs (16,768), of which 259 were not identified using higher SQ levels. Validation of SNPs found exclusively in the lower SQ threshold had a success rate of 94% and a low correlation between estimated MAF and genotyped MAF. Base change analysis suggested that the rate of transitions in the pig genome is likely to be similar to that observed in humans. Chromosome X showed reduced nucleotide diversity relative to autosomes, as observed for other species. CONCLUSION: Large numbers of SNPs can be identified reliably by creating strict rules for sequence selection, which simultaneously decreases sequence ambiguity. Selection of sequences using a higher SQ threshold leads to more reliable identification of SNPs. Lower SQ thresholds can be used to guarantee sufficient sequence coverage, resulting in high success rate but less reliable MAF estimation. Nucleotide diversity varies between porcine chromosomes, with the X chromosome showing less variation as observed in other species.


Subject(s)
Genome , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Sus scrofa/genetics , Algorithms , Animals , Chromosome Mapping/methods , Chromosomes, Mammalian/genetics , Genomic Library , Genotype , Male , Pilot Projects , Sequence Alignment