Results 1 - 20 of 38
1.
Genetics ; 214(3): 691-702, 2020 03.
Article in English | MEDLINE | ID: mdl-31879319

ABSTRACT

The azoxymethane model of colorectal cancer (CRC) was used to gain insights into the genetic heterogeneity of nonfamilial CRC. We observed significant differences in susceptibility parameters across 40 mouse inbred strains, with 6 new and 18 of 24 previously identified mouse CRC modifier alleles detected using genome-wide association analysis. Tumor incidence varied in F1 as well as intercrosses and backcrosses between resistant and susceptible strains. Analysis of inheritance patterns indicates that resistance to CRC development is inherited as a dominant characteristic genome-wide, and that susceptibility appears to occur in individuals lacking a large-effect, or sufficient numbers of small-effect, polygenic resistance alleles. Our results suggest a new polygenic model for inheritance of nonfamilial CRC, and that genetic studies in humans aimed at identifying individuals with elevated susceptibility should be pursued through the lens of absence of dominant resistance alleles rather than for the presence of susceptibility alleles.


Subjects
Colorectal Neoplasms/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Multifactorial Inheritance/genetics; Alleles; Animals; Azoxymethane/toxicity; Colorectal Neoplasms/chemically induced; Colorectal Neoplasms/pathology; Disease Models, Animal; Drug Resistance, Neoplasm; Genetic Heterogeneity; Heredity; Humans; Mice; Mice, Inbred Strains/genetics; Models, Genetic
2.
G3 (Bethesda) ; 9(5): 1613-1622, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30877080

ABSTRACT

Reproductive success in the eight founder strains of the Collaborative Cross (CC) was measured using a diallel-mating scheme. Over a 48-month period we generated 4,448 litters, and provided 24,782 weaned pups for use in 16 different published experiments. We identified factors that affect the average litter size in a cross by estimating the overall contribution of parent-of-origin, heterosis, inbred, and epistatic effects using a Bayesian zero-truncated overdispersed Poisson mixed model. The phenotypic variance of litter size has a substantial contribution (82%) from unexplained and environmental sources, but no detectable effect of seasonality. Most of the explained variance was due to additive effects (9.2%) and parental sex (maternal vs. paternal strain; 5.8%), with epistasis accounting for 3.4%. Within the parental effects, the effect of the dam's strain explained more than the sire's strain (13.2% vs. 1.8%), and the dam's strain effects account for 74.2% of total variation explained. Dams from strains C57BL/6J and NOD/ShiLtJ increased the expected litter size by a mean of 1.66 and 1.79 pups, whereas dams from strains WSB/EiJ, PWK/PhJ, and CAST/EiJ reduced expected litter size by a mean of 1.51, 0.81, and 0.90 pups. Finally, there was no strong evidence for strain-specific effects on sex ratio distortion. Overall, these results demonstrate that strains vary substantially in their reproductive ability depending on their genetic background, and that litter size is largely determined by dam's strain rather than sire's strain effects, as expected. This analysis adds to our understanding of factors that influence litter size in mammals, and also helps to explain breeding successes and failures in the extinct lines and surviving CC strains.
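To make the "zero-truncated Poisson" part of the model concrete, the sketch below computes the probability mass function and mean of a plain zero-truncated Poisson distribution. It assumes that truncation is needed because a litter with zero pups is never recorded as a litter; the abstract does not state this. The overdispersion and mixed-effects components of the published Bayesian model are not reproduced, and the function names are illustrative.

```python
import math


def zt_poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k | Y > 0) for a Poisson(lam) count conditioned on being non-zero."""
    if k < 1:
        return 0.0
    # Ordinary Poisson mass, renormalised by the probability of a non-zero count.
    return math.exp(-lam) * lam**k / (math.factorial(k) * (1.0 - math.exp(-lam)))


def zt_poisson_mean(lam: float) -> float:
    """Expected value of the zero-truncated distribution: lam / (1 - exp(-lam))."""
    return lam / (1.0 - math.exp(-lam))
```

Because zero counts are excluded, the truncated mean is always larger than `lam`; the gap matters most for crosses with small expected litter sizes.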


Subjects
Alleles; Animals, Genetically Modified; Collaborative Cross Mice/genetics; Litter Size/genetics; Maternal Inheritance; Algorithms; Animals; Crosses, Genetic; Environment; Gene-Environment Interaction; Genetic Testing; Mice; Mice, Inbred Strains; Models, Genetic; Phenotype; Sex Ratio; Species Specificity
3.
G3 (Bethesda) ; 9(5): 1303-1311, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30858237

ABSTRACT

Two key features of recombinant inbred panels are well-characterized genomes and reproducibility. Here we report on the sequenced genomes of six additional Collaborative Cross (CC) strains and on inbreeding progress of 72 CC strains. We have previously reported on the sequences of 69 CC strains that were publicly available, bringing the total of CC strains with whole genome sequence up to 75. The sequencing of these six CC strains updates the efforts toward inbreeding undertaken by the UNC Systems Genetics Core. The timing reflects our competing mandates to release to the public as many CC strains as possible while achieving an acceptable level of inbreeding. The new six strains have a higher than average founder contribution from non-domesticus strains than the previously released CC strains. Five of the six strains also have high residual heterozygosity (>14%), which may be related to non-domesticus founder contributions. Finally, we report on updated estimates on residual heterozygosity across the entire CC population using a novel, simple and cost effective genotyping platform on three mice from each strain. We observe a reduction in residual heterozygosity across all previously released CC strains. We discuss the optimal use of different genetic resources available for the CC population.


Subjects
Collaborative Cross Mice/genetics; Genetics, Population; Inbreeding; Whole Genome Sequencing; Alleles; Animals; Animals, Genetically Modified; Chromosome Mapping; Crosses, Genetic; Gene Frequency; Genome; Genotype; Mice; Mice, Inbred Strains
4.
BMC Bioinformatics ; 19(1): 50, 2018 02 09.
Article in English | MEDLINE | ID: mdl-29426289

ABSTRACT

BACKGROUND: Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging "hybrid" assemblies that use long reads for scaffolding and short reads for accuracy. RESULTS: We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. CONCLUSION: Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency relative to existing methods will help make more economical use of emerging long read sequencing technologies.


Subjects
Algorithms; Databases, Genetic; Genome, Fungal; Saccharomyces cerevisiae/genetics; Sequence Analysis, DNA
5.
Biol Reprod ; 97(5): 698-708, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036474

ABSTRACT

The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human sperm motility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility.
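The sequential use of four separating hyperplanes to reach five classes can be sketched as a simple cascade. Everything numeric below is a made-up placeholder: the two features (a velocity-like value and a linearity-like value), the weights, the biases, and the order of the tests are assumptions for illustration and are not the CASAnova equations, which the abstract does not give. Only the five class names come from the abstract.

```python
def classify_track(features, cascade, fallback):
    """Apply linear decision rules in order; the first positive score wins.

    cascade: list of (label, weights, bias) tuples, one per hyperplane.
    fallback: label assigned when no hyperplane fires.
    """
    for label, weights, bias in cascade:
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if score > 0:
            return label
    return fallback


# HYPOTHETICAL hyperplanes over (velocity, linearity); not the published model.
EXAMPLE_CASCADE = [
    ("weakly motile", (-1.0, 0.0), 20.0),  # fires when velocity < 20
    ("slow", (-1.0, 0.0), 60.0),           # fires when velocity < 60
    ("progressive", (0.0, 1.0), -0.6),     # fires when linearity > 0.6
    ("intermediate", (0.0, 1.0), -0.3),    # fires when linearity > 0.3
]

if __name__ == "__main__":
    print(classify_track((150.0, 0.8), EXAMPLE_CASCADE, "hyperactivated"))
```

Four tests plus a fallback yield exactly five outcomes, which is why a five-class problem needs only four hyperplanes when they are applied in sequence.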


Subjects
Image Processing, Computer-Assisted/methods; Sperm Motility/physiology; Spermatozoa/physiology; Support Vector Machine; Humans; Male; Semen Analysis; Spermatozoa/classification
6.
Genetics ; 206(2): 537-556, 2017 06.
Article in English | MEDLINE | ID: mdl-28592495

ABSTRACT

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.


Subjects
Genetic Drift; Genome/genetics; Mice, Inbred Strains/genetics; Quantitative Trait Loci/genetics; Animals; Chromosome Mapping; Crosses, Genetic; Genotype; Haplotypes; Male; Mice; Mutation; Polymorphism, Single Nucleotide
7.
G3 (Bethesda) ; 6(12): 4211-4216, 2016 12 07.
Article in English | MEDLINE | ID: mdl-27765810

ABSTRACT

Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction.


Subjects
Animals, Wild/genetics; Diploidy; Genome; Mice, Inbred Strains/genetics; Animals; Animals, Wild/classification; Female; Genetic Variation; Genomics/methods; High-Throughput Nucleotide Sequencing; Male; Mice; Mice, Inbred Strains/classification; Phylogeny
8.
Genetics ; 204(1): 267-85, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27371833

ABSTRACT

Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127-kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to nonallelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to >80 in wild mice sampled from around the globe. Hemizygosity for high copy-number alleles of R2d2 is associated in cis with meiotic drive; suppression of meiotic crossovers; and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in some laboratory populations. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes.


Subjects
Biological Evolution; Gene Duplication; Nuclear Proteins/genetics; Segmental Duplications, Genomic; Alleles; Animals; Animals, Wild/genetics; Evolution, Molecular; Gene Conversion; Gene Dosage; Genes, Duplicated; Genetic Variation; Mice; Phylogeny; RNA-Binding Proteins
9.
Mol Biol Evol ; 33(6): 1381-95, 2016 06.
Article in English | MEDLINE | ID: mdl-26882987

ABSTRACT

A selective sweep is the result of strong positive selection driving newly occurring or standing genetic variants to fixation, and can dramatically alter the pattern and distribution of allelic diversity in a population. Population-level sequencing data have enabled discoveries of selective sweeps associated with genes involved in recent adaptations in many species. In contrast, much debate but little evidence addresses whether "selfish" genes are capable of fixation, thereby leaving signatures identical to classical selective sweeps, despite being neutral or deleterious to organismal fitness. We previously described R2d2, a large copy-number variant that causes nonrandom segregation of mouse Chromosome 2 in females due to meiotic drive. Here we show population-genetic data consistent with a selfish sweep driven by alleles of R2d2 with high copy number (R2d2(HC)) in natural populations. We replicate this finding in multiple closed breeding populations from six outbred backgrounds segregating for R2d2 alleles. We find that R2d2(HC) rapidly increases in frequency, and in most cases becomes fixed in significantly fewer generations than can be explained by genetic drift. R2d2(HC) is also associated with significantly reduced litter sizes in heterozygous mothers, making it a true selfish allele. Our data provide direct evidence of populations actively undergoing selfish sweeps, and demonstrate that meiotic drive can rapidly alter the genomic landscape in favor of mutations with neutral or even negative effects on overall Darwinian fitness. Further study will reveal the incidence of selfish sweeps, and will elucidate the relative contributions of selfish genes, adaptation and genetic drift to evolution.


Subjects
Nuclear Proteins/genetics; RNA-Binding Proteins/genetics; Repetitive Sequences, Nucleic Acid; Adaptation, Physiological/genetics; Alleles; Animals; Biological Evolution; DNA Copy Number Variations/genetics; Evolution, Molecular; Female; Genetic Variation; Genetics, Population; Male; Mice; Models, Genetic; Mutation; Selection, Genetic
10.
G3 (Bethesda) ; 6(2): 263-79, 2015 Dec 18.
Article in English | MEDLINE | ID: mdl-26684931

ABSTRACT

Genotyping microarrays are an important resource for genetic mapping, population genetics, and monitoring of the genetic integrity of laboratory stocks. We have developed the third generation of the Mouse Universal Genotyping Array (MUGA) series, GigaMUGA, a 143,259-probe Illumina Infinium II array for the house mouse (Mus musculus). The bulk of the content of GigaMUGA is optimized for genetic mapping in the Collaborative Cross and Diversity Outbred populations, and for substrain-level identification of laboratory mice. In addition to 141,090 single nucleotide polymorphism probes, GigaMUGA contains 2006 probes for copy number concentrated in structurally polymorphic regions of the mouse genome. The performance of the array is characterized in a set of 500 high-quality reference samples spanning laboratory inbred strains, recombinant inbred lines, outbred stocks, and wild-caught mice. GigaMUGA is highly informative across a wide range of genetically diverse samples, from laboratory substrains to other Mus species. In addition to describing the content and performance of the array, we provide detailed probe-level annotation and recommendations for quality control.


Subjects
Chromosome Mapping; Genome; Genomics; Genotype; Alleles; Animals; Chromosome Mapping/methods; Computational Biology/methods; Gene Dosage; Genetics, Population; Genomics/methods; Mice; Mice, Inbred Strains; Oligonucleotide Array Sequence Analysis; Phylogeny; Polymorphism, Single Nucleotide
11.
J Am Stat Assoc ; 110(511): 975-986, 2015.
Article in English | MEDLINE | ID: mdl-26617424

ABSTRACT

We have developed a statistical method named IsoDOT to assess differential isoform expression (DIE) and differential isoform usage (DIU) using RNA-seq data. Here isoform usage refers to relative isoform expression given the total expression of the corresponding gene. IsoDOT performs two tasks that cannot be accomplished by existing methods: to test DIE/DIU with respect to a continuous covariate, and to test DIE/DIU for one case versus one control. The latter task is not an uncommon situation in practice, e.g., comparing the paternal and maternal alleles of one individual or comparing tumor and normal samples of one cancer patient. Simulation studies demonstrate the high sensitivity and specificity of IsoDOT. We apply IsoDOT to study the effects of haloperidol treatment on the mouse transcriptome and identify a group of genes whose isoform usages respond to haloperidol treatment.
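The distinction between isoform expression and isoform usage can be illustrated with a short calculation. The sketch follows only the definition given in the abstract (usage is an isoform's expression relative to the total expression of its gene); the function name and the read counts are hypothetical, and none of IsoDOT's statistical testing is shown.

```python
def isoform_usage(expression):
    """Return each isoform's share of the gene's total expression.

    expression: mapping from isoform name to an expression value for one gene.
    """
    total = sum(expression.values())
    if total == 0:
        raise ValueError("gene has no expression, so usage is undefined")
    return {isoform: value / total for isoform, value in expression.items()}


# Hypothetical counts: every isoform doubles between the two samples.
control = {"isoform_1": 30, "isoform_2": 10}
treated = {"isoform_1": 60, "isoform_2": 20}

if __name__ == "__main__":
    print(isoform_usage(control))
    print(isoform_usage(treated))
```

In this toy case both isoforms are differentially expressed (DIE) between the samples, yet the usage proportions are unchanged, so there is no differential isoform usage (DIU).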

12.
G3 (Bethesda) ; 5(12): 2671-83, 2015 Oct 19.
Article in English | MEDLINE | ID: mdl-26483008

ABSTRACT

Surveys of inbred strains of mice are standard approaches to determine the heritability and range of phenotypic variation for biomedical traits. In addition, they may lead to the identification of novel phenotypes and models of human disease. Surprisingly, male reproductive phenotypes are among the least-represented traits in the Mouse Phenome Database. Here we report the results of a broad survey of the eight founder inbred strains of both the Collaborative Cross (CC) and the Diversity Outbred populations, two new mouse resources that are being used as platforms for systems genetics and sources of mouse models of human diseases. Our survey includes representatives of the three main subspecies of the house mice and a mix of classical and wild-derived inbred strains. In addition to standard staples of male reproductive phenotyping such as reproductive organ weights, sperm counts, and sperm morphology, our survey includes sperm motility and the first detailed survey of testis histology. As expected for such a broad survey, heritability varies widely among traits. We conclude that although all eight inbred strains are fertile, most display a mix of advantageous and deleterious male reproductive traits. The CAST/EiJ strain is an outlier, with an unusual combination of deleterious male reproductive traits including low sperm counts, high levels of morphologically abnormal sperm, and poor motility. In contrast, sperm from the PWK/PhJ and WSB/EiJ strains had the greatest percentages of normal morphology and vigorous motility. Finally, we report an abnormal testis phenotype that is highly heritable and restricted to the WSB/EiJ strain. This phenotype is characterized by the presence of a large, but variable, number of vacuoles in at least 10% of the seminiferous tubules. The onset of the phenotype between 2 and 3 wk of age is temporally correlated with the formation of the blood-testis barrier. 
We speculate that this phenotype may play a role in high rates of extinction in the CC project and in the phenotypes associated with speciation in genetic crosses that use the WSB/EiJ strain as representative of the Mus musculus domesticus subspecies.


Subjects
Crosses, Genetic; Founder Effect; Quantitative Trait Loci; Quantitative Trait, Heritable; Reproduction/genetics; Animals; Female; Infertility, Male/genetics; Lactic Acid/biosynthesis; Male; Mice; Mice, Inbred Strains; Phenotype; Sperm Count; Sperm Motility; Spermatozoa/cytology; Spermatozoa/physiology; Testis/anatomy & histology; Testis/cytology; Testis/physiology
13.
PLoS Genet ; 11(10): e1005504, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26452100

ABSTRACT

New systems genetics approaches are needed to rapidly identify host genes and genetic networks that regulate complex disease outcomes. Using genetically diverse animals from incipient lines of the Collaborative Cross mouse panel, we demonstrate a greatly expanded range of phenotypes relative to classical mouse models of SARS-CoV infection including lung pathology, weight loss and viral titer. Genetic mapping revealed several loci contributing to differential disease responses, including an 8.5 Mb locus associated with vascular cuffing on chromosome 3 that contained 23 genes and 13 noncoding RNAs. Integrating phenotypic and genetic data narrowed this region to a single gene, Trim55, an E3 ubiquitin ligase with a role in muscle fiber maintenance. Lung pathology and transcriptomic data from mice genetically deficient in Trim55 were used to validate its role in SARS-CoV-induced vascular cuffing and inflammation. These data establish the Collaborative Cross platform as a powerful genetic resource for uncovering genetic contributions of complex traits in microbial disease severity, inflammation and virus replication in models of outbred populations.


Subjects
Host-Pathogen Interactions; Inflammation/genetics; Severe Acute Respiratory Syndrome/genetics; Severe acute respiratory syndrome-related coronavirus/genetics; Animals; Disease Models, Animal; Disease Susceptibility; Humans; Inflammation/pathology; Inflammation/virology; Mice; Phenotype; Severe acute respiratory syndrome-related coronavirus/pathogenicity; Severe Acute Respiratory Syndrome/pathology; Severe Acute Respiratory Syndrome/virology; Virus Replication/genetics
15.
Nat Genet ; 47(4): 353-60, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25730764

ABSTRACT

Complex human traits are influenced by variation in regulatory DNA through mechanisms that are not fully understood. Because regulatory elements are conserved between humans and mice, a thorough annotation of cis regulatory variants in mice could aid in further characterizing these mechanisms. Here we provide a detailed portrait of mouse gene expression across multiple tissues in a three-way diallel. Greater than 80% of mouse genes have cis regulatory variation. Effects from these variants influence complex traits and usually extend to the human ortholog. Further, we estimate that at least one in every thousand SNPs creates a cis regulatory effect. We also observe two types of parent-of-origin effects, including classical imprinting and a new global allelic imbalance in expression favoring the paternal allele. We conclude that, as with humans, pervasive regulatory variation influences complex genetic traits in mice and provide a new resource toward understanding the genetic control of transcription in mammals.


Subjects
Alleles; Allelic Imbalance/genetics; Crosses, Genetic; Gene Expression; Genetic Speciation; Mice/genetics; Animals; Dosage Compensation, Genetic; Female; Humans; Male; Mice, Knockout; Phylogeny; Polymorphism, Single Nucleotide
16.
PLoS Genet ; 11(2): e1004850, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25679959

ABSTRACT

Significant departures from expected Mendelian inheritance ratios (transmission ratio distortion, TRD) are frequently observed in both experimental crosses and natural populations. TRD on mouse Chromosome (Chr) 2 has been reported in multiple experimental crosses, including the Collaborative Cross (CC). Among the eight CC founder inbred strains, we found that Chr 2 TRD was exclusive to females that were heterozygous for the WSB/EiJ allele within a 9.3 Mb region (Chr 2 76.9 - 86.2 Mb). A copy number gain of a 127 kb-long DNA segment (designated as responder to drive, R2d) emerged as the strongest candidate for the causative allele. We mapped R2d sequences to two loci within the candidate interval. R2d1 is located near the proximal boundary, and contains a single copy of R2d in all strains tested. R2d2 maps to a 900 kb interval, and the number of R2d copies varies from zero in classical strains (including the mouse reference genome) to more than 30 in wild-derived strains. Using real-time PCR assays for the copy number, we identified a mutation (R2d2WSBdel1) that eliminates the majority of the R2d2WSB copies without apparent alterations of the surrounding WSB/EiJ haplotype. In a three-generation pedigree segregating for R2d2WSBdel1, the mutation is transmitted to the progeny and Mendelian segregation is restored in females heterozygous for R2d2WSBdel1, thus providing direct evidence that the copy number gain is causal for maternal TRD. We found that transmission ratios in R2d2WSB heterozygous females vary between Mendelian segregation and complete distortion depending on the genetic background, and that TRD is under genetic control of unlinked distorter loci. Although the R2d2WSB transmission ratio was inversely correlated with average litter size, several independent lines of evidence support the contention that female meiotic drive is the cause of the distortion. We discuss the implications and potential applications of this novel meiotic drive system.
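A departure from the Mendelian 1:1 expectation in the offspring of a heterozygous parent can be checked with an exact binomial test. The sketch below is a generic two-sided test against a transmission probability of 0.5; it is offered as an illustration of what "transmission ratio distortion" means statistically, not as the analysis used in the paper, and the function name is an assumption.

```python
from math import comb


def trd_p_value(transmitted: int, total: int, expected: float = 0.5) -> float:
    """Two-sided exact binomial p-value for a transmission count.

    transmitted: offspring that inherited the allele of interest.
    total: all genotyped offspring of the heterozygous parent.
    expected: Mendelian transmission probability (0.5 for a heterozygote).
    """
    def pmf(k: int) -> float:
        return comb(total, k) * expected**k * (1.0 - expected) ** (total - k)

    observed = pmf(transmitted)
    # Sum every outcome that is at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    return min(1.0, sum(pmf(k) for k in range(total + 1)
                        if pmf(k) <= observed * (1.0 + 1e-9)))
```

For example, `trd_p_value(5, 10)` is consistent with Mendelian segregation, whereas `trd_p_value(10, 10)` corresponds to complete distortion in a small family.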


Subjects
DNA Copy Number Variations/genetics; Genomics; Inheritance Patterns/genetics; Meiosis/genetics; Alleles; Animals; Chromosomes/genetics; Crosses, Genetic; Female; Genotyping Techniques; Haplotypes/genetics; Male; Mice; Mutation
17.
Bioinformatics ; 30(24): 3524-31, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25172922

ABSTRACT

MOTIVATION: The throughput of genomic sequencing has increased to the point that it is overrunning the rate of downstream analysis. This, along with the desire to revisit old data, has led to a situation where large quantities of raw, and nearly impenetrable, sequence data are rapidly filling the hard drives of modern biology labs. These datasets can be compressed via a multi-string variant of the Burrows-Wheeler Transform (BWT), which provides the side benefit of searches for arbitrary k-mers within the raw data as well as the ability to reconstitute arbitrary reads as needed. We propose a method for merging such datasets for both increased compression and downstream analysis. RESULTS: We present a novel algorithm that merges multi-string BWTs in [Formula: see text] time where LCS is the length of the longest common substring between any of the inputs, and N is the total length of all inputs combined (number of symbols) using [Formula: see text] bits where F is the number of multi-string BWTs merged. This merged multi-string BWT is also shown to have a higher compressibility compared with the input multi-string BWTs separately. Additionally, we explore some uses of a merged multi-string BWT for bioinformatics applications.
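The k-mer search that a multi-string BWT makes possible can be shown on a toy scale. The sketch below builds the transform naively by sorting every rotation of every read (each read terminated with `$`) and counts k-mer occurrences with FM-index style backward search. It is a teaching example under simplifying assumptions: it is not the merge algorithm the abstract describes, it does not scale to real data, and the function names are illustrative.

```python
def multi_string_bwt(reads):
    """Naive multi-string BWT: last column of all sorted rotations of read + '$'."""
    rotations = []
    for read in reads:
        text = read + "$"  # '$' sorts before A, C, G, T
        rotations.extend(text[i:] + text[:i] for i in range(len(text)))
    rotations.sort()
    return "".join(rotation[-1] for rotation in rotations)


def count_kmer(bwt, kmer):
    """Count occurrences of kmer across all reads by backward search."""
    # first_row[c]: index of the first sorted rotation that starts with c.
    first_row, total = {}, 0
    for symbol in sorted(set(bwt)):
        first_row[symbol] = total
        total += bwt.count(symbol)

    lo, hi = 0, len(bwt)  # current range of rotations prefixed by the suffix seen so far
    for symbol in reversed(kmer):
        if symbol not in first_row:
            return 0
        # Rank queries done by plain counting; a real FM-index precomputes them.
        lo = first_row[symbol] + bwt[:lo].count(symbol)
        hi = first_row[symbol] + bwt[:hi].count(symbol)
        if lo >= hi:
            return 0
    return hi - lo
```

Because each read carries its own terminator and the query contains no `$`, matches never span two reads, so the returned number equals the count of k-mer occurrences within individual reads.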


Subjects
Algorithms; High-Throughput Nucleotide Sequencing/methods; Animals; Data Compression; Genomics/methods; Mice; Sequence Alignment
18.
Article in English | MEDLINE | ID: mdl-24948510

ABSTRACT

Mapping reads to a reference sequence is a common step when analyzing allele effects in high-throughput sequencing data. The choice of reference is critical because its effect on quantitative sequence analysis is non-negligible. Recent studies suggest aligning to a single standard reference sequence, as is common practice, can lead to an underlying bias depending on the genetic distances of the target sequences from the reference. To avoid this bias, researchers have resorted to using modified reference sequences. Even with this improvement, various limitations and problems remain unsolved, which include reduced mapping ratios, shifts in read mappings and the selection of which variants to include to remove biases. To address these issues, we propose a novel and generic multi-alignment pipeline. Our pipeline integrates the genomic variations from known or suspected founders into separate reference sequences and performs alignments to each one. By mapping reads to multiple reference sequences and merging them afterward, we are able to rescue more reads and diminish the bias caused by using a single common reference. Moreover, the genomic origin of each read is determined and annotated during the merging process, providing a better source of information to assess differential expression than simple allele queries at known variant positions. Using RNA-seq of a diallel cross, we compare our pipeline with the single-reference pipeline and demonstrate our advantages of more aligned reads and a higher percentage of reads with assigned origins. Database URL: http://csbio.unc.edu/CCstatus/index.py?run=Pseudo.
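The merge step of a multi-reference pipeline can be sketched as follows. The input layout (one dictionary of mismatch counts per reference), the rule of choosing the reference with the fewest mismatches, and the "ambiguous" label for ties are all assumptions made for illustration; the abstract states only that reads are aligned to several founder-specific references, merged, and annotated with their genomic origin.

```python
def merge_alignments(per_reference):
    """Pick, for every read, the reference on which it aligned best.

    per_reference: {reference_name: {read_id: mismatch_count}} (hypothetical layout).
    Returns {read_id: (origin, mismatch_count)}; origin is "ambiguous" on ties.
    """
    read_ids = set().union(*(hits.keys() for hits in per_reference.values()))
    merged = {}
    for read_id in sorted(read_ids):
        # Collect this read's score on every reference where it aligned at all.
        scores = {ref: hits[read_id]
                  for ref, hits in per_reference.items() if read_id in hits}
        best = min(scores.values())
        winners = [ref for ref, score in scores.items() if score == best]
        origin = winners[0] if len(winners) == 1 else "ambiguous"
        merged[read_id] = (origin, best)
    return merged


# Hypothetical alignments of four reads to two parental pseudo-references.
EXAMPLE = {
    "maternal": {"r1": 0, "r2": 2, "r3": 1},
    "paternal": {"r1": 0, "r2": 0, "r4": 3},
}
```

Reads such as `r3` and `r4`, which align to only one reference, are retained in the merged set, which mirrors the abstract's point about rescuing reads that a single common reference would lose.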


Subjects
Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Alleles , Animals , Base Sequence , Chromosomes, Mammalian/genetics , Crosses, Genetic , Female , Genome/genetics , Hybridization, Genetic , Male , Mice , Molecular Sequence Data , Pseudogenes/genetics
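The merging step described in the abstract above (align each read to several founder-specific references, then merge and annotate the read's origin) might look roughly like the following sketch. Everything in it is an assumption made for illustration: the input layout, the use of mismatch count as the only quality measure, positions already translated to a shared coordinate system, and the hypothetical function name `merge_alignments`. It is not the pipeline's actual implementation.

```python
def merge_alignments(per_reference):
    """Merge per-reference alignments into one record per read.

    per_reference: {reference_name: {read_id: (chrom, pos, mismatches)}}
    (hypothetical layout; positions assumed to share one coordinate system).
    For each read, keep the alignment with the fewest mismatches and record
    every reference that achieves that best score as a possible origin.
    """
    read_ids = set()
    for alignments in per_reference.values():
        read_ids.update(alignments)

    merged = {}
    for read_id in sorted(read_ids):
        # Collect this read's alignment from every reference that mapped it.
        hits = {
            ref: alignments[read_id]
            for ref, alignments in per_reference.items()
            if read_id in alignments
        }
        best = min(aln[2] for aln in hits.values())
        origins = tuple(sorted(ref for ref, aln in hits.items() if aln[2] == best))
        chrom, pos, _ = hits[origins[0]]
        merged[read_id] = {
            "chrom": chrom,
            "pos": pos,
            "mismatches": best,
            "origins": origins,  # one entry = assigned origin; several = tie
        }
    return merged


# Hypothetical example: r1 matches founderA better, r2 is a tie,
# and r3 maps only to founderB (a read a single reference would lose).
example = {
    "founderA": {"r1": ("chr1", 100, 0), "r2": ("chr1", 200, 2)},
    "founderB": {"r1": ("chr1", 100, 1), "r2": ("chr1", 200, 2), "r3": ("chr2", 50, 0)},
}
result = merge_alignments(example)
```

Keeping ties as a tuple of candidate origins, rather than picking one arbitrarily, leaves the ambiguous reads visible to any downstream allele-specific analysis.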
19.
PLoS Genet ; 9(10): e1003853, 2013.
Article in English | MEDLINE | ID: mdl-24098153

ABSTRACT

X chromosome inactivation (XCI) is the mammalian mechanism of dosage compensation that balances X-linked gene expression between the sexes. Early during female development, each cell of the embryo proper independently inactivates one of its two parental X chromosomes. In mice, the choice of which X chromosome is inactivated is affected by the genotype of a cis-acting locus, the X-chromosome controlling element (Xce). Xce has been localized to a 1.9 Mb interval within the X-inactivation center (Xic), yet its molecular identity and mechanism of action remain unknown. We combined genotype and sequence data for mouse stocks with detailed phenotyping of ten inbred strains and with the development of a statistical model that incorporates phenotyping data from multiple sources to disentangle sources of XCI phenotypic variance in natural female populations. We have reduced the Xce candidate interval 10-fold to a 176 kb region located approximately 500 kb proximal to Xist. We propose that structural variation in this interval explains the presence of multiple functional Xce alleles in the genus Mus. We have identified a new allele, Xce(e), present in Mus musculus, and a possible sixth functional allele in Mus spicilegus. We have also confirmed a parent-of-origin effect on X inactivation choice and provide evidence that maternal inheritance magnifies the skewing associated with strong Xce alleles. Based on the phylogenetic analysis of 155 laboratory strains and wild mice, we conclude that Xce(a) is either a derived allele that arose concurrently with the domestication of fancy mice but prior to the derivation of most classical inbred strains, or a rare allele in the wild. Furthermore, we have found that, despite the presence of multiple haplotypes in the wild, Mus musculus domesticus has only one functional Xce allele, Xce(b). Lastly, we conclude that each mouse taxon examined has a different functional Xce allele.


Subjects
Dosage Compensation, Genetic , Genes, X-Linked , RNA, Long Noncoding/genetics , X Chromosome Inactivation/genetics , Alleles , Animals , Chromosome Mapping , Female , Genetic Loci , Haplotypes , Mice , Phylogeny
20.
Bioinformatics ; 29(13): i291-9, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812996

ABSTRACT

MOTIVATION: RNA-seq techniques provide an unparalleled means for exploring a transcriptome with deep coverage and base pair level resolution. Various analysis tools have been developed to align and assemble RNA-seq data, such as the widely used TopHat/Cufflinks pipeline. A common observation is that a sizable fraction of the fragments/reads align to multiple locations of the genome. These multiple alignments pose substantial challenges to existing RNA-seq analysis tools. Inappropriate treatment may result in reporting spurious expressed genes (false positives) and missing the real expressed genes (false negatives). Such errors impact the subsequent analysis, such as differential expression analysis. In our study, we observe that ~3.5% of transcripts reported by the TopHat/Cufflinks pipeline correspond to annotated nonfunctional pseudogenes. Moreover, ~10.0% of reported transcripts are not annotated in the Ensembl database. These genes could be either novel expressed genes or false discoveries. RESULTS: We examine the underlying genomic features that lead to multiple alignments and investigate how they generate systematic errors in RNA-seq analysis. We develop a general tool, GeneScissors, which exploits machine learning techniques guided by biological knowledge to detect and correct spurious transcriptome inference by existing RNA-seq analysis methods. In our simulated study, GeneScissors can predict spurious transcriptome calls owing to misalignment with an accuracy close to 90%. It provides substantial improvement over the widely used TopHat/Cufflinks or MapSplice/Cufflinks pipelines in both precision and F-measure. On real data, GeneScissors reports 53.6% fewer pseudogenes and 0.97% more expressed and annotated transcripts when compared with the TopHat/Cufflinks pipeline. In addition, among the 10.0% unannotated transcripts reported by TopHat/Cufflinks, GeneScissors finds that >16.3% of them are false positives.
AVAILABILITY: The software can be downloaded at http://csbio.unc.edu/genescissors/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling/methods , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Animals , Artificial Intelligence , Genomics , Mice , Pseudogenes
...