1.
BMC Bioinformatics ; 19(1): 50, 2018 02 09.
Article in English | MEDLINE | ID: mdl-29426289

ABSTRACT

BACKGROUND: Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging "hybrid" assemblies that use long reads for scaffolding and short reads for accuracy. RESULTS: We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. CONCLUSION: Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency relative to existing methods will help make more economical use of emerging long read sequencing technologies.


Subjects
Algorithms; Databases, Genetic; Genome, Fungal; Saccharomyces cerevisiae/genetics; Sequence Analysis, DNA
2.
PLoS Genet ; 11(10): e1005504, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26452100

ABSTRACT

New systems genetics approaches are needed to rapidly identify host genes and genetic networks that regulate complex disease outcomes. Using genetically diverse animals from incipient lines of the Collaborative Cross mouse panel, we demonstrate a greatly expanded range of phenotypes relative to classical mouse models of SARS-CoV infection including lung pathology, weight loss and viral titer. Genetic mapping revealed several loci contributing to differential disease responses, including an 8.5Mb locus associated with vascular cuffing on chromosome 3 that contained 23 genes and 13 noncoding RNAs. Integrating phenotypic and genetic data narrowed this region to a single gene, Trim55, an E3 ubiquitin ligase with a role in muscle fiber maintenance. Lung pathology and transcriptomic data from mice genetically deficient in Trim55 were used to validate its role in SARS-CoV-induced vascular cuffing and inflammation. These data establish the Collaborative Cross platform as a powerful genetic resource for uncovering genetic contributions of complex traits in microbial disease severity, inflammation and virus replication in models of outbred populations.


Subjects
Host-Pathogen Interactions; Inflammation/genetics; Severe Acute Respiratory Syndrome/genetics; Severe acute respiratory syndrome-related coronavirus/genetics; Animals; Disease Models, Animal; Disease Susceptibility; Humans; Inflammation/pathology; Inflammation/virology; Mice; Phenotype; Severe acute respiratory syndrome-related coronavirus/pathogenicity; Severe Acute Respiratory Syndrome/pathology; Severe Acute Respiratory Syndrome/virology; Virus Replication/genetics
3.
PLoS Genet ; 11(2): e1004850, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25679959

ABSTRACT

Significant departures from expected Mendelian inheritance ratios (transmission ratio distortion, TRD) are frequently observed in both experimental crosses and natural populations. TRD on mouse Chromosome (Chr) 2 has been reported in multiple experimental crosses, including the Collaborative Cross (CC). Among the eight CC founder inbred strains, we found that Chr 2 TRD was exclusive to females that were heterozygous for the WSB/EiJ allele within a 9.3 Mb region (Chr 2 76.9 - 86.2 Mb). A copy number gain of a 127 kb-long DNA segment (designated as responder to drive, R2d) emerged as the strongest candidate for the causative allele. We mapped R2d sequences to two loci within the candidate interval. R2d1 is located near the proximal boundary, and contains a single copy of R2d in all strains tested. R2d2 maps to a 900 kb interval, and the number of R2d copies varies from zero in classical strains (including the mouse reference genome) to more than 30 in wild-derived strains. Using real-time PCR assays for the copy number, we identified a mutation (R2d2WSBdel1) that eliminates the majority of the R2d2WSB copies without apparent alterations of the surrounding WSB/EiJ haplotype. In a three-generation pedigree segregating for R2d2WSBdel1, the mutation is transmitted to the progeny and Mendelian segregation is restored in females heterozygous for R2d2WSBdel1, thus providing direct evidence that the copy number gain is causal for maternal TRD. We found that transmission ratios in R2d2WSB heterozygous females vary between Mendelian segregation and complete distortion depending on the genetic background, and that TRD is under genetic control of unlinked distorter loci. Although the R2d2WSB transmission ratio was inversely correlated with average litter size, several independent lines of evidence support the contention that female meiotic drive is the cause of the distortion. We discuss the implications and potential applications of this novel meiotic drive system.


Subjects
DNA Copy Number Variations/genetics; Genomics; Inheritance Patterns/genetics; Meiosis/genetics; Alleles; Animals; Chromosomes/genetics; Crosses, Genetic; Female; Genotyping Techniques; Haplotypes/genetics; Male; Mice; Mutation
4.
Mol Biol Evol ; 33(6): 1381-95, 2016 06.
Article in English | MEDLINE | ID: mdl-26882987

ABSTRACT

A selective sweep is the result of strong positive selection driving newly occurring or standing genetic variants to fixation, and can dramatically alter the pattern and distribution of allelic diversity in a population. Population-level sequencing data have enabled discoveries of selective sweeps associated with genes involved in recent adaptations in many species. In contrast, much debate but little evidence addresses whether "selfish" genes are capable of fixation (thereby leaving signatures identical to classical selective sweeps) despite being neutral or deleterious to organismal fitness. We previously described R2d2, a large copy-number variant that causes nonrandom segregation of mouse Chromosome 2 in females due to meiotic drive. Here we show population-genetic data consistent with a selfish sweep driven by alleles of R2d2 with high copy number (R2d2(HC)) in natural populations. We replicate this finding in multiple closed breeding populations from six outbred backgrounds segregating for R2d2 alleles. We find that R2d2(HC) rapidly increases in frequency, and in most cases becomes fixed in significantly fewer generations than can be explained by genetic drift. R2d2(HC) is also associated with significantly reduced litter sizes in heterozygous mothers, making it a true selfish allele. Our data provide direct evidence of populations actively undergoing selfish sweeps, and demonstrate that meiotic drive can rapidly alter the genomic landscape in favor of mutations with neutral or even negative effects on overall Darwinian fitness. Further study will reveal the incidence of selfish sweeps, and will elucidate the relative contributions of selfish genes, adaptation and genetic drift to evolution.


Subjects
Nuclear Proteins/genetics; RNA-Binding Proteins/genetics; Repetitive Sequences, Nucleic Acid; Adaptation, Physiological/genetics; Alleles; Animals; Biological Evolution; DNA Copy Number Variations/genetics; Evolution, Molecular; Female; Genetic Variation; Genetics, Population; Male; Mice; Models, Genetic; Mutation; Selection, Genetic
5.
Biol Reprod ; 97(5): 698-708, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036474

ABSTRACT

The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human sperm motility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility.


Subjects
Image Processing, Computer-Assisted/methods; Sperm Motility/physiology; Spermatozoa/physiology; Support Vector Machine; Humans; Male; Semen Analysis; Spermatozoa/classification
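The abstract above describes applying four hyperplane equations in sequence to assign each motile sperm to one of five classes. A minimal sketch of that idea follows. Everything numeric here is an assumption made for illustration: the two features (curvilinear velocity `vcl` and straight-line velocity `vsl`), the tree layout, and all coefficients are hypothetical placeholders, not the published CASAnova equations.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Hyperplane:
    """A linear decision boundary w.x + b = 0 (coefficients are hypothetical)."""
    weights: Sequence[float]
    bias: float

    def positive(self, x: Sequence[float]) -> bool:
        # True when the point lies on the positive side of the hyperplane.
        return sum(w * v for w, v in zip(self.weights, x)) + self.bias > 0


# Hypothetical boundaries over x = (vcl, vsl); NOT the published values.
H_VIGOROUS = Hyperplane((1.0, 0.0), -100.0)      # vcl > 100
H_PROGRESSIVE = Hyperplane((-0.6, 1.0), 0.0)     # vsl > 0.6 * vcl
H_HYPERACTIVATED = Hyperplane((0.3, -1.0), 0.0)  # vsl < 0.3 * vcl
H_SLOW = Hyperplane((1.0, 0.0), -40.0)           # vcl > 40


def classify(track: Sequence[float]) -> str:
    """Apply the four hyperplanes in sequence and return exactly one class."""
    if H_VIGOROUS.positive(track):
        if H_PROGRESSIVE.positive(track):
            return "progressive"
        if H_HYPERACTIVATED.positive(track):
            return "hyperactivated"
        return "intermediate"
    return "slow" if H_SLOW.positive(track) else "weakly motile"


if __name__ == "__main__":
    for track in [(150, 120), (150, 70), (150, 30), (60, 30), (20, 5)]:
        print(track, classify(track))
```

Because each test either settles the class or hands the track to the next test, four boundaries are enough to separate five classes, and every track receives a single label.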
6.
PLoS Genet ; 9(10): e1003853, 2013.
Article in English | MEDLINE | ID: mdl-24098153

ABSTRACT

X chromosome inactivation (XCI) is the mammalian mechanism of dosage compensation that balances X-linked gene expression between the sexes. Early during female development, each cell of the embryo proper independently inactivates one of its two parental X-chromosomes. In mice, the choice of which X chromosome is inactivated is affected by the genotype of a cis-acting locus, the X-chromosome controlling element (Xce). Xce has been localized to a 1.9 Mb interval within the X-inactivation center (Xic), yet its molecular identity and mechanism of action remain unknown. We combined genotype and sequence data for mouse stocks with detailed phenotyping of ten inbred strains and with the development of a statistical model that incorporates phenotyping data from multiple sources to disentangle sources of XCI phenotypic variance in natural female populations on X inactivation. We have reduced the Xce candidate interval 10-fold to a 176 kb region located approximately 500 kb proximal to Xist. We propose that structural variation in this interval explains the presence of multiple functional Xce alleles in the genus Mus. We have identified a new allele, Xce(e), present in Mus musculus and a possible sixth functional allele in Mus spicilegus. We have also confirmed a parent-of-origin effect on X inactivation choice and provide evidence that maternal inheritance magnifies the skewing associated with strong Xce alleles. Based on the phylogenetic analysis of 155 laboratory strains and wild mice, we conclude that Xce(a) is either a derived allele that arose concurrently with the domestication of fancy mice but prior to the derivation of most classical inbred strains, or a rare allele in the wild. Furthermore, we have found that despite the presence of multiple haplotypes in the wild, Mus musculus domesticus has only one functional Xce allele, Xce(b). Lastly, we conclude that each mouse taxon examined has a different functional Xce allele.


Subjects
Dosage Compensation, Genetic; Genes, X-Linked; RNA, Long Noncoding/genetics; X Chromosome Inactivation/genetics; Alleles; Animals; Chromosome Mapping; Female; Genetic Loci; Haplotypes; Mice; Phylogeny
7.
Bioinformatics ; 30(24): 3524-31, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25172922

ABSTRACT

MOTIVATION: The throughput of genomic sequencing has increased to the point that it is overrunning the rate of downstream analysis. This, along with the desire to revisit old data, has led to a situation where large quantities of raw, and nearly impenetrable, sequence data are rapidly filling the hard drives of modern biology labs. These datasets can be compressed via a multi-string variant of the Burrows-Wheeler Transform (BWT), which provides the side benefit of searches for arbitrary k-mers within the raw data as well as the ability to reconstitute arbitrary reads as needed. We propose a method for merging such datasets for both increased compression and downstream analysis. RESULTS: We present a novel algorithm that merges multi-string BWTs in [Formula: see text] time, where LCS is the length of the longest common substring between any of the inputs and N is the total length of all inputs combined (number of symbols), using [Formula: see text] bits, where F is the number of multi-string BWTs merged. This merged multi-string BWT is also shown to have a higher compressibility compared with the input multi-string BWTs separately. Additionally, we explore some uses of a merged multi-string BWT for bioinformatics applications.


Subjects
Algorithms; High-Throughput Nucleotide Sequencing/methods; Animals; Data Compression; Genomics/methods; Mice; Sequence Alignment
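The abstract above notes that a multi-string BWT supports searches for arbitrary k-mers within raw reads. The following is a minimal sketch of that property, not the merge algorithm the paper presents. It builds the transform naively by sorting every rotation of every read (assumed practical only for toy inputs) and counts k-mer occurrences with standard FM-index backward search. The function names are illustrative and not taken from any published tool.

```python
from collections import Counter


def build_multistring_bwt(reads):
    """Naive multi-string BWT: terminate each read with '$', sort all
    rotations of all reads together, and keep the last column."""
    rotations = []
    for read in reads:
        s = read + "$"
        rotations.extend(s[i:] + s[:i] for i in range(len(s)))
    rotations.sort()
    return "".join(rot[-1] for rot in rotations)


def build_fm_index(bwt):
    """Return (C, occ): C[c] counts symbols smaller than c in the BWT;
    occ[c][i] counts occurrences of c in bwt[:i]."""
    counts = Counter(bwt)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ


def count_kmer(kmer, bwt, C, occ):
    """Backward search: number of times kmer occurs across all reads."""
    lo, hi = 0, len(bwt)
    for c in reversed(kmer):
        if c not in C:
            return 0
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    reads = ["ACGT", "CGTA", "ACGA"]
    bwt = build_multistring_bwt(reads)
    C, occ = build_fm_index(bwt)
    for kmer in ["CG", "ACG", "GTA", "TT"]:
        print(kmer, count_kmer(kmer, bwt, C, occ))
```

The occurrence table here is stored densely for clarity. Practical implementations sample it and compress the BWT, which is where the space savings mentioned in the abstract come from.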
8.
PLoS Pathog ; 9(2): e1003196, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23468633

ABSTRACT

Genetic variation contributes to host responses and outcomes following infection by influenza A virus or other viral infections. Yet narrow windows of disease symptoms and confounding environmental factors have made it difficult to identify polymorphic genes that contribute to differential disease outcomes in human populations. Therefore, to control for these confounding environmental variables in a system that models the levels of genetic diversity found in outbred populations such as humans, we used incipient lines of the highly genetically diverse Collaborative Cross (CC) recombinant inbred (RI) panel (the pre-CC population) to study how genetic variation impacts influenza associated disease across a genetically diverse population. A wide range of variation in influenza disease related phenotypes including virus replication, virus-induced inflammation, and weight loss was observed. Many of the disease associated phenotypes were correlated, with viral replication and virus-induced inflammation being predictors of virus-induced weight loss. Despite these correlations, pre-CC mice with unique and novel disease phenotype combinations were observed. We also identified sets of transcripts (modules) that were correlated with aspects of disease. In order to identify how host genetic polymorphisms contribute to the observed variation in disease, we conducted quantitative trait loci (QTL) mapping. We identified several QTL contributing to specific aspects of the host response including virus-induced weight loss, titer, pulmonary edema, neutrophil recruitment to the airways, and transcriptional expression. Existing whole-genome sequence data was applied to identify high priority candidate genes within QTL regions. A key host response QTL was located at the site of the known anti-influenza Mx1 gene. We sequenced the coding regions of Mx1 in the eight CC founder strains, and identified a novel Mx1 allele that showed reduced ability to inhibit viral replication, while maintaining protection from weight loss.


Subjects
Genetic Variation; Host-Pathogen Interactions/genetics; Influenza, Human/virology; Models, Genetic; Orthomyxoviridae Infections/virology; Rodent Diseases/virology; Animals; Crosses, Genetic; Female; Humans; Influenza A virus; Influenza, Human/genetics; Influenza, Human/pathology; Lung/pathology; Mice; Mice, Inbred Strains; Orthomyxoviridae Infections/genetics; Orthomyxoviridae Infections/pathology; Phenotype; Reassortant Viruses/genetics; Reassortant Viruses/pathogenicity; Recombination, Genetic; Rodent Diseases/genetics; Rodent Diseases/pathology; Species Specificity; Virus Replication
9.
Genome Res ; 21(8): 1213-22, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21406540

ABSTRACT

The Collaborative Cross (CC) is a mouse recombinant inbred strain panel that is being developed as a resource for mammalian systems genetics. Here we describe an experiment that uses partially inbred CC lines to evaluate the genetic properties and utility of this emerging resource. Genome-wide analysis of the incipient strains reveals high genetic diversity, balanced allele frequencies, and dense, evenly distributed recombination sites, all ideal qualities for a systems genetics resource. We map discrete, complex, and biomolecular traits and contrast two quantitative trait locus (QTL) mapping approaches. Analysis based on inferred haplotypes improves power, reduces false discovery, and provides information to identify and prioritize candidate genes that is unique to multifounder crosses like the CC. The number of expression QTLs discovered here exceeds all previous efforts at eQTL mapping in mice, and we map local eQTL at 1-Mb resolution. We demonstrate that the genetic diversity of the CC, which derives from random mixing of eight founder strains, results in high phenotypic diversity and enhances our ability to map causative loci underlying complex disease-related traits.


Subjects
Genome; Quantitative Trait Loci; Animals; Crosses, Genetic; Female; Gene Expression; Genetic Association Studies; Haplotypes; Male; Mice; Phenotype
10.
Bioinformatics ; 29(13): i291-9, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812996

ABSTRACT

MOTIVATION: RNA-seq techniques provide an unparalleled means for exploring a transcriptome with deep coverage and base pair level resolution. Various analysis tools have been developed to align and assemble RNA-seq data, such as the widely used TopHat/Cufflinks pipeline. A common observation is that a sizable fraction of the fragments/reads align to multiple locations of the genome. These multiple alignments pose substantial challenges to existing RNA-seq analysis tools. Inappropriate treatment may result in reporting spurious expressed genes (false positives) and missing the real expressed genes (false negatives). Such errors impact the subsequent analysis, such as differential expression analysis. In our study, we observe that ~3.5% of transcripts reported by the TopHat/Cufflinks pipeline correspond to annotated nonfunctional pseudogenes. Moreover, ~10.0% of reported transcripts are not annotated in the Ensembl database. These genes could be either novel expressed genes or false discoveries. RESULTS: We examine the underlying genomic features that lead to multiple alignments and investigate how they generate systematic errors in RNA-seq analysis. We develop a general tool, GeneScissors, which exploits machine learning techniques guided by biological knowledge to detect and correct spurious transcriptome inference by existing RNA-seq analysis methods. In our simulated study, GeneScissors can predict spurious transcriptome calls owing to misalignment with an accuracy close to 90%. It provides substantial improvement over the widely used TopHat/Cufflinks or MapSplice/Cufflinks pipelines in both precision and F-measurement. On real data, GeneScissors reports 53.6% fewer pseudogenes and 0.97% more expressed and annotated transcripts, when compared with the TopHat/Cufflinks pipeline. In addition, among the 10.0% unannotated transcripts reported by TopHat/Cufflinks, GeneScissors finds that >16.3% of them are false positives. AVAILABILITY: The software can be downloaded at http://csbio.unc.edu/genescissors/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling/methods; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Software; Animals; Artificial Intelligence; Genomics; Mice; Pseudogenes
11.
BMC Bioinformatics ; 13 Suppl 3: S13, 2012 Mar 21.
Article in English | MEDLINE | ID: mdl-22536897

ABSTRACT

BACKGROUND: Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. RESULTS: We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. CONCLUSIONS: Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains.


Subjects
Genome; Internet; Mice/genetics; Software; Animals; Cluster Analysis; Mice/classification; Mice, Inbred Strains; Phylogeny; Polymorphism, Single Nucleotide
12.
BMC Genomics ; 13: 34, 2012 Jan 19.
Article in English | MEDLINE | ID: mdl-22260749

ABSTRACT

BACKGROUND: High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. RESULTS: We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. CONCLUSION: The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses.


Subjects
Genome-Wide Association Study; Nucleic Acid Hybridization; Oligonucleotide Probes/chemistry; Algorithms; Animals; Cattle; Cluster Analysis; Dogs; Genotype; Haplotypes; Humans; Mice; Polymorphism, Single Nucleotide; Software
13.
Mamm Genome ; 23(9-10): 706-12, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22847377

ABSTRACT

The Collaborative Cross (CC) is a panel of recombinant inbred lines derived from eight genetically diverse laboratory inbred strains. Recently, the genetic architecture of the CC population was reported based on the genotype of a single male per line, and other publications reported incompletely inbred CC mice that have been used to map a variety of traits. The three breeding sites, in the US, Israel, and Australia, are actively collaborating to accelerate the inbreeding process through marker-assisted inbreeding and to expedite community access of CC lines deemed to have reached defined thresholds of inbreeding. Plans are now being developed to provide access to this novel genetic reference population through distribution centers. Here we provide a description of the distribution efforts by the University of North Carolina Systems Genetics Core, Tel Aviv University, Israel and the University of Western Australia.


Subjects
Cooperative Behavior; Mice, Inbred Strains/genetics; Animals; Genome; Internet; Male; Mice
14.
Bioinformatics ; 26(12): i199-207, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20529906

ABSTRACT

MOTIVATION: High-density SNP data of model animal resources provide opportunities for fine-resolution genetic variation studies. These genetic resources are generated through a variety of breeding schemes that involve multiple generations of matings derived from a set of founder animals. In this article, we investigate the problem of inferring the most probable ancestry of resulting genotypes, given a set of founder genotypes. Due to computational difficulty, existing methods either handle only small pedigree data or disregard the pedigree structure. However, large pedigrees of model animal resources often contain repetitive substructures that can be utilized in accelerating computation. RESULTS: We present an accurate and efficient method that can accept complex pedigrees with inbreeding in inferring genome ancestry. Inbreeding is a commonly used process in generating genetically diverse and reproducible animals. It is often carried out for many generations and can account for most of the computational complexity in real-world model animal pedigrees. Our method builds a hidden Markov model that derives the ancestry probabilities through the inbreeding process without explicit modeling in every generation. The ancestry inference is accurate and fast, independent of the number of generations, for model animal resources such as the Collaborative Cross (CC). Experiments on both simulated and real CC data demonstrate that our method offers comparable accuracy to those methods that build an explicit model of the entire pedigree, but much better scalability with respect to the pedigree size.


Subjects
Genome; Genomics/methods; Inbreeding; Pedigree; Animals; Genetic Variation; Genotype; Polymorphism, Single Nucleotide
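The abstract above frames ancestry inference as finding the most probable founder origin along the genome with a hidden Markov model. The sketch below illustrates only that general framing with a generic Viterbi decoder over haploid founder sequences. It is not the paper's method: it ignores the pedigree and inbreeding structure entirely, and the `switch` and `error` parameters are assumed constants chosen for illustration.

```python
import math


def viterbi_ancestry(observed, founders, switch=0.05, error=0.01):
    """Most probable founder label at each marker.

    observed: string of alleles, one character per marker.
    founders: dict mapping founder name -> allele string of equal length
              (at least two founders).
    switch:   assumed per-marker probability of changing founder.
    error:    assumed per-marker genotyping error probability.
    """
    names = list(founders)
    k = len(names)
    log_stay = math.log(1.0 - switch)
    log_switch = math.log(switch / (k - 1))

    def log_emit(name, i):
        match = founders[name][i] == observed[i]
        return math.log(1.0 - error) if match else math.log(error)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over founders at the first marker.
    score = {n: math.log(1.0 / k) + log_emit(n, 0) for n in names}
    back = []
    for i in range(1, len(observed)):
        new_score, pointers = {}, {}
        for cur in names:
            best = max(names, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = score[best] + log_trans(best, cur) + log_emit(cur, i)
            pointers[cur] = best
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(names, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    founders = {"A": "0000000", "B": "1111111", "C": "0101010"}
    print("".join(viterbi_ancestry("0001111", founders)))
```

This decoder costs time proportional to markers times founders squared. The scalability claim in the abstract concerns a different axis, the number of generations in the pedigree, which this sketch does not model.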
15.
Genetics ; 214(3): 691-702, 2020 03.
Article in English | MEDLINE | ID: mdl-31879319

ABSTRACT

The azoxymethane model of colorectal cancer (CRC) was used to gain insights into the genetic heterogeneity of nonfamilial CRC. We observed significant differences in susceptibility parameters across 40 mouse inbred strains, with 6 new and 18 of 24 previously identified mouse CRC modifier alleles detected using genome-wide association analysis. Tumor incidence varied in F1 as well as intercrosses and backcrosses between resistant and susceptible strains. Analysis of inheritance patterns indicates that resistance to CRC development is inherited as a dominant characteristic genome-wide, and that susceptibility appears to occur in individuals lacking a large-effect, or sufficient numbers of small-effect, polygenic resistance alleles. Our results suggest a new polygenic model for inheritance of nonfamilial CRC, and that genetic studies in humans aimed at identifying individuals with elevated susceptibility should be pursued through the lens of absence of dominant resistance alleles rather than for the presence of susceptibility alleles.


Subjects
Colorectal Neoplasms/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Multifactorial Inheritance/genetics; Alleles; Animals; Azoxymethane/toxicity; Colorectal Neoplasms/chemically induced; Colorectal Neoplasms/pathology; Disease Models, Animal; Drug Resistance, Neoplasm; Genetic Heterogeneity; Heredity; Humans; Mice; Mice, Inbred Strains/genetics; Models, Genetic
16.
G3 (Bethesda) ; 9(5): 1613-1622, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30877080

ABSTRACT

Reproductive success in the eight founder strains of the Collaborative Cross (CC) was measured using a diallel-mating scheme. Over a 48-month period we generated 4,448 litters, and provided 24,782 weaned pups for use in 16 different published experiments. We identified factors that affect the average litter size in a cross by estimating the overall contribution of parent-of-origin, heterosis, inbred, and epistatic effects using a Bayesian zero-truncated overdispersed Poisson mixed model. The phenotypic variance of litter size has a substantial contribution (82%) from unexplained and environmental sources, but no detectable effect of seasonality. Most of the explained variance was due to additive effects (9.2%) and parental sex (maternal vs. paternal strain; 5.8%), with epistasis accounting for 3.4%. Within the parental effects, the effect of the dam's strain explained more than the sire's strain (13.2% vs. 1.8%), and the dam's strain effects account for 74.2% of total variation explained. Dams from strains C57BL/6J and NOD/ShiLtJ increased the expected litter size by a mean of 1.66 and 1.79 pups, whereas dams from strains WSB/EiJ, PWK/PhJ, and CAST/EiJ reduced expected litter size by a mean of 1.51, 0.81, and 0.90 pups. Finally, there was no strong evidence for strain-specific effects on sex ratio distortion. Overall, these results demonstrate that strains vary substantially in their reproductive ability depending on their genetic background, and that litter size is largely determined by dam's strain rather than sire's strain effects, as expected. This analysis adds to our understanding of factors that influence litter size in mammals, and also helps to explain breeding successes and failures in the extinct lines and surviving CC strains.


Subjects
Alleles , Genetically Modified Animals , Collaborative Cross Mice/genetics , Litter Size/genetics , Maternal Inheritance , Algorithms , Animals , Genetic Crosses , Environment , Gene-Environment Interaction , Genetic Testing , Mice , Inbred Mice , Genetic Models , Phenotype , Sex Ratio , Species Specificity
17.
G3 (Bethesda) ; 9(5): 1303-1311, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30858237

ABSTRACT

Two key features of recombinant inbred panels are well-characterized genomes and reproducibility. Here we report on the sequenced genomes of six additional Collaborative Cross (CC) strains and on the inbreeding progress of 72 CC strains. We have previously reported on the sequences of 69 CC strains that were publicly available, bringing the total of CC strains with whole genome sequence up to 75. The sequencing of these six CC strains updates the efforts toward inbreeding undertaken by the UNC Systems Genetics Core. The timing reflects our competing mandates to release to the public as many CC strains as possible while achieving an acceptable level of inbreeding. The six new strains have a higher average founder contribution from non-domesticus strains than the previously released CC strains. Five of the six strains also have high residual heterozygosity (>14%), which may be related to non-domesticus founder contributions. Finally, we report updated estimates of residual heterozygosity across the entire CC population using a novel, simple and cost-effective genotyping platform on three mice from each strain. We observe a reduction in residual heterozygosity across all previously released CC strains. We discuss the optimal use of different genetic resources available for the CC population.


Subjects
Collaborative Cross Mice/genetics , Population Genetics , Inbreeding , Whole Genome Sequencing , Alleles , Animals , Genetically Modified Animals , Chromosome Mapping , Genetic Crosses , Gene Frequency , Genome , Genotype , Mice , Inbred Mice
18.
Bioinformatics ; 23(13): i401-7, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646323

ABSTRACT

MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but this technique is neither practical nor justifiable for large datasets. RESULTS: We describe a data structure that supports efficient KNN queries over arbitrarily sized, sliding haplotype windows, and evaluate its use for genotype imputation. The performance of our method enables exhaustive exploration over all window sizes and known sites in large (150K, 8.3M) SNP panels. We also compare the accuracy and performance of our methods with competing imputation approaches. AVAILABILITY: A free open source software package, NPUTE, is available at http://compgen.unc.edu/software, for non-commercial uses.
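The K-nearest-neighbor vote mentioned above can be sketched as follows. This is a naive brute-force illustration under stated assumptions, not the NPUTE data structure: haplotypes are equal-length strings, `?` marks a missing call, and the names `hamming`, `impute_knn`, and the `window` half-width parameter are hypothetical.

```python
from collections import Counter

MISSING = "?"  # assumed marker for a missing call

def hamming(a, b, skip):
    """Count mismatches between two equal-length windows,
    ignoring index `skip` and any position with a missing call."""
    return sum(
        1
        for i, (x, y) in enumerate(zip(a, b))
        if i != skip and x != MISSING and y != MISSING and x != y
    )

def impute_knn(haplotypes, sample, site, window, k):
    """Impute haplotypes[sample][site] by majority vote of the k haplotypes
    most similar to it within `window` sites on either side."""
    lo = max(0, site - window)
    hi = min(len(haplotypes[sample]), site + window + 1)
    target = haplotypes[sample][lo:hi]
    rel = site - lo  # position of the imputed site inside the window
    candidates = []
    for idx, hap in enumerate(haplotypes):
        if idx == sample or hap[site] == MISSING:
            continue  # a neighbor must have a known call at the site
        candidates.append((hamming(target, hap[lo:hi], rel), idx))
    candidates.sort()
    votes = Counter(haplotypes[idx][site] for _, idx in candidates[:k])
    return votes.most_common(1)[0][0] if votes else MISSING

if __name__ == "__main__":
    haps = ["ACGTA", "ACGTA", "TTCAT", "AC?TA"]
    print(impute_knn(haps, sample=3, site=2, window=2, k=2))
```

Each query here rescans every haplotype, so the cost grows with both panel size and window width; that cost is what makes the brute-force vote impractical on large panels and motivates a dedicated data structure.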


Subjects
Algorithms , Artifacts , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Single Nucleotide Polymorphism/genetics , Sequence Alignment/methods , DNA Sequence Analysis/methods , Genetic Variation/genetics , Automated Pattern Recognition/methods , Sensitivity and Specificity
19.
IEEE Trans Image Process ; 16(5): 1185-94, 2007 May.
Article in English | MEDLINE | ID: mdl-17491451

ABSTRACT

We present a technique for enhancing underexposed visible-spectrum video by fusing it with simultaneously captured video from sensors in nonvisible spectra, such as Short Wave IR or Near IR. Although IR sensors can accurately capture video in low-light and night-vision applications, they lack the color and relative luminances of visible-spectrum sensors. RGB sensors do capture color and correct relative luminances, but are underexposed, noisy, and lack fine features due to short video exposure times. Our enhanced fusion output is a reconstruction of the RGB input assisted by the IR data, not an incorporation of elements imaged only in IR. With a temporal noise reduction, we first remove shot noise and increase the color accuracy of the RGB footage. The IR video is then normalized to ensure cross-spectral compatibility with the visible-spectrum video using ratio images. To aid fusion, we decompose the video sources with edge-preserving filters. We introduce a multispectral version of the bilateral filter called the "dual bilateral" that robustly decomposes the RGB video. It utilizes the less-noisy IR for edge detection but also preserves strong visible-spectrum edges not in the IR. We fuse the RGB low frequencies, the IR texture details, and the dual bilateral edges into a noise-reduced video with sharp details, correct chrominances, and natural relative luminances.
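The idea of using the less-noisy IR signal to find edges while smoothing the RGB signal can be illustrated with a joint (cross) bilateral filter. This is a related, simpler technique rather than the paper's "dual bilateral" filter, shown on 1-D signals for brevity; the function name and parameters are illustrative assumptions.

```python
import math

def joint_bilateral_1d(signal, guide, radius, sigma_s, sigma_r):
    """Smooth `signal` with spatial Gaussian weights, but take the
    edge-stopping (range) weights from `guide` instead of `signal`."""
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial closeness between positions i and j.
            w_s = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            # Similarity measured in the guide (e.g. IR), not the noisy input.
            w_r = math.exp(-((guide[i] - guide[j]) ** 2) / (2.0 * sigma_r ** 2))
            w = w_s * w_r
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because j == i contributes weight 1
    return out

if __name__ == "__main__":
    noisy_visible = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # hypothetical luminance row
    clean_ir = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]    # hypothetical IR row, sharp edge
    print(joint_bilateral_1d(noisy_visible, clean_ir, radius=2, sigma_s=2.0, sigma_r=0.5))
```

A plain joint filter like this one would blur any visible-spectrum edge that is absent from the guide, which is the limitation the abstract says the dual bilateral is designed to avoid.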


Subjects
Algorithms , Color , Colorimetry/methods , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Infrared Spectrophotometry/methods , Video Recording/methods , Reproducibility of Results , Sensitivity and Specificity , Computer-Assisted Signal Processing
20.
Genetics ; 206(2): 537-556, 2017 06.
Article in English | MEDLINE | ID: mdl-28592495

ABSTRACT

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.


Subjects
Genetic Drift , Genome/genetics , Inbred Mice/genetics , Quantitative Trait Loci/genetics , Animals , Chromosome Mapping , Genetic Crosses , Genotype , Haplotypes , Male , Mice , Mutation , Single Nucleotide Polymorphism