Results 1 - 20 of 38
1.
BMC Bioinformatics ; 19(1): 50, 2018 02 09.
Article in English | MEDLINE | ID: mdl-29426289

ABSTRACT

BACKGROUND: Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging "hybrid" assemblies that use long reads for scaffolding and short reads for accuracy. RESULTS: We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. CONCLUSION: Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Higher throughput and computational efficiency than existing methods will help make more economical use of emerging long read sequencing technologies.


Subject(s)
Algorithms , Genetic Databases , Fungal Genome , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis
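
Note on record 1: the abstract describes querying a multi-string Burrows-Wheeler Transform through an FM-index built from short reads. The following is a minimal Python sketch of that general idea only, not the authors' implementation. The function names (`build_bwt`, `count_kmer`, `weak_positions`), the naive rotation sort, and the "flag poorly supported k-mers" step are all assumptions made for illustration.

```python
from collections import Counter


def build_bwt(reads):
    """Naive multi-string BWT: sort all rotations of the '$'-terminated concatenation."""
    text = "".join(read + "$" for read in reads)
    order = sorted(range(len(text)), key=lambda i: text[i:] + text[:i])
    # The BWT is the character that precedes each sorted rotation.
    return "".join(text[i - 1] for i in order)


def count_kmer(bwt, kmer):
    """Count occurrences of kmer across all reads via FM-index backward search."""
    counts = Counter(bwt)
    first = {}  # first[c] = number of symbols in the BWT that sort before c
    total = 0
    for symbol in sorted(counts):
        first[symbol] = total
        total += counts[symbol]
    lo, hi = 0, len(bwt)
    for symbol in reversed(kmer):
        if symbol not in first:
            return 0
        # str.count stands in for a precomputed rank (occurrence) table.
        lo = first[symbol] + bwt.count(symbol, 0, lo)
        hi = first[symbol] + bwt.count(symbol, 0, hi)
        if lo >= hi:
            return 0
    return hi - lo


def weak_positions(long_read, bwt, k, min_count=1):
    """Hypothetical helper: start positions of k-mers seen fewer than min_count times."""
    return [i for i in range(len(long_read) - k + 1)
            if count_kmer(bwt, long_read[i:i + k]) < min_count]
```

In this sketch, k-mers of a long read with little or no short-read support mark candidate error regions. A production FM-index would replace the rotation sort and `str.count` with a suffix-array construction and sampled rank tables.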
2.
PLoS Genet ; 11(10): e1005504, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26452100

ABSTRACT

New systems genetics approaches are needed to rapidly identify host genes and genetic networks that regulate complex disease outcomes. Using genetically diverse animals from incipient lines of the Collaborative Cross mouse panel, we demonstrate a greatly expanded range of phenotypes relative to classical mouse models of SARS-CoV infection including lung pathology, weight loss and viral titer. Genetic mapping revealed several loci contributing to differential disease responses, including an 8.5Mb locus associated with vascular cuffing on chromosome 3 that contained 23 genes and 13 noncoding RNAs. Integrating phenotypic and genetic data narrowed this region to a single gene, Trim55, an E3 ubiquitin ligase with a role in muscle fiber maintenance. Lung pathology and transcriptomic data from mice genetically deficient in Trim55 were used to validate its role in SARS-CoV-induced vascular cuffing and inflammation. These data establish the Collaborative Cross platform as a powerful genetic resource for uncovering genetic contributions of complex traits in microbial disease severity, inflammation and virus replication in models of outbred populations.


Subject(s)
Host-Pathogen Interactions , Inflammation/genetics , Severe Acute Respiratory Syndrome/genetics , Severe acute respiratory syndrome-related coronavirus/genetics , Animals , Animal Disease Models , Disease Susceptibility , Humans , Inflammation/pathology , Inflammation/virology , Mice , Phenotype , Severe acute respiratory syndrome-related coronavirus/pathogenicity , Severe Acute Respiratory Syndrome/pathology , Severe Acute Respiratory Syndrome/virology , Virus Replication/genetics
3.
PLoS Genet ; 11(2): e1004850, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25679959

ABSTRACT

Significant departures from expected Mendelian inheritance ratios (transmission ratio distortion, TRD) are frequently observed in both experimental crosses and natural populations. TRD on mouse Chromosome (Chr) 2 has been reported in multiple experimental crosses, including the Collaborative Cross (CC). Among the eight CC founder inbred strains, we found that Chr 2 TRD was exclusive to females that were heterozygous for the WSB/EiJ allele within a 9.3 Mb region (Chr 2 76.9 - 86.2 Mb). A copy number gain of a 127 kb-long DNA segment (designated as responder to drive, R2d) emerged as the strongest candidate for the causative allele. We mapped R2d sequences to two loci within the candidate interval. R2d1 is located near the proximal boundary, and contains a single copy of R2d in all strains tested. R2d2 maps to a 900 kb interval, and the number of R2d copies varies from zero in classical strains (including the mouse reference genome) to more than 30 in wild-derived strains. Using real-time PCR assays for the copy number, we identified a mutation (R2d2WSBdel1) that eliminates the majority of the R2d2WSB copies without apparent alterations of the surrounding WSB/EiJ haplotype. In a three-generation pedigree segregating for R2d2WSBdel1, the mutation is transmitted to the progeny and Mendelian segregation is restored in females heterozygous for R2d2WSBdel1, thus providing direct evidence that the copy number gain is causal for maternal TRD. We found that transmission ratios in R2d2WSB heterozygous females vary between Mendelian segregation and complete distortion depending on the genetic background, and that TRD is under genetic control of unlinked distorter loci. Although the R2d2WSB transmission ratio was inversely correlated with average litter size, several independent lines of evidence support the contention that female meiotic drive is the cause of the distortion. We discuss the implications and potential applications of this novel meiotic drive system.


Subject(s)
DNA Copy Number Variations/genetics , Genomics , Inheritance Patterns/genetics , Meiosis/genetics , Alleles , Animals , Chromosomes/genetics , Genetic Crosses , Female , Genotyping Techniques , Haplotypes/genetics , Male , Mice , Mutation
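
Note on record 3: transmission ratio distortion is a departure from the 1:1 Mendelian expectation among the offspring of a heterozygous parent. As a rough illustration, and not the statistical procedure used in the paper, the Python sketch below computes a transmission ratio and a two-sided exact binomial p-value against 0.5. The function names and any offspring counts passed to them are hypothetical.

```python
from math import comb


def transmission_ratio(carriers, offspring):
    """Fraction of offspring that inherited the allele of interest."""
    return carriers / offspring


def binomial_two_sided_p(carriers, offspring, p=0.5):
    """Exact two-sided p-value: total probability of outcomes no likelier than the observed one."""
    probs = [comb(offspring, i) * p**i * (1 - p)**(offspring - i)
             for i in range(offspring + 1)]
    observed = probs[carriers]
    # The small tolerance keeps floating-point ties on the same side as the observation.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))
```

A ratio near 0.5 with a large p-value is consistent with Mendelian segregation, while a ratio near 1.0 with a small p-value indicates strong distortion. Separating meiotic drive from differential embryo survival requires further evidence, as the abstract notes.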
4.
Mol Biol Evol ; 33(6): 1381-95, 2016 06.
Article in English | MEDLINE | ID: mdl-26882987

ABSTRACT

A selective sweep is the result of strong positive selection driving newly occurring or standing genetic variants to fixation, and can dramatically alter the pattern and distribution of allelic diversity in a population. Population-level sequencing data have enabled discoveries of selective sweeps associated with genes involved in recent adaptations in many species. In contrast, much debate but little evidence addresses whether "selfish" genes are capable of fixation, thereby leaving signatures identical to classical selective sweeps, despite being neutral or deleterious to organismal fitness. We previously described R2d2, a large copy-number variant that causes nonrandom segregation of mouse Chromosome 2 in females due to meiotic drive. Here we show population-genetic data consistent with a selfish sweep driven by alleles of R2d2 with high copy number (R2d2(HC)) in natural populations. We replicate this finding in multiple closed breeding populations from six outbred backgrounds segregating for R2d2 alleles. We find that R2d2(HC) rapidly increases in frequency, and in most cases becomes fixed in significantly fewer generations than can be explained by genetic drift. R2d2(HC) is also associated with significantly reduced litter sizes in heterozygous mothers, making it a true selfish allele. Our data provide direct evidence of populations actively undergoing selfish sweeps, and demonstrate that meiotic drive can rapidly alter the genomic landscape in favor of mutations with neutral or even negative effects on overall Darwinian fitness. Further study will reveal the incidence of selfish sweeps, and will elucidate the relative contributions of selfish genes, adaptation and genetic drift to evolution.


Subject(s)
Nuclear Proteins/genetics , RNA-Binding Proteins/genetics , Nucleic Acid Repetitive Sequences , Physiological Adaptation/genetics , Alleles , Animals , Biological Evolution , DNA Copy Number Variations/genetics , Molecular Evolution , Female , Genetic Variation , Population Genetics , Male , Mice , Genetic Models , Mutation , Genetic Selection
5.
Biol Reprod ; 97(5): 698-708, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036474

ABSTRACT

The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human sperm motility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility.


Subject(s)
Computer-Assisted Image Processing/methods , Sperm Motility/physiology , Spermatozoa/physiology , Support Vector Machine , Humans , Male , Semen Analysis , Spermatozoa/classification
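
Note on record 5: the abstract describes four hyperplanes applied sequentially to sort each motile sperm into one of five classes. The Python sketch below shows how such a cascade of linear decision rules can be arranged. The two features (a velocity and a linearity value), every weight and offset, and the order in which the hyperplanes are tested are hypothetical placeholders. They are not the CASAnova equations, which are not given in the abstract.

```python
# Each hyperplane is (weights, offset); a track x lies on the positive side
# when dot(weights, x) + offset > 0. All numbers below are made up.
HYPERPLANES = {
    "vigorous": ((1.0, 0.0), -100.0),      # placeholder: velocity above 100
    "progressive": ((0.0, 1.0), -0.8),     # placeholder: linearity above 0.8
    "hyperactivated": ((0.0, -1.0), 0.4),  # placeholder: linearity below 0.4
    "slow": ((1.0, 0.0), -30.0),           # placeholder: velocity above 30
}


def positive_side(name, x):
    """True when feature vector x falls on the positive side of the named hyperplane."""
    weights, offset = HYPERPLANES[name]
    return sum(w * xi for w, xi in zip(weights, x)) + offset > 0


def classify(x):
    """Apply the four hyperplanes in sequence and return exactly one of five classes."""
    if positive_side("vigorous", x):
        if positive_side("progressive", x):
            return "progressive"
        if positive_side("hyperactivated", x):
            return "hyperactivated"
        return "intermediate"
    return "slow" if positive_side("slow", x) else "weakly motile"
```

Four binary tests are enough to separate five classes because each test peels off one class or splits the remaining group in two. In a trained model, the weights would come from the support vector machine fit to the visually classified tracks.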
6.
PLoS Genet ; 9(10): e1003853, 2013.
Article in English | MEDLINE | ID: mdl-24098153

ABSTRACT

X chromosome inactivation (XCI) is the mammalian mechanism of dosage compensation that balances X-linked gene expression between the sexes. Early during female development, each cell of the embryo proper independently inactivates one of its two parental X-chromosomes. In mice, the choice of which X chromosome is inactivated is affected by the genotype of a cis-acting locus, the X-chromosome controlling element (Xce). Xce has been localized to a 1.9 Mb interval within the X-inactivation center (Xic), yet its molecular identity and mechanism of action remain unknown. We combined genotype and sequence data for mouse stocks with detailed phenotyping of ten inbred strains and with the development of a statistical model that incorporates phenotyping data from multiple sources to disentangle sources of XCI phenotypic variance in natural female populations on X inactivation. We have reduced the Xce candidate 10-fold to a 176 kb region located approximately 500 kb proximal to Xist. We propose that structural variation in this interval explains the presence of multiple functional Xce alleles in the genus Mus. We have identified a new allele, Xce(e), present in Mus musculus and a possible sixth functional allele in Mus spicilegus. We have also confirmed a parent-of-origin effect on X inactivation choice and provide evidence that maternal inheritance magnifies the skewing associated with strong Xce alleles. Based on the phylogenetic analysis of 155 laboratory strains and wild mice we conclude that Xce(a) is either a derived allele that arose concurrently with the domestication of fancy mice but prior to the derivation of most classical inbred strains or a rare allele in the wild. Furthermore, we have found that despite the presence of multiple haplotypes in the wild, Mus musculus domesticus has only one functional Xce allele, Xce(b). Lastly, we conclude that each mouse taxon examined has a different functional Xce allele.


Subject(s)
Dosage Compensation (Genetic) , X-Linked Genes , Long Noncoding RNA/genetics , X Chromosome Inactivation/genetics , Alleles , Animals , Chromosome Mapping , Female , Genetic Loci , Haplotypes , Mice , Phylogeny
7.
Bioinformatics ; 30(24): 3524-31, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25172922

ABSTRACT

MOTIVATION: The throughput of genomic sequencing has increased to the point that it is overrunning the rate of downstream analysis. This, along with the desire to revisit old data, has led to a situation where large quantities of raw, and nearly impenetrable, sequence data are rapidly filling the hard drives of modern biology labs. These datasets can be compressed via a multi-string variant of the Burrows-Wheeler Transform (BWT), which provides the side benefit of searches for arbitrary k-mers within the raw data as well as the ability to reconstitute arbitrary reads as needed. We propose a method for merging such datasets for both increased compression and downstream analysis. RESULTS: We present a novel algorithm that merges multi-string BWTs in [Formula: see text] time where LCS is the length of the longest common substring between any of the inputs, and N is the total length of all inputs combined (number of symbols) using [Formula: see text] bits where F is the number of multi-string BWTs merged. This merged multi-string BWT is also shown to have a higher compressibility compared with the input multi-string BWTs separately. Additionally, we explore some uses of a merged multi-string BWT for bioinformatics applications.


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , Animals , Data Compression , Genomics/methods , Mice , Sequence Alignment
8.
PLoS Pathog ; 9(2): e1003196, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23468633

ABSTRACT

Genetic variation contributes to host responses and outcomes following infection by influenza A virus or other viral infections. Yet narrow windows of disease symptoms and confounding environmental factors have made it difficult to identify polymorphic genes that contribute to differential disease outcomes in human populations. Therefore, to control for these confounding environmental variables in a system that models the levels of genetic diversity found in outbred populations such as humans, we used incipient lines of the highly genetically diverse Collaborative Cross (CC) recombinant inbred (RI) panel (the pre-CC population) to study how genetic variation impacts influenza associated disease across a genetically diverse population. A wide range of variation in influenza disease related phenotypes including virus replication, virus-induced inflammation, and weight loss was observed. Many of the disease associated phenotypes were correlated, with viral replication and virus-induced inflammation being predictors of virus-induced weight loss. Despite these correlations, pre-CC mice with unique and novel disease phenotype combinations were observed. We also identified sets of transcripts (modules) that were correlated with aspects of disease. In order to identify how host genetic polymorphisms contribute to the observed variation in disease, we conducted quantitative trait loci (QTL) mapping. We identified several QTL contributing to specific aspects of the host response including virus-induced weight loss, titer, pulmonary edema, neutrophil recruitment to the airways, and transcriptional expression. Existing whole-genome sequence data was applied to identify high priority candidate genes within QTL regions. A key host response QTL was located at the site of the known anti-influenza Mx1 gene. 
We sequenced the coding regions of Mx1 in the eight CC founder strains, and identified a novel Mx1 allele that showed reduced ability to inhibit viral replication, while maintaining protection from weight loss.


Subject(s)
Genetic Variation , Host-Pathogen Interactions/genetics , Human Influenza/virology , Genetic Models , Orthomyxoviridae Infections/virology , Rodent Diseases/virology , Animals , Genetic Crosses , Female , Humans , Influenza A virus , Human Influenza/genetics , Human Influenza/pathology , Lung/pathology , Mice , Inbred Mouse Strains , Orthomyxoviridae Infections/genetics , Orthomyxoviridae Infections/pathology , Phenotype , Reassortant Viruses/genetics , Reassortant Viruses/pathogenicity , Genetic Recombination , Rodent Diseases/genetics , Rodent Diseases/pathology , Species Specificity , Virus Replication
9.
Genome Res ; 21(8): 1213-22, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21406540

ABSTRACT

The Collaborative Cross (CC) is a mouse recombinant inbred strain panel that is being developed as a resource for mammalian systems genetics. Here we describe an experiment that uses partially inbred CC lines to evaluate the genetic properties and utility of this emerging resource. Genome-wide analysis of the incipient strains reveals high genetic diversity, balanced allele frequencies, and dense, evenly distributed recombination sites, all ideal qualities for a systems genetics resource. We map discrete, complex, and biomolecular traits and contrast two quantitative trait locus (QTL) mapping approaches. Analysis based on inferred haplotypes improves power, reduces false discovery, and provides information to identify and prioritize candidate genes that is unique to multifounder crosses like the CC. The number of expression QTLs discovered here exceeds all previous efforts at eQTL mapping in mice, and we map local eQTL at 1-Mb resolution. We demonstrate that the genetic diversity of the CC, which derives from random mixing of eight founder strains, results in high phenotypic diversity and enhances our ability to map causative loci underlying complex disease-related traits.


Subject(s)
Genome , Quantitative Trait Loci , Animals , Genetic Crosses , Female , Gene Expression , Genetic Association Studies , Haplotypes , Male , Mice , Phenotype
10.
Bioinformatics ; 29(13): i291-9, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812996

ABSTRACT

MOTIVATION: RNA-seq techniques provide an unparalleled means for exploring a transcriptome with deep coverage and base pair level resolution. Various analysis tools have been developed to align and assemble RNA-seq data, such as the widely used TopHat/Cufflinks pipeline. A common observation is that a sizable fraction of the fragments/reads align to multiple locations of the genome. These multiple alignments pose substantial challenges to existing RNA-seq analysis tools. Inappropriate treatment may result in reporting spurious expressed genes (false positives) and missing the real expressed genes (false negatives). Such errors impact the subsequent analysis, such as differential expression analysis. In our study, we observe that ~3.5% of transcripts reported by the TopHat/Cufflinks pipeline correspond to annotated nonfunctional pseudogenes. Moreover, ~10.0% of reported transcripts are not annotated in the Ensembl database. These genes could be either novel expressed genes or false discoveries. RESULTS: We examine the underlying genomic features that lead to multiple alignments and investigate how they generate systematic errors in RNA-seq analysis. We develop a general tool, GeneScissors, which exploits machine learning techniques guided by biological knowledge to detect and correct spurious transcriptome inference by existing RNA-seq analysis methods. In our simulated study, GeneScissors can predict spurious transcriptome calls owing to misalignment with an accuracy close to 90%. It provides substantial improvement over the widely used TopHat/Cufflinks or MapSplice/Cufflinks pipelines in both precision and F-measurement. On real data, GeneScissors reports 53.6% fewer pseudogenes and 0.97% more expressed and annotated transcripts, when compared with the TopHat/Cufflinks pipeline. In addition, among the 10.0% unannotated transcripts reported by TopHat/Cufflinks, GeneScissors finds that >16.3% of them are false positives.
AVAILABILITY: The software can be downloaded at http://csbio.unc.edu/genescissors/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling/methods , Sequence Alignment/methods , RNA Sequence Analysis/methods , Software , Animals , Artificial Intelligence , Genomics , Mice , Pseudogenes
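
Note on record 10: the abstract compares pipelines by precision and F-measurement. For readers unfamiliar with those metrics, the Python sketch below computes them from counts of true positives, false positives and false negatives. The function name and any counts supplied to it are hypothetical and are not taken from the paper.

```python
def precision_recall_f(true_pos, false_pos, false_neg, beta=1.0):
    """Return (precision, recall, F-beta) for a set of transcript calls."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    # F-beta is a weighted harmonic mean; beta = 1 weighs precision and recall equally.
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score
```

Removing spurious calls such as misaligned pseudogene transcripts lowers the false-positive count, which raises precision and, provided recall is preserved, the F-measure as well.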
11.
BMC Bioinformatics ; 13 Suppl 3: S13, 2012 Mar 21.
Article in English | MEDLINE | ID: mdl-22536897

ABSTRACT

BACKGROUND: Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. RESULTS: We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. CONCLUSIONS: Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains.


Subject(s)
Genome , Internet , Mice/genetics , Software , Animals , Cluster Analysis , Mice/classification , Inbred Mouse Strains , Phylogeny , Single Nucleotide Polymorphism
12.
BMC Genomics ; 13: 34, 2012 Jan 19.
Article in English | MEDLINE | ID: mdl-22260749

ABSTRACT

BACKGROUND: High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. RESULTS: We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. 
CONCLUSION: The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses.


Subject(s)
Genome-Wide Association Study , Nucleic Acid Hybridization , Oligonucleotide Probes/chemistry , Algorithms , Animals , Cattle , Cluster Analysis , Dogs , Genotype , Haplotypes , Humans , Mice , Single Nucleotide Polymorphism , Software
13.
Mamm Genome ; 23(9-10): 706-12, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22847377

ABSTRACT

The Collaborative Cross (CC) is a panel of recombinant inbred lines derived from eight genetically diverse laboratory inbred strains. Recently, the genetic architecture of the CC population was reported based on the genotype of a single male per line, and other publications reported incompletely inbred CC mice that have been used to map a variety of traits. The three breeding sites, in the US, Israel, and Australia, are actively collaborating to accelerate the inbreeding process through marker-assisted inbreeding and to expedite community access of CC lines deemed to have reached defined thresholds of inbreeding. Plans are now being developed to provide access to this novel genetic reference population through distribution centers. Here we provide a description of the distribution efforts by the University of North Carolina Systems Genetics Core, Tel Aviv University, Israel and the University of Western Australia.


Subject(s)
Cooperative Behavior , Inbred Mouse Strains/genetics , Animals , Genome , Internet , Male , Mice
14.
Bioinformatics ; 26(12): i199-207, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20529906

ABSTRACT

MOTIVATION: High-density SNP data of model animal resources provides opportunities for fine-resolution genetic variation studies. These genetic resources are generated through a variety of breeding schemes that involve multiple generations of matings derived from a set of founder animals. In this article, we investigate the problem of inferring the most probable ancestry of resulting genotypes, given a set of founder genotypes. Due to computational difficulty, existing methods either handle only small pedigree data or disregard the pedigree structure. However, large pedigrees of model animal resources often contain repetitive substructures that can be utilized in accelerating computation. RESULTS: We present an accurate and efficient method that can accept complex pedigrees with inbreeding in inferring genome ancestry. Inbreeding is a commonly used process in generating genetically diverse and reproducible animals. It is often carried out for many generations and can account for most of the computational complexity in real-world model animal pedigrees. Our method builds a hidden Markov model that derives the ancestry probabilities through inbreeding process without explicit modeling in every generation. The ancestry inference is accurate and fast, independent of the number of generations, for model animal resources such as the Collaborative Cross (CC). Experiments on both simulated and real CC data demonstrate that our method offers comparable accuracy to those methods that build an explicit model of the entire pedigree, but much better scalability with respect to the pedigree size.


Subject(s)
Genome , Genomics/methods , Inbreeding , Pedigree , Animals , Genetic Variation , Genotype , Single Nucleotide Polymorphism
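
Note on record 14: the abstract describes a hidden Markov model that infers the most probable founder ancestry along a genome. The Python sketch below is a generic haploid Viterbi decoder over founder states. It does not model the pedigree or the inbreeding process as the paper's method does. The switch and error rates, the function name, and the string encoding of alleles are assumptions chosen for illustration.

```python
import math


def viterbi_ancestry(observed, founders, switch=0.01, err=0.01):
    """Most probable founder label per marker (needs at least two founders).

    observed: string of alleles, one character per marker.
    founders: dict mapping founder name to an allele string of the same length.
    """
    names = list(founders)
    log_stay = math.log(1 - switch)
    log_switch = math.log(switch / (len(names) - 1))

    def emit(name, t):
        # An observed allele matches its founder unless a genotyping error occurred.
        return math.log(1 - err if founders[name][t] == observed[t] else err)

    def move(prev, cur):
        return log_stay if prev == cur else log_switch

    score = {f: math.log(1 / len(names)) + emit(f, 0) for f in names}
    back = []
    for t in range(1, len(observed)):
        new_score, pointers = {}, {}
        for f in names:
            best = max(names, key=lambda g: score[g] + move(g, f))
            new_score[f] = score[best] + move(best, f) + emit(f, t)
            pointers[f] = best
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(names, key=lambda f: score[f])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

The decoder's cost grows with the number of markers times the square of the number of founder states. The abstract's contribution is to fold many generations of inbreeding into the transition structure so that the cost does not also grow with pedigree depth.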
15.
Genetics ; 214(3): 691-702, 2020 03.
Article in English | MEDLINE | ID: mdl-31879319

ABSTRACT

The azoxymethane model of colorectal cancer (CRC) was used to gain insights into the genetic heterogeneity of nonfamilial CRC. We observed significant differences in susceptibility parameters across 40 mouse inbred strains, with 6 new and 18 of 24 previously identified mouse CRC modifier alleles detected using genome-wide association analysis. Tumor incidence varied in F1 as well as intercrosses and backcrosses between resistant and susceptible strains. Analysis of inheritance patterns indicates that resistance to CRC development is inherited as a dominant characteristic genome-wide, and that susceptibility appears to occur in individuals lacking a large-effect, or sufficient numbers of small-effect, polygenic resistance alleles. Our results suggest a new polygenic model for inheritance of nonfamilial CRC, and that genetic studies in humans aimed at identifying individuals with elevated susceptibility should be pursued through the lens of absence of dominant resistance alleles rather than for the presence of susceptibility alleles.


Subject(s)
Colorectal Neoplasms/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Alleles , Animals , Azoxymethane/toxicity , Colorectal Neoplasms/chemically induced , Colorectal Neoplasms/pathology , Animal Disease Models , Antineoplastic Drug Resistance , Genetic Heterogeneity , Heredity , Humans , Mice , Inbred Mouse Strains/genetics , Genetic Models
16.
G3 (Bethesda) ; 9(5): 1613-1622, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30877080

ABSTRACT

Reproductive success in the eight founder strains of the Collaborative Cross (CC) was measured using a diallel-mating scheme. Over a 48-month period we generated 4,448 litters, and provided 24,782 weaned pups for use in 16 different published experiments. We identified factors that affect the average litter size in a cross by estimating the overall contribution of parent-of-origin, heterosis, inbred, and epistatic effects using a Bayesian zero-truncated overdispersed Poisson mixed model. The phenotypic variance of litter size has a substantial contribution (82%) from unexplained and environmental sources, but no detectable effect of seasonality. Most of the explained variance was due to additive effects (9.2%) and parental sex (maternal vs. paternal strain; 5.8%), with epistasis accounting for 3.4%. Within the parental effects, the effect of the dam's strain explained more than the sire's strain (13.2% vs. 1.8%), and the dam's strain effects account for 74.2% of total variation explained. Dams from strains C57BL/6J and NOD/ShiLtJ increased the expected litter size by a mean of 1.66 and 1.79 pups, whereas dams from strains WSB/EiJ, PWK/PhJ, and CAST/EiJ reduced expected litter size by a mean of 1.51, 0.81, and 0.90 pups. Finally, there was no strong evidence for strain-specific effects on sex ratio distortion. Overall, these results demonstrate that strains vary substantially in their reproductive ability depending on their genetic background, and that litter size is largely determined by dam's strain rather than sire's strain effects, as expected. This analysis adds to our understanding of factors that influence litter size in mammals, and also helps to explain breeding successes and failures in the extinct lines and surviving CC strains.


Subject(s)
Alleles , Animals, Genetically Modified , Collaborative Cross Mice/genetics , Litter Size/genetics , Maternal Inheritance , Algorithms , Animals , Crosses, Genetic , Environment , Gene-Environment Interaction , Genetic Testing , Mice , Mice, Inbred Strains , Models, Genetic , Phenotype , Sex Ratio , Species Specificity
17.
G3 (Bethesda) ; 9(5): 1303-1311, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30858237

ABSTRACT

Two key features of recombinant inbred panels are well-characterized genomes and reproducibility. Here we report on the sequenced genomes of six additional Collaborative Cross (CC) strains and on inbreeding progress of 72 CC strains. We have previously reported on the sequences of 69 CC strains that were publicly available, bringing the total of CC strains with whole genome sequence up to 75. The sequencing of these six CC strains updates the efforts toward inbreeding undertaken by the UNC Systems Genetics Core. The timing reflects our competing mandates to release to the public as many CC strains as possible while achieving an acceptable level of inbreeding. The six new strains have a higher-than-average founder contribution from non-domesticus strains compared with the previously released CC strains. Five of the six strains also have high residual heterozygosity (>14%), which may be related to non-domesticus founder contributions. Finally, we report updated estimates of residual heterozygosity across the entire CC population using a novel, simple and cost-effective genotyping platform on three mice from each strain. We observe a reduction in residual heterozygosity across all previously released CC strains. We discuss the optimal use of different genetic resources available for the CC population.


Subject(s)
Collaborative Cross Mice/genetics , Genetics, Population , Inbreeding , Whole Genome Sequencing , Alleles , Animals , Animals, Genetically Modified , Chromosome Mapping , Crosses, Genetic , Gene Frequency , Genome , Genotype , Mice , Mice, Inbred Strains
18.
Bioinformatics ; 23(13): i401-7, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646323

ABSTRACT

MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but this technique is neither practical nor justifiable for large datasets. RESULTS: We describe a data structure that supports efficient KNN queries over arbitrarily sized, sliding haplotype windows, and evaluate its use for genotype imputation. The performance of our method enables exhaustive exploration over all window sizes and known sites in large (150K, 8.3M) SNP panels. We also compare the accuracy and performance of our methods with competing imputation approaches. AVAILABILITY: A free open source software package, NPUTE, is available at http://compgen.unc.edu/software, for non-commercial uses.
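
The baseline technique mentioned in this abstract, imputing a missing call by a vote of the K nearest haplotypes within a window, can be sketched as follows. This is a brute-force illustration only: it is not NPUTE and does not implement the efficient data structure the authors describe. The encoding (alleles as 0/1, missing calls as -1), the mismatch-count distance, and all function and parameter names are assumptions made for the example.

```python
from collections import Counter

MISSING = -1  # assumed encoding for a missing genotype call


def window_distance(a, b, lo, hi, skip):
    """Count mismatches between haplotypes a and b over sites [lo, hi),
    ignoring the target site and any site where either call is missing."""
    return sum(
        1
        for p in range(lo, hi)
        if p != skip and a[p] != MISSING and b[p] != MISSING and a[p] != b[p]
    )


def knn_impute_call(haplotypes, sample, site, half_window=2, k=1):
    """Impute haplotypes[sample][site] by majority vote of the k haplotypes
    that are most similar to the sample within the surrounding window."""
    n_sites = len(haplotypes[sample])
    lo = max(0, site - half_window)
    hi = min(n_sites, site + half_window + 1)
    target = haplotypes[sample]

    candidates = []
    for idx, other in enumerate(haplotypes):
        # Only haplotypes with a known call at the site can vote.
        if idx == sample or other[site] == MISSING:
            continue
        candidates.append((window_distance(target, other, lo, hi, site), idx))
    if not candidates:
        return MISSING

    candidates.sort()
    votes = Counter(haplotypes[idx][site] for _, idx in candidates[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    haps = [
        [0, 0, 1, 1, 0],
        [0, 0, MISSING, 1, 0],  # call to impute at site 2
        [1, 1, 0, 0, 1],
        [0, 0, 1, 1, 0],
    ]
    print(knn_impute_call(haps, sample=1, site=2, half_window=2, k=1))
```

Each imputed call in this sketch scans every haplotype across the whole window, so repeating it for all missing calls and all window sizes quickly becomes expensive on large SNP panels, which is the practicality concern the abstract raises.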


Subject(s)
Algorithms , Artifacts , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Polymorphism, Single Nucleotide/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Genetic Variation/genetics , Pattern Recognition, Automated/methods , Sensitivity and Specificity
19.
IEEE Trans Image Process ; 16(5): 1185-94, 2007 May.
Article in English | MEDLINE | ID: mdl-17491451

ABSTRACT

We present a technique for enhancing underexposed visible-spectrum video by fusing it with simultaneously captured video from sensors in nonvisible spectra, such as Short Wave IR or Near IR. Although IR sensors can accurately capture video in low-light and night-vision applications, they lack the color and relative luminances of visible-spectrum sensors. RGB sensors do capture color and correct relative luminances, but are underexposed, noisy, and lack fine features due to short video exposure times. Our enhanced fusion output is a reconstruction of the RGB input assisted by the IR data, not an incorporation of elements imaged only in IR. With a temporal noise reduction, we first remove shot noise and increase the color accuracy of the RGB footage. The IR video is then normalized to ensure cross-spectral compatibility with the visible-spectrum video using ratio images. To aid fusion, we decompose the video sources with edge-preserving filters. We introduce a multispectral version of the bilateral filter called the "dual bilateral" that robustly decomposes the RGB video. It utilizes the less-noisy IR for edge detection but also preserves strong visible-spectrum edges not in the IR. We fuse the RGB low frequencies, the IR texture details, and the dual bilateral edges into a noise-reduced video with sharp details, correct chrominances, and natural relative luminances.


Subject(s)
Algorithms , Color , Colorimetry/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Spectrophotometry, Infrared/methods , Video Recording/methods , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
20.
Genetics ; 206(2): 537-556, 2017 06.
Article in English | MEDLINE | ID: mdl-28592495

ABSTRACT

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.


Subject(s)
Genetic Drift , Genome/genetics , Mice, Inbred Strains/genetics , Quantitative Trait Loci/genetics , Animals , Chromosome Mapping , Crosses, Genetic , Genotype , Haplotypes , Male , Mice , Mutation , Polymorphism, Single Nucleotide