1.
Genome Res ; 22(4): 778-90, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22300768

ABSTRACT

Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one Holstein, and one Hereford) and one indicine (Nelore) cattle. Within mapped chromosomal sequence, we identified 1265 CNV regions comprising ~55.6 Mbp of sequence, 476 of which (~38%) have not previously been reported. We validated this sequence-based CNV call set with array comparative genomic hybridization (aCGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH), achieving a validation rate of 82% and a false positive rate of 8%. We further estimated absolute copy numbers for genomic segments and annotated genes in each individual. Surveys of the top 25 most variable genes revealed that the Nelore individual had the lowest copy numbers in 13 cases (~52%, χ² test; P-value <0.05). In contrast, genes related to pathogen- and parasite-resistance, such as CATHL4 and ULBP17, were highly duplicated in the Nelore individual relative to the taurine cattle, while genes involved in lipid transport and metabolism, including APOL3 and FABP2, were highly duplicated in the beef breeds. These CNV regions also harbor genes like BPIFA2A (BSP30A) and WC1, suggesting that some CNVs may be associated with breed-specific differences in adaptation, health, and production traits. By providing the first individualized cattle CNV and segmental duplication maps and genome-wide gene copy number estimates, we enable future CNV studies into highly duplicated regions in the cattle genome.
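
The read depth idea in this abstract can be illustrated with a minimal sketch. It assumes that per-window read counts are already available and that a set of control windows is known to be diploid; the function name, the linear depth model, and the toy numbers are illustrative assumptions, not the authors' pipeline.

```python
from statistics import mean


def estimate_copy_number(window_depths, control_depths, ploidy=2):
    """Estimate absolute copy number per window from read depth.

    Assumption: control_depths come from windows believed to be present
    at the normal diploid copy number, so their mean depth corresponds
    to `ploidy` copies.
    """
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("control depth must be positive")
    # Under this simple model, depth scales linearly with copy number.
    return [ploidy * depth / baseline for depth in window_depths]


# Toy data, invented for illustration only.
control = [98, 102, 100, 100]
windows = [100, 150, 50, 400]
copy_numbers = estimate_copy_number(windows, control)
```

Real read depth callers also correct for GC content and mappability, which this sketch leaves out.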


Subject(s)
Cattle/genetics; DNA Copy Number Variations; Genome/genetics; Sequence Analysis, DNA/methods; Animals; Cattle/classification; Chromosome Mapping; Chromosomes, Mammalian/genetics; Comparative Genomic Hybridization; Fatty Acid-Binding Proteins/genetics; Fatty Acid-Binding Proteins/metabolism; Female; Gene Dosage; Gene Duplication; Genomics/methods; In Situ Hybridization, Fluorescence; Male; Polymerase Chain Reaction; Species Specificity
2.
Genome Res ; 20(5): 693-703, 2010 May.
Article in English | MEDLINE | ID: mdl-20212021

ABSTRACT

Genomic structural variation is an important and abundant source of genetic and phenotypic variation. Here, we describe the first systematic and genome-wide analysis of copy number variations (CNVs) in modern domesticated cattle using array comparative genomic hybridization (array CGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH). The array CGH panel included 90 animals from 11 Bos taurus, three Bos indicus, and three composite breeds for beef, dairy, or dual purpose. We identified over 200 candidate CNV regions (CNVRs) in total and 177 within known chromosomes, which harbor or are adjacent to gains or losses. These 177 high-confidence CNVRs cover 28.1 megabases or approximately 1.07% of the genome. Over 50% of the CNVRs (89/177) were found in multiple animals or breeds and analysis revealed breed-specific frequency differences and reflected aspects of the known ancestry of these cattle breeds. Selected CNVs were further validated by independent methods using qPCR and FISH. Approximately 67% of the CNVRs (119/177) completely or partially span cattle genes and 61% of the CNVRs (108/177) directly overlap with segmental duplications. The CNVRs span about 400 annotated cattle genes that are significantly enriched for specific biological functions, such as immunity, lactation, reproduction, and rumination. Multiple gene families, including ULBP, have gone through ruminant lineage-specific gene amplification. We detected and confirmed marked differences in their CNV frequencies across diverse breeds, indicating that some cattle CNVs are likely to arise independently in breeds and contribute to breed differences. Our results provide a valuable resource beyond microsatellites and single nucleotide polymorphisms to explore the full dimension of genetic variability for future cattle genomic research.


Subject(s)
Cattle/classification; Cattle/genetics; DNA Copy Number Variations; Gene Dosage; Animals; Breeding; Comparative Genomic Hybridization; Genetics, Population; Genome; Genomic Structural Variation; Genomics; In Situ Hybridization, Fluorescence; Oligonucleotide Array Sequence Analysis; Polymerase Chain Reaction/methods; Segmental Duplications, Genomic; Species Specificity
3.
Funct Integr Genomics ; 12(1): 81-92, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21928070

ABSTRACT

Genomic structural variation is an important and abundant source of genetic and phenotypic variation. We previously reported an initial analysis of copy number variations (CNVs) in Angus cattle selected for resistance or susceptibility to gastrointestinal nematodes. In this study, we performed a large-scale analysis of CNVs using SNP genotyping data from 472 animals of the same population. We detected 811 candidate CNV regions, which represent 141.8 Mb (~4.7%) of the genome. To investigate the functional impacts of CNVs, we created 2 groups of 100 individual animals with extremely low or high estimated breeding values of eggs per gram of feces and referred to these groups as parasite resistant (PR) or parasite susceptible (PS), respectively. We identified 297 (~51 Mb) and 282 (~48 Mb) CNV regions from PR and PS groups, respectively. Approximately 60% of the CNV regions were specific to the PS group or PR group of animals. Selected PR- or PS-specific CNVs were further experimentally validated by quantitative PCR. A total of 297 PR CNV regions overlapped with 437 Ensembl genes enriched in immunity and defense, such as the WC1 gene, which is uniquely expressed on gamma/delta T cells in cattle. Network analyses indicated that the PR-specific genes were predominantly involved in gastrointestinal disease, immunological disease, inflammatory response, cell-to-cell signaling and interaction, lymphoid tissue development, and cell death. By contrast, the 282 PS CNV regions contained 473 Ensembl genes which are overrepresented in environmental interactions. Network analyses indicated that the PS-specific genes were particularly enriched for inflammatory response, immune cell trafficking, metabolic disease, cell cycle, and cellular organization and movement.


Subject(s)
Cattle Diseases/genetics; DNA Copy Number Variations; Disease Resistance/genetics; Gastrointestinal Diseases/veterinary; Gastrointestinal Tract/parasitology; Nematode Infections/veterinary; Parasitic Diseases, Animal/genetics; Animals; Cattle; Feces/parasitology; Female; Gastrointestinal Diseases/genetics; Genetic Association Studies; Genetic Predisposition to Disease; Genome; Host-Parasite Interactions; Male; Nematoda/physiology; Nematode Infections/genetics
4.
BMC Genomics ; 12: 127, 2011 Feb 23.
Article in English | MEDLINE | ID: mdl-21345189

ABSTRACT

BACKGROUND: Copy number variation (CNV) represents another important source of genetic variation complementary to single nucleotide polymorphism (SNP). High-density SNP array data have been routinely used to detect human CNVs, many of which have significant functional effects on gene expression and human diseases. In the dairy industry, a large quantity of SNP genotyping results are becoming available and can be used for CNV discovery to understand and accelerate genetic improvement for complex traits. RESULTS: We performed a systematic analysis of CNV using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the pedigree information, we identified 682 candidate CNV regions, which represent 139.8 megabases (~4.60%) of the genome. Selected CNVs were further experimentally validated and we found that copy number "gain" CNVs were predominantly clustered in tandem rather than existing as interspersed duplications. Many CNV regions (~56%) overlap with cattle genes (1,263), which are significantly enriched for immunity, lactation, reproduction and rumination. The overlap of this new dataset and other published CNV studies was less than 40%; however, our discovery of large, high frequency (> 5% of animals surveyed) CNV regions showed 90% agreement with other studies. These results highlight the differences and commonalities between technical platforms. CONCLUSIONS: We present a comprehensive genomic analysis of cattle CNVs derived from SNP data which will be a valuable genomic variation resource. Combined with SNP detection assays, gene-containing CNV regions may help identify genes undergoing artificial selection in domesticated animals.


Subject(s)
Cattle/genetics; Gene Dosage; Polymorphism, Single Nucleotide; Animals; Breeding; Comparative Genomic Hybridization; Genetic Markers; Genome; Genomics/methods; Genotype; Pedigree; Sequence Analysis, DNA
5.
BMC Genomics ; 12: 408, 2011 Aug 11.
Article in English | MEDLINE | ID: mdl-21831322

ABSTRACT

BACKGROUND: Genome-wide association analysis is a powerful tool for annotating phenotypic effects on the genome and knowledge of genes and chromosomal regions associated with dairy phenotypes is useful for genome and gene-based selection. Here, we report results of a genome-wide analysis of predicted transmitting ability (PTA) of 31 production, health, reproduction and body conformation traits in contemporary Holstein cows. RESULTS: Genome-wide association analysis identified a number of candidate genes and chromosome regions associated with 31 dairy traits in contemporary U.S. Holstein cows. Highly significant genes and chromosome regions include: BTA13's GNAS region for milk, fat and protein yields; BTA7's INSR region and BTAX's LOC520057 and GRIA3 for daughter pregnancy rate, somatic cell score and productive life; BTA2's LRP1B for somatic cell score; BTA14's DGAT1-NIBP region for fat percentage; BTA1's FKBP2 for protein yields and percentage, BTA26's MGMT and BTA6's PDGFRA for protein percentage; BTA18's 53.9-58.7 Mb region for service-sire and daughter calving ease and service-sire stillbirth; BTA18's PGLYRP1-IGFL1 region for a large number of traits; BTA18's LOC787057 for service-sire stillbirth and daughter calving ease; BTA15's CD82, BTA23's DST and the MOCS1-LRFN2 region for daughter stillbirth; and BTAX's LOC520057 and GRIA3 for daughter pregnancy rate. For body conformation traits, BTA11, BTAX, BTA10, BTA5, and BTA26 had the largest concentrations of SNP effects, and PHKA2 of BTAX and REN of BTA16 had the most significant effects for body size traits. For body shape traits, BTAX, BTA19 and BTA3 were most significant. Udder traits were affected by BTA16, BTA22, BTAX, BTA2, BTA10, BTA11, BTA20, BTA22 and BTA25, teat traits were affected by BTA6, BTA7, BTA9, BTA16, BTA11, BTA26 and BTA17, and feet/legs traits were affected by BTA11, BTA13, BTA18, BTA20, and BTA26. 
CONCLUSIONS: Genome-wide association analysis identified a number of genes and chromosome regions associated with 31 production, health, reproduction and body conformation traits in contemporary Holstein cows. The results provide useful information for annotating phenotypic effects on the dairy genome and for building consensus of dairy QTL effects.


Subject(s)
Body Constitution; Cattle/genetics; Genetic Association Studies; Quantitative Trait, Heritable; Animals; Dairying; Female; Genotype; Milk; Phenotype; Polymorphism, Single Nucleotide; Pregnancy; Quantitative Trait Loci; Reproduction/genetics
6.
Nat Methods ; 5(3): 247-52, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18297082

ABSTRACT

High-density single-nucleotide polymorphism (SNP) arrays have revolutionized the ability of genome-wide association studies to detect genomic regions harboring sequence variants that affect complex traits. Extensive numbers of validated SNPs with known allele frequencies are essential to construct genotyping assays with broad utility. We describe an economical, efficient, single-step method for SNP discovery, validation and characterization that uses deep sequencing of reduced representation libraries (RRLs) from specified target populations. Using nearly 50 million sequences generated on an Illumina Genome Analyzer from DNA of 66 cattle representing three populations, we identified 62,042 putative SNPs and predicted their allele frequencies. Genotype data for these 66 individuals validated 92% of 23,357 selected genome-wide SNPs, with a genotypic and sequence allele frequency correlation of r = 0.67. This approach for simultaneous de novo discovery of high-quality SNPs and population characterization of allele frequencies may be applied to any species with at least a partially sequenced genome.


Subject(s)
Computational Biology/methods; Gene Frequency; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Animals; Cattle; Genomic Library; Genotype
7.
BMC Genomics ; 10: 77, 2009 Feb 10.
Article in English | MEDLINE | ID: mdl-19208255

ABSTRACT

BACKGROUND: MicroRNAs (miR) are a class of small RNAs that regulate gene expression by inhibiting translation of protein-encoding transcripts. To evaluate the role of miR in skeletal muscle of swine, global microRNA abundance was measured at specific developmental stages including proliferating satellite cells, three stages of fetal growth, day-old neonate, and the adult. RESULTS: Twelve potential novel miR were detected that did not match previously reported sequences. In addition, a number of miR previously reported to be expressed in mammalian muscle were detected, having a variety of abundance patterns through muscle development. Muscle-specific miR-206 was nearly absent in proliferating satellite cells in culture, but was the most abundant miR at the other time points evaluated. In addition, miR-1 was moderately abundant throughout developmental stages with highest abundance in the adult. In contrast, miR-133 was moderately abundant in adult muscle and either not detectable or present at low abundance throughout fetal and neonate development. Changes in abundance of ubiquitously expressed miR were also observed. MiR-432 abundance was highest at the earliest stage of fetal development tested (60 day-old fetus) and decreased throughout development to the adult. Conversely, miR-24 and miR-27 exhibited greatest abundance in proliferating satellite cells and the adult, while abundance of miR-368, miR-376, and miR-423-5p was greatest in the neonate. CONCLUSION: These data present a complete set of transcriptome profiles to evaluate miR abundance at specific stages of skeletal muscle growth in swine. Identification of these miR provides an initial group of miR that may play a vital role in muscle development and growth.


Subject(s)
Gene Expression Profiling; MicroRNAs/genetics; Muscle Development; Muscle, Skeletal/metabolism; Swine/genetics; Animals; Female; Gene Expression Regulation, Developmental; Gene Library; Male; Oligonucleotide Array Sequence Analysis; Swine/growth & development
8.
BMC Genet ; 10: 19, 2009 Apr 24.
Article in English | MEDLINE | ID: mdl-19393054

ABSTRACT

BACKGROUND: The Bovine HapMap Consortium has generated assay panels to genotype ~30,000 single nucleotide polymorphisms (SNPs) from 501 animals sampled from 19 worldwide taurine and indicine breeds, plus two outgroup species (Anoa and Water Buffalo). Within the larger set of SNPs we targeted 101 high density regions spanning up to 7.6 Mb with an average density of approximately one SNP per 4 kb, and characterized the linkage disequilibrium (LD) and haplotype block structure within individual breeds and groups of breeds in relation to their geographic origin and use. RESULTS: From the 101 targeted high-density regions on bovine chromosomes 6, 14, and 25, between 57 and 95% of the SNPs were informative in the individual breeds. The regions of high LD extend up to ~100 kb and the size of haplotype blocks ranges between 30 bases and 75 kb (10.3 kb average). On the scale from 1-100 kb the extent of LD and haplotype block structure in cattle has high similarity to humans. The estimation of effective population sizes over the previous 10,000 generations conforms to two main events in cattle history: the initiation of cattle domestication (~12,000 years ago), and the intensification of population isolation and current population bottleneck that breeds have experienced worldwide within the last ~700 years. Haplotype block density correlation, block boundary discordances, and haplotype sharing analyses were consistent in revealing unexpected similarities between some beef and dairy breeds, making them non-differentiable. Clustering techniques permitted grouping of breeds into different clades given their similarities and dissimilarities in genetic structure. CONCLUSION: This work presents the first high-resolution analysis of haplotype block structure in worldwide cattle samples. Several novel results were obtained. First, cattle and humans share a high similarity in LD and haplotype block structure on the scale of 1-100 kb. Second, unexpected similarities in haplotype block structure between dairy and beef breeds make them non-differentiable. Finally, our findings suggest that ~30,000 uniformly distributed SNPs would be necessary to construct a complete genome LD map in Bos taurus breeds, and ~580,000 SNPs would be necessary to characterize the haplotype block structure across the complete cattle genome.


Subject(s)
Algorithms; Cattle/genetics; Genome/genetics; Haplotypes; Animals; Breeding; Cattle/classification; Cluster Analysis; Female; Gene Frequency; Genotype; Linkage Disequilibrium; Male; Phylogeny; Polymorphism, Single Nucleotide
9.
Genetics ; 176(1): 685-96, 2007 May.
Article in English | MEDLINE | ID: mdl-17339218

ABSTRACT

The first genetic transcript map of the soybean genome was created by mapping one SNP in each of 1141 genes in one or more of three recombinant inbred line mapping populations, thus providing a picture of the distribution of genic sequences across the mapped portion of the genome. Single-nucleotide polymorphisms (SNPs) were discovered via the resequencing of sequence-tagged sites (STSs) developed from expressed sequence tag (EST) sequence. From an initial set of 9459 polymerase chain reaction primer sets designed to a diverse set of genes, 4240 STSs were amplified and sequenced in each of six diverse soybean genotypes. In the resulting 2.44 Mbp of aligned sequence, a total of 5551 SNPs were discovered, including 4712 single-base changes and 839 indels, for an average nucleotide diversity of θ = 0.000997. The analysis of the observed genetic distances between adjacent genes vs. the theoretical distribution based upon the assumption of a random distribution of genes across the 20 soybean linkage groups clearly indicated that genes were clustered. Of the 1141 genes, 291 mapped to 72 of the 112 gaps of 5-10 cM in the preexisting simple sequence repeat (SSR)-based map, while 111 genes mapped in 19 of the 26 gaps >10 cM. The addition of 1141 sequence-based genic markers to the soybean genome map will provide an important resource to soybean geneticists for quantitative trait locus discovery and map-based cloning, as well as to soybean breeders who increasingly depend upon marker-assisted selection in cultivar improvement.
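
The nucleotide diversity figure in this abstract can be checked with a short worked example. The abstract does not say which estimator was used; the sketch below assumes Watterson's estimator, treats all 5551 polymorphisms as segregating sites, takes the six genotypes as six sequences, and rounds the aligned length to 2.44 Mbp. Under those assumptions the result lands close to the reported value.

```python
def watterson_theta(segregating_sites, n_sequences, aligned_length):
    """Watterson's estimator of nucleotide diversity per site.

    theta_W = S / (a_n * L), where a_n = sum_{i=1}^{n-1} 1/i.
    """
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * aligned_length)


# Inputs taken from the abstract; using this estimator is an assumption.
theta = watterson_theta(
    segregating_sites=5551,    # SNPs reported (single-base changes + indels)
    n_sequences=6,             # six soybean genotypes
    aligned_length=2_440_000,  # 2.44 Mbp of aligned sequence
)
print(f"theta_W = {theta:.6f}")
```

The small remaining difference from 0.000997 is consistent with the aligned length being rounded in the abstract.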


Subject(s)
Chromosome Mapping; Genes, Plant/genetics; Glycine max/genetics; Haplotypes/genetics; Polymorphism, Single Nucleotide/genetics; RNA, Plant/genetics; Transcription, Genetic/genetics; Base Sequence; DNA Primers; Databases, Nucleic Acid; Exons/genetics; Expressed Sequence Tags; Genetic Heterogeneity; Genetic Linkage; Introns/genetics; Minisatellite Repeats/genetics; Polymorphism, Restriction Fragment Length; RNA, Messenger/genetics; Sequence Tagged Sites
10.
BMC Genet ; 9: 37, 2008 May 20.
Article in English | MEDLINE | ID: mdl-18492244

ABSTRACT

BACKGROUND: Analyses of population structure and breed diversity have provided insight into the origin and evolution of cattle. Previously, these studies used a low density of microsatellite markers; however, with the large number of single nucleotide polymorphism markers now available, it is possible to perform genome-wide population genetic analyses in cattle. In this study, we used a high-density panel of SNP markers to examine population structure and diversity among eight cattle breeds sampled from Bos indicus and Bos taurus. RESULTS: Two thousand six hundred and forty-one single nucleotide polymorphisms (SNPs) spanning all of the bovine autosomal genome were genotyped in Angus, Brahman, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black, Limousin and Nelore cattle. Population structure was examined using the linkage model in the program STRUCTURE and Fst estimates were used to construct a neighbor-joining tree to represent the phylogenetic relationship among these breeds. CONCLUSION: The whole-genome SNP panel identified several levels of population substructure in the set of examined cattle breeds. The greatest level of genetic differentiation was detected between the Bos taurus and Bos indicus breeds. When the Bos indicus breeds were excluded from the analysis, genetic differences among beef versus dairy and European versus Asian breeds were detected among the Bos taurus breeds. Exploration of the number of SNP loci required to differentiate between breeds showed that for 100 SNP loci, individuals could only be correctly clustered into breeds 50% of the time; thus, a large number of SNP markers are required to replace the 30 microsatellite markers that are currently commonly used in genetic diversity studies.


Subject(s)
Cattle/genetics; Genome/genetics; Polymorphism, Single Nucleotide; Analysis of Variance; Animals; Crosses, Genetic; Genetic Markers; Genetics, Population; Genotype; Phylogeny
11.
Genomics Proteomics Bioinformatics ; 6(3-4): 129-43, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19329064

ABSTRACT

A systematic phylogenetic footprinting approach was performed to identify conserved transcription factor binding sites (TFBSs) in mammalian promoter regions using human, mouse and rat sequence alignments. We found that the score distributions of most binding site models did not follow the Gaussian distribution required by many statistical methods. Therefore, we performed an empirical test to establish the optimal threshold for each model. We gauged our computational predictions by comparing with previously known TFBSs in the PCK1 gene promoter of the cytosolic isoform of phosphoenolpyruvate carboxykinase, and achieved a sensitivity of 75% and a specificity of approximately 32%. Almost all known sites overlapped with predicted sites, and several new putative TFBSs were also identified. We validated a predicted SP1 binding site in the control of PCK1 transcription using gel shift and reporter assays. Finally, we applied our computational approach to the prediction of putative TFBSs within the promoter regions of all available RefSeq genes. Our full set of TFBS predictions is freely available at http://bfgl.anri.barc.usda.gov/tfbsConsSites.


Subject(s)
Intracellular Signaling Peptides and Proteins/genetics; Phosphoenolpyruvate Carboxykinase (GTP)/genetics; Promoter Regions, Genetic/genetics; Regulatory Sequences, Nucleic Acid/genetics; Algorithms; Amino Acid Sequence; Animals; Base Sequence; Binding Sites/genetics; Cell Line, Tumor; Computational Biology/methods; Conserved Sequence; Electrophoretic Mobility Shift Assay; Humans; Luciferases/genetics; Luciferases/metabolism; Mice; Normal Distribution; Oligonucleotides/genetics; Oligonucleotides/metabolism; Protein Binding; Rats; Recombinant Fusion Proteins/genetics; Recombinant Fusion Proteins/metabolism; Reproducibility of Results; Sp1 Transcription Factor/genetics; Sp1 Transcription Factor/metabolism; Transcription Factors/metabolism; Transfection
12.
Physiol Genomics ; 29(1): 35-43, 2007 Mar 14.
Article in English | MEDLINE | ID: mdl-17105755

ABSTRACT

MicroRNAs are small approximately 22 nucleotide-long noncoding RNAs capable of controlling gene expression by inhibiting translation. Alignment of human microRNA stem-loop sequences (mir) against a recent draft sequence assembly of the bovine genome resulted in identification of 334 predicted bovine mir. We sequenced five tissue-specific cDNA libraries derived from the small RNA fractions of bovine embryo, thymus, small intestine, and lymph node to validate these predictions and identify new mir. This strategy combined with comparative sequence analysis identified 129 sequences that corresponded to mature microRNAs (miR). A total of 107 sequences aligned to known human mir, and 100 of these matched expressed miR. The other seven sequences represented novel miR expressed from the complementary strand of previously characterized human mir. The 22 sequences without matches displayed characteristic mir secondary structures when folded in silico, and 10 of these retained sequence conservation with other vertebrate species. Expression analysis based on sequence identity counts revealed that some miR were preferentially expressed in certain tissues, while bta-miR-26a and bta-miR-103 were prevalent in all tissues examined. These results support the premise that species differences in regulation of gene expression by miR occur primarily at the level of expression and processing.


Subject(s)
Embryo, Mammalian/metabolism; Gene Expression Profiling; Gene Expression Regulation; MicroRNAs/genetics; MicroRNAs/metabolism; Animals; Base Pairing; Base Sequence; Cattle; Cluster Analysis; Computational Biology; Conserved Sequence/genetics; Gene Library; Genomics/methods; Molecular Sequence Data; Reverse Transcriptase Polymerase Chain Reaction; Sequence Alignment; Sequence Analysis, DNA
13.
BMC Genet ; 8: 74, 2007 Oct 25.
Article in English | MEDLINE | ID: mdl-17961247

ABSTRACT

BACKGROUND: Bovine whole genome linkage disequilibrium maps were constructed for eight breeds of cattle. These data provide fundamental information concerning bovine genome organization, which will allow the design of studies to associate genetic variation with economically important traits, and also provide background information concerning the extent of long range linkage disequilibrium in cattle. RESULTS: Linkage disequilibrium was assessed using r² among all pairs of syntenic markers within eight breeds of cattle from the Bos taurus and Bos indicus subspecies. Bos taurus breeds included Angus, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black and Limousin while Bos indicus breeds included Brahman and Nelore. Approximately 2670 markers spanning the entire bovine autosomal genome were used to estimate pairwise r² values. We found that the extent of linkage disequilibrium is no more than 0.5 Mb in these eight breeds of cattle. CONCLUSION: Linkage disequilibrium in cattle has previously been reported to extend several tens of centimorgans. Our results, based on a much larger sample of marker loci and across eight breeds of cattle, indicate that in cattle linkage disequilibrium persists over much more limited distances. Our findings suggest that 30,000-50,000 loci will be needed to conduct whole genome association studies in cattle.
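
The pairwise r² statistic mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes phased haplotypes at two biallelic markers, with alleles coded 0/1; the abstract does not describe how haplotype frequencies were obtained, and the function name is hypothetical.

```python
def ld_r2(haplotypes):
    """Compute r^2 between two biallelic loci from phased haplotypes.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    with alleles coded as 0 or 1 (an assumption of this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # freq. of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n    # freq. of allele 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    return d * d / denom


# Toy haplotype sets, invented for illustration.
complete_ld = ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```

With unphased genotype data, haplotype frequencies would first have to be estimated, which this sketch does not attempt.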


Subject(s)
Cattle/genetics; Chromosome Mapping/methods; Genome; Linkage Disequilibrium; Animals; Gene Frequency; Genetic Markers; Haplotypes; Polymorphism, Single Nucleotide; Quantitative Trait Loci
14.
BMC Bioinformatics ; 7: 4, 2006 Jan 06.
Article in English | MEDLINE | ID: mdl-16398931

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNP) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. RESULTS: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNP generated by sequencing 1973 STS (sequence tag sites) (12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNP were generated similarly from 1359 additional STS (8 Mb). SNP from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV) (i.e., fraction of candidate SNP being real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. CONCLUSION: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study indicate that a trained ML classifier can significantly reduce human intervention and in this case achieved a 5-10 fold enhanced productivity. The optimized feature set and ML framework can also be applied to all polymorphism discovery software. ML support software is written in Perl and can be easily integrated into an existing SNP discovery pipeline.
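
The positive predictive value quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch assumes that all 18,390 test candidates count as PolyBayes predictions, so that true positives equal candidates minus the reported false positives; that derivation is an inference from the abstract, not a figure it states.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): the fraction of predicted SNPs that are real."""
    return true_positives / (true_positives + false_positives)


# Figures from the abstract.
test_candidates = 18_390
polybayes_fp = 16_955

# Assumption: every test candidate is a PolyBayes prediction, so the
# remaining candidates are true positives.
polybayes_tp = test_candidates - polybayes_fp

polybayes_ppv = positive_predictive_value(polybayes_tp, polybayes_fp)
print(f"PolyBayes PPV = {polybayes_ppv:.1%}")
```

The abstract gives the ML classifier's false positives (249) and PPV (84.8%) but not its exact true-positive count, so the same check is not repeated for it here.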


Subject(s)
Computational Biology/methods; Polymorphism, Single Nucleotide; Algorithms; Artificial Intelligence; Base Sequence; Binding Sites; Expressed Sequence Tags; Genetic Variation; Genome, Human; Genome, Plant; Haplotypes; Homozygote; Humans; Molecular Sequence Data; Polymorphism, Genetic; Predictive Value of Tests; Sequence Tagged Sites; Software; Glycine max/genetics; Transcription Factors
15.
BMC Bioinformatics ; 7: 468, 2006 Oct 23.
Article in English | MEDLINE | ID: mdl-17059604

ABSTRACT

BACKGROUND: Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace other traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs or microsatellites) for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied for analyzing sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. This package was developed on the UNIX/Linux platform, is written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated as a part of this package as an optional feature. The SNP-PHAGE package is being made available open source at http://bfgl.anri.barc.usda.gov/ML/snp-phage/.
CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and this software can serve as a starting point for groups interested in developing SNP markers.


Subject(s)
Chromosome Mapping/methods; DNA Mutational Analysis/methods; Polymorphism, Single Nucleotide/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Software; User-Computer Interface; Base Sequence; Molecular Sequence Data
16.
BMC Genomics ; 7: 140, 2006 Jun 07.
Article in English | MEDLINE | ID: mdl-16759380

ABSTRACT

BACKGROUND: Approximately 11 Mb of finished, high-quality genomic sequence was sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. RESULTS: Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32-0.37 changes/site) for pairwise comparisons among cattle, dog and human; however, substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0-2.2 × 10⁻⁹ changes/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10⁻⁹ changes/site/year) was approximately half of the overall rate (1.9-2.0 × 10⁻⁹ changes/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution than dog, and that this difference is about 6%. CONCLUSION: This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. These data are expected to serve as a baseline for future mammalian molecular evolution studies.
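
The abstract reports pairwise divergences (changes/site) and substitution rates (changes/site/year) but does not state how one was converted to the other. A minimal sketch of the standard relationship, rate = divergence / (2 × time since the split), is given below. The function name and the 90-million-year divergence time are illustrative assumptions, not values from the study.

```python
def substitution_rate(divergence_per_site: float, divergence_time_years: float) -> float:
    """Per-lineage substitution rate (changes/site/year) from pairwise divergence.

    The factor of 2 reflects that both lineages accumulate changes
    independently after they split from their common ancestor.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return divergence_per_site / (2.0 * divergence_time_years)


# Hypothetical inputs for illustration only: 0.36 changes/site lies inside the
# reported 0.32-0.37 range; the 90-million-year split is an assumed value,
# not one taken from the abstract.
example_rate = substitution_rate(0.36, 90e6)
print(f"{example_rate:.2e} changes/site/year")
```

With these assumed inputs the result falls near the overall rate quoted in the abstract, which shows only that the reported divergences and rates are of mutually consistent magnitude.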


Subject(s)
Genetic Variation; Sequence Alignment; Animals; Base Sequence; Cattle; Dogs; Female; Genome; Genomics; Humans; Male; Mutation; Pan troglodytes; Rats; Sex Characteristics
17.
Mol Biotechnol ; 30(2): 143-50, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15920284

ABSTRACT

Intraepithelial lymphocytes (IELs) play a critical role in the protective immune response to intestinal pathogens such as Eimeria, the etiologic agent of avian coccidiosis. A list of genes expressed by intestinal IELs of Eimeria-infected chickens was compiled using the expressed sequence tag (EST) strategy. The 14,409 ESTs formed 1851 clusters and 7595 singletons, representing 9446 unique genes in the data set. Comparison of the sequence data with chicken DNA sequences in GenBank identified 125 novel clones. This EST library will provide a valuable resource for profiling global gene expression in normal and pathogen-infected chickens and for identifying additional unique immune-related genes.


Subject(s)
Chickens/parasitology; Coccidiosis/veterinary; Eimeria; Expressed Sequence Tags; Intestinal Mucosa/immunology; Lymphocytes/metabolism; Poultry Diseases/parasitology; Animals; Chickens/genetics; Chickens/immunology; Coccidiosis/genetics; Coccidiosis/immunology; Gene Expression Profiling; Interleukin-16/genetics; Interleukin-17/genetics; Intestinal Mucosa/parasitology; Poultry Diseases/genetics; Poultry Diseases/immunology
18.
PLoS One ; 9(7): e103046, 2014.
Article in English | MEDLINE | ID: mdl-25050984

ABSTRACT

Genomic structural variations represent an important source of genetic variation in mammalian genomes and are therefore commonly related to phenotypic expression. In this work, ~770,000 single nucleotide polymorphism genotypes from 506 animals of 19 cattle breeds were analyzed. A simple LD-based definition of structural variation was formulated, and a genome-wide analysis was performed. After applying quality control filters, we calculated short-range (≤ 100 kb) linkage disequilibrium (r²) for each breed and each chromosome. We sorted SNP pairs by distance and obtained a set of LD means (called the expected means) using bins of 5 kb. Across the 19 breeds we identified 15,246 segments of at least 1 kb, each consisting of at least 3 adjacent SNPs such that, for each SNP, the r² values with its neighbors within 100 kb to its right were all larger than, or all smaller than, the corresponding expected mean, and the P-values were significant after Benjamini-Hochberg multiple testing correction. In addition, to restrict the analysis to regions with homogeneously distributed markers, we considered only SNPs having at least 15 SNP neighbors within 100 kb. We defined such segments as structural variations. By grouping all variations across all animals in the sample we defined 9,146 regions, involving a total of 53,137 SNPs and representing 6.40% (160.98 Mb) of the bovine genome. The identified structural variations covered 3,109 genes. Clustering analysis showed the relatedness of breeds according to the geographic region in which they are evolving. In summary, we present an analysis of structural variations based on deviation from the expected short-range LD between SNPs in the bovine genome. With an intuitive and simple definition based only on SNP data, it was possible to discern the closeness of breeds, which grouped by the geographic region in which they are evolving.
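
The "expected means" step described above (pairwise LD for SNPs up to 100 kb apart, averaged in 5 kb distance bins) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and r² is approximated as the squared Pearson correlation of 0/1/2 genotype dosages, which is a common choice for unphased data but is an assumption here.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def genotype_r2(a, b):
    """Squared Pearson correlation of two genotype dosage vectors (0/1/2)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic SNP: r2 is undefined, treated as 0 in this sketch
    return cov * cov / (var_a * var_b)


def expected_ld_by_bin(positions, genotypes, max_dist=100_000, bin_size=5_000):
    """Mean r2 of SNP pairs at most max_dist bp apart, keyed by distance bin index."""
    bins = defaultdict(list)
    for i, j in combinations(range(len(positions)), 2):
        dist = abs(positions[j] - positions[i])
        if dist <= max_dist:
            bins[dist // bin_size].append(genotype_r2(genotypes[i], genotypes[j]))
    return {b: mean(values) for b, values in sorted(bins.items())}


# Toy data (made up): three SNPs on one chromosome, four animals each.
positions = [1_000, 3_000, 9_000]
genotypes = [[0, 1, 2, 0], [0, 1, 2, 0], [1, 1, 1, 1]]
print(expected_ld_by_bin(positions, genotypes))
```

The all-pairs loop is quadratic in the number of SNPs, so a real analysis on ~770,000 markers would need a sliding window over position-sorted SNPs. The subsequent significance testing and Benjamini-Hochberg correction are not shown.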


Subject(s)
Breeding; Cattle/genetics; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Animals; Female; Gene Frequency; Genome; Genome-Wide Association Study; Genotype; Haplotypes; Male
19.
Genome Biol ; 11(10): R102, 2010.
Article in English | MEDLINE | ID: mdl-20961407

ABSTRACT

BACKGROUND: A comprehensive transcriptome survey, or gene atlas, provides information essential for a complete understanding of the genomic biology of an organism. We present an atlas of RNA abundance for 92 adult, juvenile and fetal cattle tissues and three cattle cell lines. RESULTS: The Bovine Gene Atlas was generated from 7.2 million unique digital gene expression tag sequences (300.2 million total raw tag sequences), of which 1.59 million unique tag sequences mapped to the draft bovine genome, accounting for 85% of the total raw tag abundance. Filtering these tags yielded 87,764 unique tag sequences that mapped unambiguously to 16,517 annotated protein-coding loci in the draft genome, accounting for 45% of the total raw tag abundance. Clustering of tissues based on tag abundance profiles generally confirmed ontology classification based on anatomy. There were 5,429 constitutively expressed loci and 3,445 constitutively expressed unique tag sequences mapping outside annotated gene boundaries; the latter represent a resource for enhancing current gene models. Physical measures such as inferred transcript length or antisense tag abundance identified tissues with atypical transcriptional tag profiles. We report for the first time the tissue-specific variation in the proportion of mitochondrial transcriptional tag abundance. CONCLUSIONS: The Bovine Gene Atlas is the deepest and broadest transcriptome survey of any livestock genome to date. Commonalities and variation in sense and antisense transcript tag profiles identified in different tissues facilitate the examination of the relationship between gene expression, tissue, and gene function.


Subject(s)
Cattle/genetics; Expressed Sequence Tags; Genome; Molecular Sequence Annotation; Animals; Cattle/classification; Cell Line; Chromosome Mapping; Female; Gene Expression; Gene Expression Profiling; Genes, Mitochondrial; Male; Molecular Sequence Annotation/methods; Proteomics
20.
PLoS One ; 4(4): e5350, 2009.
Article in English | MEDLINE | ID: mdl-19390634

ABSTRACT

The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in humans has spurred interest in the use of large-scale, high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with a median interval of 37 kb and a maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNPs are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate-quality draft sequence. Aspects of the approach can be exploited in species that lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle.
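
Two of the summary statistics quoted above, the inter-marker interval distribution (median and largest gap) and the per-SNP minor allele frequency, are simple to compute once marker positions and genotype calls are available. The sketch below shows one way to do so. The function names and the 0/1/2 allele-count encoding with `None` for missing calls are assumptions made for illustration; the spacing algorithm itself is not described in the abstract and is not reproduced here.

```python
from statistics import median


def interval_summary(positions):
    """Return (median gap, largest gap) between adjacent markers on one chromosome."""
    ordered = sorted(positions)
    gaps = [right - left for left, right in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two marker positions")
    return median(gaps), max(gaps)


def minor_allele_frequency(genotypes):
    """MAF from per-animal counts of one allele (0, 1 or 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


# Toy data (made up): four markers and five animals, one with a missing call.
print(interval_summary([0, 30_000, 70_000, 100_000]))
print(minor_allele_frequency([0, 1, 2, 2, None]))
```

A SNP counts as polymorphic within a breed when its MAF in that breed is greater than zero, which is how a per-breed polymorphic count like the one reported could be tallied.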


Subject(s)
Cattle/genetics; Computational Biology/methods; Genotype; Polymorphism, Single Nucleotide/genetics; Animals; Chromosomes, Artificial, Bacterial/genetics; Gene Frequency; Genome; Genome-Wide Association Study; Quantitative Trait Loci