Results 1 - 11 of 11
1.
Nucleic Acids Res ; 51(8): e46, 2023 May 08.
Article in English | MEDLINE | ID: mdl-36912074

ABSTRACT

16S rRNA gene sequence clustering is an important tool in characterizing the diversity of microbial communities. As 16S rRNA gene data sets grow in size, existing sequence clustering algorithms increasingly become an analytical bottleneck. Part of this bottleneck is due to the substantial computational cost expended on small clusters and singleton sequences. We propose an iterative sampling-based 16S rRNA gene sequence clustering approach that targets the largest clusters in the data set, allowing users to stop the clustering process when sufficient clusters are available for the specific analysis being targeted. We describe a probabilistic analysis of the iterative clustering process that supports the intuition that the clustering process identifies the larger clusters in the data set first. Using real data sets of 16S rRNA gene sequences, we show that the iterative algorithm, coupled with an adaptive sampling process and a mode-shifting strategy for identifying cluster representatives, substantially speeds up the clustering process while being effective at capturing the large clusters in the data set. The experiments also show that SCRAPT (Sample, Cluster, Recruit, AdaPt and iTerate) is able to produce operational taxonomic units that are less fragmented than those produced by the popular tools UCLUST, CD-HIT and DNACLUST. The algorithm is implemented in the open-source package SCRAPT. The source code used to generate the results presented in this paper is available at https://github.com/hsmurali/SCRAPT.
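The sample, cluster, recruit and iterate loop described in this abstract can be illustrated with a toy sketch. This is not the SCRAPT implementation: the function names, the Hamming-based similarity on equal-length strings, the fixed sample size and the greedy choice of centers are all assumptions made for illustration, and the adaptive sampling and mode-shifting steps are omitted.

```python
import random


def hamming_similarity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def iterative_sample_cluster(seqs, threshold=0.9, sample_size=3,
                             max_iters=10, seed=0):
    """Toy sample/cluster/recruit loop (hypothetical, not SCRAPT itself).

    Returns (clusters, remaining): the clusters found so far and the
    sequences still unclustered when the loop stopped.
    """
    rng = random.Random(seed)
    remaining = list(seqs)
    clusters = []
    for _ in range(max_iters):
        if not remaining:
            break
        # Sample: draw a small random subset of the unclustered sequences.
        sample = rng.sample(remaining, min(sample_size, len(remaining)))
        # Cluster: greedily keep sampled sequences that are far from
        # every center chosen so far in this iteration.
        centers = []
        for s in sample:
            if all(hamming_similarity(s, c) < threshold for c in centers):
                centers.append(s)
        # Recruit: assign each unclustered sequence to the first center
        # it matches; everything else waits for the next iteration.
        recruited = {c: [] for c in centers}
        leftover = []
        for s in remaining:
            for c in centers:
                if hamming_similarity(s, c) >= threshold:
                    recruited[c].append(s)
                    break
            else:
                leftover.append(s)
        clusters.extend(recruited.values())
        remaining = leftover
    return clusters, remaining
```

Because a random sample is more likely to hit members of abundant groups, large clusters tend to be recruited in early iterations, and the caller can stop (here via `max_iters`) once enough clusters are available.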


Subject(s)
Algorithms , Software , RNA, Ribosomal, 16S/genetics , Genes, rRNA , Cluster Analysis
2.
BMC Genomics ; 24(1): 186, 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37024818

ABSTRACT

BACKGROUND: Understanding the evolutionary forces related to climate change that have shaped genetic variation within species has long been a fundamental pursuit in biology. In this study, we generated whole-genome sequence (WGS) data from 65 cross-bred and 45 Mongolian cattle. Together with 62 whole-genome sequences from world-wide cattle populations, we estimated the genetic diversity and population genetic structure of cattle populations. In addition, we performed comparative population genomics analyses to explore the genetic basis underlying variation in the adaptation to cold climate and immune response in cross-bred cattle located in the cold region of China. To elucidate genomic signatures that underlie adaptation to cold climate, we applied three statistical measures, fixation index (FST), log2 nucleotide diversity (θπ ratio) and cross population composite likelihood ratio (XP-CLR), and further investigated the results to identify genomic regions under selection for cold adaptation and immune response-related traits. RESULTS: By generating WGS data, we investigated the population genetic structure and phylogenetic relationship of studied cattle populations. The results revealed clustering of cattle groups in agreement with their geographic distribution. We detected noticeable genetic diversity between indigenous cattle ecotypes and commercial populations. Analysis of population structure demonstrated evidence of shared genetic ancestry between the studied cross-bred population and both Red-Angus and Mongolian breeds. Among all studied cattle populations, the highest and lowest levels of linkage disequilibrium (LD) per Kb were detected in Holstein and Rashoki populations (ranging from ~0.54 to 0.73, respectively). Our search for potential genomic regions under selection in cross-bred cattle revealed several candidate genes related to immune response and cold shock proteins on multiple chromosomes. 
We identified some adaptive introgression genes with greater than expected contributions from Mongolian ancestry into Mongolian x Red Angus composites, such as TRPM8, NMUR1, PRKAA2, SMTNL2 and OXR1, that are involved in energy metabolism and metabolic homeostasis. In addition, we detected some candidate genes probably associated with immune response-related traits. CONCLUSION: The study identified candidate genes involved in cold adaptation and immune response in cross-bred cattle, including new genes or gene pathways putatively involved in these adaptations. The identification of these genes may clarify the molecular basis underlying adaptation to extreme climates, and as such they might be used in cattle breeding programs to select more efficient breeds for cold climate regions.
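Two of the selection statistics named in this abstract can be illustrated at a single biallelic site. The sketch below assumes the simplest textbook form of the fixation index, FST = (H_T - H_S) / H_T, for two equally weighted populations, and a plain log2 ratio of two nucleotide diversity values; the abstract does not state which estimators or window sizes the authors used, so the function names and formulas here are illustrative assumptions.

```python
import math


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Textbook FST = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    This is an assumed, simplified estimator, not the one from the paper.
    """
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    h_t = expected_heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0  # site is monomorphic overall, so no differentiation
    return (h_t - h_s) / h_t


def log2_diversity_ratio(pi_reference, pi_target):
    """log2 of the ratio of nucleotide diversity between two populations."""
    return math.log2(pi_reference / pi_target)
```

Identical allele frequencies give an FST of 0, and alternatively fixed alleles give an FST of 1; a large positive log2 ratio indicates reduced diversity in the target population relative to the reference.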


Subject(s)
Genome , Genomics , Cattle/genetics , Animals , Phylogeny , Genomics/methods , Phenotype , Acclimatization/genetics , Polymorphism, Single Nucleotide , Selection, Genetic
3.
J Anim Breed Genet ; 140(5): 473-484, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37014360

ABSTRACT

Many quantitative traits measured in breeding programs are genetically correlated. The genetic correlations between the traits indicate that the measurement of one trait carries information on others. To benefit from this information, multi-trait genomic prediction (MTGP) is preferable. However, MTGP is more difficult to implement than single-trait genomic prediction (STGP), and it is even more challenging when the goal is to exploit not only the information on other traits but also the information on ungenotyped animals. This can be accomplished using either single-step or multistep methods. The single-step method was implemented as a single-step genomic best linear unbiased prediction (ssGBLUP) approach using a multi-trait model. Here, we examined a multistep analysis based on an approach called "Absorption" to achieve this goal. The Absorption approach absorbed all available information, including the phenotypic information on ungenotyped animals as well as the information on other traits if applicable, into the mixed model equations of genotyped animals. The multistep analysis consisted of (1) applying the Absorption approach, which exploits all available information, and (2) implementing genomic BLUP (GBLUP) prediction on the absorbed dataset. In this study, the ssGBLUP and multistep analyses were applied to 5 traits in Duroc pigs: slaughter percentage, feed consumption from 40 to 120 kg (FC40_120), days of growth from 40 to 120 kg (D40_120), age at 40 kg (A40) and lean meat percentage. The results showed that MTGP yielded higher accuracy than STGP, on average 0.057 higher for the multistep method and 0.045 higher for ssGBLUP. The multistep method achieved prediction accuracy similar to that of ssGBLUP. However, the prediction bias of the multistep method was in general lower than that of ssGBLUP.


Subject(s)
Genomics , Meat , Animals , Swine , Phenotype , Genotype
4.
Genet Sel Evol ; 47: 9, 2015 Feb 25.
Article in English | MEDLINE | ID: mdl-25888184

ABSTRACT

BACKGROUND: GBLUP (genomic best linear unbiased prediction) uses high-density single nucleotide polymorphism (SNP) markers to construct genomic identity-by-state (IBS) relationship matrices. However, identity-by-descent (IBD) relationships can be accurately calculated for extremely sparse markers. Here, we compare the accuracy of prediction of genome-wide breeding values (GW-BV) for a sib-evaluated trait in a typical aquaculture population, assuming either IBS or IBD genomic relationship matrices, and by varying marker density and size of the training dataset. METHODS: A simulation study was performed, assuming a population with strong family structure over three subsequent generations. Traditional and genomic BLUP were used to estimate breeding values, the latter using either IBS or IBD genomic relationship matrices, with marker densities ranging from 10 to ~1200 SNPs/Morgan (M). Heritability ranged from 0.1 to 0.8, and phenotypes were recorded on 25 to 45 sibs per full-sib family (50 full-sib families). Models were compared based on their predictive ability (accuracy) with respect to true breeding values of unphenotyped (albeit genotyped) sibs in the last generation. RESULTS: As expected, genomic prediction had greater accuracy compared to pedigree-based prediction. At the highest marker density, genomic prediction based on IBS information (IBS-GS) was slightly superior to that based on IBD information (IBD-GS), while at lower densities (≤100 SNPs/M), IBD-GS was more accurate. At the lowest densities (10 to 20 SNPs/M), IBS-GS was even outperformed by the pedigree-based model. Accuracy of IBD-GS was stable across marker densities performing well even down to 10 SNPs/M (2.5 to 6.1% reduction in accuracy compared to ~1200 SNPs/M). Loss of accuracy due to reduction in the size of training datasets was moderate and similar for both genomic prediction models. 
The relative superiority of (high-density) IBS-GS over IBD-GS was more pronounced for traits with a low heritability. CONCLUSIONS: Using dense markers, GBLUP based on either IBD or IBS relationship matrices proved to perform better than a pedigree-based model. However, accuracy of IBS-GS declined rapidly with decreasing marker densities, and was even outperformed by a traditional pedigree-based model at the lowest densities. In contrast, the accuracy of IBD-GS was very stable across marker densities.
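As an illustration of what a marker-based (IBS) genomic relationship matrix looks like, the sketch below builds G = ZZ' / (2 Σ p(1-p)) from genotypes coded as 0, 1 or 2 copies of one allele, with Z the genotypes centered by twice the allele frequency. This is one widely used construction (often attributed to VanRaden); the abstract does not say which formula the authors used, so treat the function name and this choice as assumptions.

```python
def genomic_relationship_matrix(genotypes):
    """Build an IBS-style genomic relationship matrix from 0/1/2 genotypes.

    genotypes: list of rows, one per animal, each a list of allele counts.
    Allele frequencies are estimated from the same animals (an assumption;
    base-population frequencies are often preferred when available).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Scaling factor: expected variance of allele counts summed over markers.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by twice the allele frequency.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]
```

With very few markers the entries of such a matrix become noisy, which is consistent with the abstract's finding that IBS-based prediction degrades at low marker densities.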


Subject(s)
Genomics/methods , Models, Genetic , Polymorphism, Single Nucleotide , Selection, Genetic/genetics , Animals , Aquaculture/methods , Breeding , Computer Simulation , Genome , Genotype , Pedigree , Phenotype , Quantitative Trait Loci/genetics , Siblings
5.
Genet Sel Evol ; 46: 64, 2014 Oct 04.
Article in English | MEDLINE | ID: mdl-25284459

ABSTRACT

BACKGROUND: Genomic prediction is based on the accurate estimation of the genomic relationships among and between training animals and selection candidates in order to obtain accurate estimates of the genomic estimated breeding values (GEBV). Various methods have been used to predict GEBV based on population-wide linkage disequilibrium relationships (G_IBS) or sometimes on linkage analysis relationships (G_LA). Here, we propose a novel method to predict GEBV based on a genomic relationship matrix using runs of homozygosity (G_ROH). Runs of homozygosity were used to derive probabilities of multi-locus identity by descent chromosome segments. The accuracy and bias of the prediction of GEBV using G_ROH were compared to those using G_IBS and G_LA. Comparisons were performed using simulated datasets derived from a random pedigree and a real pedigree of Italian Brown Swiss bulls. The comparison of accuracies of GEBV was also performed on data from 1086 Italian Brown Swiss dairy cattle. RESULTS: Simulations with various thresholds of minor allele frequency for markers and quantitative trait loci showed that G_ROH achieved consistently more accurate GEBV (0 to 4 percentage points higher) than G_IBS and G_LA. The bias of GEBV prediction for simulated data was higher based on the real pedigree than based on a random pedigree. In the analyses with real data, G_ROH and G_LA had similar accuracies. However, G_LA achieved a higher accuracy when the prediction was done on the youngest animals. The G_IBS matrices calculated with and without standardized marker genotypes resulted in similar accuracies. CONCLUSIONS: The present study proposes G_ROH as a novel method to estimate genomic relationship matrices and predict GEBV based on runs of homozygosity, and shows that it can result in accuracies of GEBV prediction higher than or similar to those of G_LA, except for the real data analysis with validation of young animals. Compared to G_IBS, G_ROH resulted in more accurate GEBV predictions.
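The starting point of the method in this abstract is locating runs of homozygosity along a chromosome. A minimal sketch of that first step is given below, assuming genotypes coded 0/1/2 (with 1 heterozygous) and a minimum run length counted in markers. The function name and the threshold are hypothetical, and the sketch does not attempt the paper's subsequent step of deriving identity-by-descent probabilities from the runs.

```python
def find_runs_of_homozygosity(genotypes, min_length=3):
    """Return (start, end) index pairs, inclusive, of homozygous runs.

    genotypes: allele counts along one chromosome, where 0 and 2 are
    homozygous and 1 is heterozygous. Runs shorter than min_length
    markers are ignored (min_length is an illustrative threshold).
    """
    runs = []
    start = None  # index where the current homozygous run began
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        else:
            # A heterozygous marker ends the current run, if any.
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs
```

Real analyses usually also set thresholds in physical or genetic distance and tolerate occasional genotyping errors, neither of which this sketch handles.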


Subject(s)
Cattle/genetics , Gene Frequency/genetics , Genomics/methods , Homozygote , Animals , Breeding , Computer Simulation , Male , Models, Genetic , Pedigree , Polymorphism, Single Nucleotide , Quantitative Trait Loci
6.
ArXiv ; 2024 Mar 03.
Article in English | MEDLINE | ID: mdl-38903742

ABSTRACT

Metagenomic studies have primarily relied on de novo assembly for reconstructing genes and genomes from microbial mixtures. While reference-guided approaches have been employed in the assembly of single organisms, they have not been used in a metagenomic context. Here we describe the first effective approach for reference-guided metagenomic assembly that can complement and improve upon de novo metagenomic assembly methods for certain organisms. Such approaches will be increasingly useful as more genomes are sequenced and made publicly available.

7.
bioRxiv ; 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38168205

ABSTRACT

For decades, the 16S rRNA gene has been used to taxonomically classify prokaryotic species and to taxonomically profile microbial communities. The 16S rRNA gene has been criticized for being too conserved to differentiate between distinct species. We argue that the inability to differentiate between species is not a unique feature of the 16S rRNA gene. Rather, we observe the gradual loss of species-level resolution for other marker genes as the number of gene sequences increases in reference databases. We demonstrate this effect through the analysis of three commonly used databases of nearly-universal prokaryotic marker genes: the SILVA 16S rRNA gene database, the Genome Taxonomy Database (GTDB), and a set of 40 taxonomically-informative single-copy genes. Our results reflect a more fundamental property of the taxonomies themselves and have broad implications for bioinformatic analyses beyond taxonomic classification. Effective solutions for fine-level taxonomic classification require a more precise, and operationally-relevant, definition of the taxonomic labels being sought, and the use of combinations of genomic markers in the classification process. Importance: The use of reference databases for assigning taxonomic labels to genomic and metagenomic sequences is a fundamental bioinformatic task in the characterization of microbial communities. The increasing accessibility of high throughput sequencing has led to a rapid increase in the size and number of sequences in databases. This has been beneficial for improving our understanding of the global microbial genetic diversity. However, there is evidence that as the microbial diversity is more densely sampled, increasingly longer genomic segments are needed to differentiate between distinct species. The scientific community needs to be aware of this issue and needs to develop methods that better account for it when assigning taxonomic labels to metagenomic sequences from microbial communities.

8.
Genet Sel Evol ; 44: 28, 2012 Aug 31.
Article in English | MEDLINE | ID: mdl-22937985

ABSTRACT

BACKGROUND: It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. METHODS: The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. RESULTS: We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was as or more reliable than that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. CONCLUSIONS: Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. 
Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.


Subject(s)
Genome/genetics , Pedigree , Selection, Genetic , Animals , Cattle/genetics , Linkage Disequilibrium , Male , Models, Genetic , Models, Statistical , Polymorphism, Single Nucleotide , Population/genetics , Quantitative Trait Loci/genetics
9.
J Magn Reson ; 174(2): 188-99, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15862234

ABSTRACT

Three-way decomposition is a very versatile analysis tool with applications in a variety of protein NMR fields. It has been used to extract structural data from 3D NOESYs, to determine relaxation rates in large proteins, to identify ligand binding in screening for lead compounds, and to complement non-uniformly recorded (sparse) spectra. All applications so far concerned experimental data sets; it thus remains to address questions of accuracy and robustness of the method using simulated data where the correct answer is known. Systematic tests are presented for relaxation and NOESY data sets. Mixtures of real and synthetic data are used to allow control of various parameters and comparisons with correct reference data, while working with input that is as realistic as possible. The influence of the following parameters is evaluated: signal-to-noise, overlap of signals and the use of a regularization procedure within the algorithm. The main criteria used for the evaluation are accuracy and precision. It is shown that deterioration of accuracy is indicated by internal checks such as decrease of precision. Both with relaxation data and when interpreting NOESY spectra, three-way decomposition exhibits a robust behavior in situations with severe signal overlap and/or poor signal-to-noise, e.g., by avoiding false positives in the NOE shapes of NOESY decompositions. As a complement to this study, three-way decomposition is compared to other methods that achieve the same type of results.


Subject(s)
Azurin/chemistry , Nuclear Magnetic Resonance, Biomolecular , Signal Processing, Computer-Assisted , Algorithms
10.
J Tradit Chin Med ; 22(2): 83-6, 2002 Jun.
Article in English | MEDLINE | ID: mdl-12125499

ABSTRACT

Fifty-seven cases of nephrotic syndrome were treated with TCM decoctions as accessory treatment for prednisone and cyclophosphamide, and the effects were observed in a follow-up period of 5-15 years. The long-term complete remission rate of 68.4% and recurrence rate of 26.3% in the treatment group were respectively higher and lower than those in the control group (P < 0.01, and P < 0.01). The results suggested that the TCM decoctions were very helpful in treating this condition.


Subject(s)
Cyclophosphamide/therapeutic use , Drugs, Chinese Herbal/therapeutic use , Nephrotic Syndrome/drug therapy , Phytotherapy , Prednisone/therapeutic use , Adolescent , Adult , Diagnosis, Differential , Drug Combinations , Drug Therapy, Combination , Female , Follow-Up Studies , Humans , Male , Medicine, Chinese Traditional , Middle Aged
11.
Genetics ; 183(3): 1119-26, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19704013

ABSTRACT

Genomic Selection (GS) is a newly developed tool for the estimation of breeding values for quantitative traits through the use of dense markers covering the whole genome. For a successful application of GS, accuracy of the prediction of genomewide breeding value (GW-EBV) is a key issue to consider. Here we investigated the accuracy and possible bias of GW-EBV prediction, using real bovine SNP genotyping (18,991 SNPs) and phenotypic data of 500 Norwegian Red bulls. The study was performed on milk yield, fat yield, protein yield, first lactation mastitis traits, and calving ease. Three methods, best linear unbiased prediction (G-BLUP), Bayesian statistics (BayesB), and a mixture model approach (MIXTURE), were used to estimate marker effects, and their accuracy and bias were estimated by using cross-validation. The accuracies of the GW-EBV prediction were found to vary widely, between 0.12 and 0.62. G-BLUP gave overall the highest accuracy. We observed a strong relationship between the accuracy of the prediction and the heritability of the trait. GW-EBV prediction for production traits with high heritability achieved higher accuracy and also lower bias than for health traits with low heritability. To achieve a similar accuracy for the health traits, more records will probably be needed.
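The cross-validation procedure mentioned in this abstract can be sketched generically. The abstract does not say how accuracy and bias were computed; the sketch assumes a common convention in which accuracy is the Pearson correlation between predicted and observed values in held-out folds, and bias is judged from the slope of the regression of observed on predicted values (a slope of 1 meaning no bias). The `fit_predict` callback and all function names are hypothetical placeholders for whatever prediction model is being evaluated.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def cross_validate(features, targets, fit_predict, k=5):
    """k-fold cross-validation returning (accuracy, slope).

    fit_predict(train_x, train_y, test_x) is a user-supplied function
    (hypothetical here) that trains on the first two arguments and
    returns predictions for test_x.
    """
    n = len(targets)
    predicted = [None] * n
    for fold_start in range(k):
        fold = list(range(fold_start, n, k))  # every k-th record
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        preds = fit_predict(
            [features[i] for i in train],
            [targets[i] for i in train],
            [features[i] for i in fold],
        )
        for i, p in zip(fold, preds):
            predicted[i] = p
    return pearson(predicted, targets), regression_slope(predicted, targets)
```

Any model, from a single-marker regression to G-BLUP, can be plugged in through `fit_predict`, so the same loop compares methods on equal terms.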


Subject(s)
Breeding/methods , Cattle/genetics , Genome/genetics , Selection, Genetic , Algorithms , Animal Husbandry/methods , Animals , Bayes Theorem , Cattle/metabolism , Female , Genome-Wide Association Study , Genotype , Male , Milk/metabolism , Milk/standards , Norway , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results