Results 1 - 19 of 19
1.
J Anim Breed Genet ; 139(5): 489-501, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35698863

ABSTRACT

Pooling samples to derive group genotypes can enable the economically efficient use of commercial animals within genetic evaluations. To test a multivariate framework for genetic evaluations using pooled data, simulation was used to mimic a beef cattle population including two moderately heritable traits with varying genetic correlations, genotypes and pedigree data. There were 15 generations (n = 32,000; random selection and mating), and the last generation was subjected to genotyping through pooling. Missing records were induced in two ways: (a) sequential culling and (b) random missing records. Gaps in genotyping were also explored whereby genotyping occurred through generation 13 or 14. Pools of 1, 20, 50 and 100 animals were constructed randomly or by minimizing phenotypic variation. The EBV was estimated using a bivariate single-step genomic best linear unbiased prediction model. Pools of 20 animals constructed by minimizing phenotypic variation generally led to accuracies that were not different than using individual progeny data. Gaps in genotyping led to significantly different EBV accuracies (p < .05) for sires and dams born in the generation nearest the pools. Pooling of any size generally led to larger accuracies than no information from generation 15 regardless of the way missing records arose, the percentage of records available or the genetic correlation. Pooling to aid in the use of commercial data in genetic evaluations can be utilized in multivariate cases with varying relationships between the traits and in the presence of systematic and randomly missing phenotypes.


Subjects
Genome , Genomics , Animals , Cattle/genetics , Genotype , Models, Genetic , Pedigree , Phenotype
2.
BMC Med ; 15(1): 213, 2017 12 06.
Article in English | MEDLINE | ID: mdl-29207974

ABSTRACT

BACKGROUND: Diagnosis of monogenic as well as atypical forms of diabetes mellitus has important clinical implications for their specific diagnosis, prognosis, and targeted treatment. Single gene mutations that affect beta-cell function represent 1-2% of all cases of diabetes. However, phenotypic heterogeneity and lack of family history of diabetes can limit the diagnosis of monogenic forms of diabetes. Next-generation sequencing technologies provide an excellent opportunity to screen large numbers of individuals with a diagnosis of diabetes for mutations in disease-associated genes. METHODS: We utilized a targeted sequencing approach using the Illumina HiSeq to perform a case-control sequencing study of 22 monogenic diabetes genes in 4016 individuals with type 2 diabetes (including 1346 individuals diagnosed before the age of 40 years) and 2872 controls. We analyzed protein-coding variants identified from the sequence data and compared the frequencies of pathogenic variants (protein-truncating variants and missense variants) between the cases and controls. RESULTS: A total of 40 individuals with diabetes (1.8% of early onset sub-group and 0.6% of adult onset sub-group) were carriers of known pathogenic missense variants in the GCK, HNF1A, HNF4A, ABCC8, and INS genes. In addition, heterozygous protein truncating mutations were detected in the GCK, HNF1A, and HNF1B genes in seven individuals with diabetes. Rare missense mutations in the GCK gene were significantly over-represented in individuals with diabetes (0.5% carrier frequency) compared to controls (0.035%). One individual with early onset diabetes was homozygous for a rare pathogenic missense variant in the WFS1 gene but did not have the additional phenotypes associated with Wolfram syndrome. CONCLUSION: Targeted sequencing of genes linked with monogenic diabetes can identify disease-relevant mutations in individuals diagnosed with type 2 diabetes not suspected of having monogenic forms of the disease. 
Our data suggest that GCK-MODY frequently masquerades as classical type 2 diabetes. The results confirm that MODY is under-diagnosed, particularly in individuals presenting with early onset diabetes and clinically labeled as type 2 diabetes; thus, sequencing of all monogenic diabetes genes should be routinely considered in such individuals. Genetic information can provide a specific diagnosis, inform disease prognosis and may help to better stratify treatment plans.


Subjects
Diabetes Mellitus, Type 2/genetics , Mutation , Adult , Case-Control Studies , Cohort Studies , DNA Mutational Analysis , Diabetes Mellitus, Type 2/diagnosis , Female , High-Throughput Nucleotide Sequencing/methods , Humans , Male , Mutation, Missense , Phenotype , Prognosis , Sequence Analysis, DNA
3.
J Assist Reprod Genet ; 34(1): 117-124, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27817035

ABSTRACT

PURPOSE: Endometriosis is a gynecological disease influenced by multiple genetic and environmental factors. The aim of the current study was to use SNP-array technology to identify genomic aberrations that may possibly contribute to the development of endometriosis. METHODS: We performed an SNP-array genotyping of pooled DNA samples from both patients (n = 100) and controls (n = 50). Copy number variation (CNV) calling and association analyses were performed using PennCNV software. MLPA and TaqMan Copy-Number assays were used for validation of CNVs discovered. RESULTS: We detected 49 CNV loci that were present in patients with endometriosis and absent in the control group. After validation procedures, we confirmed six CNV loci in the subtelomeric regions, including 1p36.33, 16p13.3, 19p13.3, and 20p13, representing gains, while 17q25.3 and 20q13.33 showed losses. Among the intrachromosomal regions, our results revealed duplication at 19q13.1 within the FCGBP gene (p = 0.007). CONCLUSIONS: We identified CNVs previously associated with endometriosis, together with six suggestive novel loci possibly involved in this disease. The intergenic locus on chromosome 19q13.1 shows strong association with endometriosis and is under further functional investigation.


Subjects
DNA Copy Number Variations/genetics , Endometriosis/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Chromosomes, Human, Pair 19/genetics , Endometriosis/pathology , Female , Genome, Human , Genotype , Humans , Polymorphism, Single Nucleotide/genetics
4.
G3 (Bethesda) ; 13(10), 2023 09 30.
Article in English | MEDLINE | ID: mdl-37565490

ABSTRACT

Reliable and high-throughput genotyping platforms are of immense importance for identifying and dissecting genomic regions controlling important phenotypes, supporting selection processes in breeding programs, and managing wild populations and germplasm collections. Amongst available genotyping tools, single nucleotide polymorphism arrays have been shown to be comparatively easy to use and generate highly accurate genotypic data. Single-species arrays are the most commonly used type so far; however, some multi-species arrays have been developed for closely related species that share single nucleotide polymorphism markers, exploiting inter-species cross-amplification. In this study, the suitability of a multiplexed plant-animal single nucleotide polymorphism array, including both closely and distantly related species, was explored. The performance of the single nucleotide polymorphism array across species for diverse applications, ranging from intra-species diversity assessments to parentage analysis, was assessed. Moreover, the value of genotyping pooled DNA of distantly related species on the single nucleotide polymorphism array as a technique to further reduce costs was evaluated. Single nucleotide polymorphism performance was generally high, and species-specific single nucleotide polymorphisms proved suitable for diverse applications. The multi-species single nucleotide polymorphism array approach reported here could be transferred to other species to achieve cost savings resulting from the increased throughput when several projects use the same array, and the pooling technique adds another highly promising advancement to additionally decrease genotyping costs by half.


Subjects
Polymorphism, Single Nucleotide , Selective Breeding , Animals , Genotype , Genomics/methods , Phenotype
5.
J Anim Sci ; 101, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-37227930

ABSTRACT

Genotyping pools of commercial cattle and individual seedstock animals may reveal hidden relationships between sectors enabling use of commercial data for genetic evaluation. However, commercial data capture may be compromised by inexact pool formation. We aimed to estimate the concordance between distances or genomic covariance among pooling allele frequencies (PAFs) of DNA pools comprised of 100 animals with 0% or 50% overlap of animals in common between pools. Cattle lung samples were collected from a commercial beef processing plant on a single day. Six pools of 100 animals each were constructed so that overlap between pools was 0% or 50%. Two pools of all 200 animals were constructed to estimate PAFs for all 200 animals. Frozen lung tissue (0.01 g) from each animal was weighed into a tube containing a pool; there were two pools of 200 animals each and six pools of 100 animals each. Every contribution of an individual animal was an independent measurement to ensure independence of pooling errors. Lung samples were kept on dry ice during the pooling process to keep them from thawing. The eight pools were then assayed for approximately 100,000 single nucleotide polymorphisms (SNP). PAF for each SNP and pool was based on the relative intensity of the two dyes used to detect the alleles rather than genotype calls, which are not tractable from pooling data. Euclidean distances and genomic relationships among the PAFs for the eight pools were estimated and distances were tested for concordance with pool overlap using permutation-based analysis of distance. Distances among pools were concordant with the planned overlap of animals shared between pools (P = 0.0024); pool overlap accounted for 70% of the variation and pooling error accounted for 30%. Pools containing 100 animals with no overlap were the most distant from one another and pools with 50% overlap were the least distant.
This work shows that we can discern differences in distance between pairs of overlapping DNA pools sharing 0% and 50% of the animals. Genomic correlations among nonoverlapping pools indicated that nonoverlapping pool pairs did not share many related animals because genomic correlations were near zero for these pairs. On the other hand, one pair of nonoverlapping pools likely contained related animals between pools because the correlation was 0.21. Pools sharing 50% overlap ranged in genomic relationship between 0.21 and 0.39 (N = 12).
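
The two quantities this abstract relies on, a pooling allele frequency derived from relative dye intensity and a Euclidean distance between pools, can be sketched as follows. This is a minimal illustration only: the abstract does not give the exact PAF formula, so the simple intensity ratio, the function names, and the intensity values below are assumptions.

```python
import math


def pooling_allele_frequency(intensity_a, intensity_b):
    """Estimate the allele-A frequency of a pool from two dye intensities.

    Assumption: a plain intensity ratio A / (A + B); the study's actual
    normalization is not described in the abstract.
    """
    total = intensity_a + intensity_b
    if total <= 0:
        raise ValueError("dye intensities must sum to a positive value")
    return intensity_a / total


def euclidean_distance(paf_x, paf_y):
    """Euclidean distance between two pools' PAF vectors (same SNP order)."""
    if len(paf_x) != len(paf_y):
        raise ValueError("PAF vectors must cover the same SNPs")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(paf_x, paf_y)))


# Hypothetical (allele A, allele B) intensities for three SNPs in two pools.
pool_1 = [pooling_allele_frequency(a, b) for a, b in [(800, 200), (500, 500), (100, 900)]]
pool_2 = [pooling_allele_frequency(a, b) for a, b in [(600, 400), (500, 500), (300, 700)]]
distance = euclidean_distance(pool_1, pool_2)
```

With real data the vectors would span roughly 100,000 SNPs, and the pairwise distances would then be compared against the planned 0% or 50% overlap.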


Genetic evaluation of seedstock cattle could benefit from commercial data. There are hidden relationships between commercial and seedstock sectors because many commercial producers buy bulls from the seedstock sector. Relationships are hidden because pedigree is not tracked in commercial populations. Single nucleotide polymorphism genotypes could reveal these hidden relationships; however, genotyping can be cost prohibitive. Cost of commercial data capture could be decreased by pooling DNA which is a method to genotype groups of animals to use their data in genetic evaluation; however, error from inexact pool formation can complicate interpretation. Results from pools of overlapping random unrelated animals mimic the results from pools sharing relatives with the same degree of shared genomes. For example, a pool of progeny and a pool of the dams of the pooled progeny would produce the same result as two pools sharing 50% overlap of random unrelated animals. We can estimate the relatedness between unknown pools even in the presence of pooling error if an unknown pool comparison is similar to an overlapping pool comparison. Knowing the relationship between seedstock cattle and pools of commercial cattle may allow commercial data to enhance genetic evaluation of seedstock animals.


Subjects
DNA , Genomics , Animals , Cattle/genetics , Genotype , Gene Frequency , DNA/genetics , Polymorphism, Single Nucleotide , Alleles
6.
Front Genet ; 13: 896774, 2022.
Article in English | MEDLINE | ID: mdl-36092907

ABSTRACT

Genomic selection has great potential in aquaculture breeding since many important traits are not directly measured on the candidates themselves. However, its implementation has been hindered by staggering genotyping costs because of the many individual genotypes required. In this study, we explored the potential of DNA pooling for creating a reference population as a tool for genomic selection of a binary trait. Two datasets from the SalmoBreed population challenged with salmonid alphavirus, which causes pancreas disease, were used. Dataset-1, which includes 855 individuals (478 survivors and 377 dead), was used to develop four DNA pool samples (i.e., two pools each for dead and surviving fish). Dataset-2 includes 914 individuals (435 survivors and 479 dead) belonging to 65 full-sibling families and was used to develop in-silico DNA pools. SNP effects from the pool data were calculated based on allele frequencies estimated from the pools and used to calculate genomic breeding values (GEBVs). The correlation between SNP effects estimated based on individual genotypes and pooled data increased from 0.3 to 0.912 when the number of pools increased from 1 to 200. A similar trend was also observed for the correlation between GEBVs, which increased from 0.84 to 0.976, as the number of pools per phenotype increased from 1 to 200. For dataset-1, the accuracy of prediction was 0.71 and 0.70 when the DNA pools were sequenced at 40× and 20×, respectively, compared to an accuracy of 0.73 for the SNP chip genotypes. For dataset-2, the accuracy of prediction increased from 0.574 to 0.691 when the number of in-silico DNA pools increased from 1 to 200. For this dataset, the accuracy of prediction using individual genotypes was 0.712. A limited effect of sequencing depth on the correlation of GEBVs and prediction accuracy was observed.
Results showed that a large number of pools are required to achieve as good prediction as individual genotypes; however, alternative effective pooling strategies should be studied to reduce the number of pools without reducing the prediction power. Nevertheless, it is demonstrated that pooling of a reference population can be used as a tool to optimize between cost and accuracy of selection.

7.
G3 (Bethesda) ; 11(11), 2021 10 19.
Article in English | MEDLINE | ID: mdl-34510188

ABSTRACT

Despite decreasing genotyping costs, in some cases individually genotyping animals is not economically feasible (e.g., in small ruminants). An alternative is to pool DNA, using the pooled allele frequency (PAF) to garner information on performance. Still, the use of PAF for prediction (estimation of genomic breeding values; GEBVs) has been limited. Two potential sources of error on accuracy of GEBV of sires, obtained from PAF of their progeny themselves lacking pedigree information, were tested: (i) pool construction error (unequal contribution of DNA from animals in pools), and (ii) technical error (variability when reading the array). Pooling design (random, extremes, K-means), pool size (5, 10, 25, 50, and 100 individuals), and selection scenario (random, phenotypic) also were considered. These factors were tested by simulating a sheep population. Accuracy of GEBV (the correlation between true and estimated values) was not substantially affected by pool construction or technical error, or selection scenario. A significant interaction, however, between pool size and design was found. Still, regardless of design, mean accuracy was higher for pools of 10 or fewer individuals. Mean accuracy of GEBV was 0.174 (SE 0.001) for random pooling, and 0.704 (SE 0.004) and 0.696 (SE 0.004) for extreme and K-means pooling, respectively. Non-random pooling resulted in moderate accuracy of GEBV. Overall, pooled genotypes can be used in conjunction with individual genotypes of sires for moderately accurate predictions of their genetic merit with little effect of pool construction or technical error.


Subjects
Genome , Models, Genetic , Animals , Gene Frequency , Genotype , Male , Pedigree , Phenotype , Polymorphism, Single Nucleotide , Sheep/genetics
8.
J Anim Sci ; 98(6), 2020 Jun 01.
Article in English | MEDLINE | ID: mdl-32497209

ABSTRACT

Economically relevant traits are routinely collected within the commercial segments of the beef industry but are rarely included in genetic evaluations because of unknown pedigrees. Individual relationships could be resurrected with genomics, but this would be costly; therefore, pooling DNA and phenotypic data provides a cost-effective solution. Pedigree, phenotypic, and genomic data were simulated for a beef cattle population consisting of 15 generations. Genotypes mimicked a 50k marker panel (841 quantitative trait loci were located across the genome, approximately once per 3 Mb) and the phenotype was moderately heritable. Individuals from generation 15 were included in pools (observed genotype and phenotype were mean values of a group). Estimated breeding values (EBV) were generated from a single-step genomic best linear unbiased prediction model. The effects of pooling strategy (random and minimizing or uniformly maximizing phenotypic variation within pools), pool size (1, 2, 10, 20, 50, 100, or no data from generation 15), and generational gaps of genotyping on EBV accuracy (correlation of EBV with true breeding values) were quantified. Greatest EBV accuracies of sires and dams were observed when there was no gap between genotyped parents and pooled offspring. The EBV accuracies resulting from pools were usually greater than no data from generation 15 regardless of sire or dam genotyping. Minimizing phenotypic variation increased EBV accuracy by 8% and 9% over random pooling and uniformly maximizing phenotypic variation, respectively. A pool size of 2 was the only scenario that did not significantly decrease EBV accuracy compared with individual data when pools were formed randomly or by uniformly maximizing phenotypic variation (P > 0.05). Pool sizes of 2, 10, 20, or 50 did not generally lead to statistical differences in EBV accuracy compared with individual data when pools were constructed to minimize phenotypic variation (P > 0.05).
Largest numerical increases in EBV accuracy resulting from pooling compared with no data from generation 15 were seen with sires with prior low EBV accuracy (those born in generation 14). Pooling of any size led to larger EBV accuracies of the pools than individual data when minimizing phenotypic variation. Resulting EBV for the pools could be used to inform management decisions of those pools. Pooled genotyping to garner commercial-level phenotypes for genetic evaluations seems plausible although differences exist depending on pool size and pool formation strategy.
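
As a rough illustration of the pooling strategies compared in this abstract, the sketch below forms pools either at random or by grouping animals with similar phenotypes, then averages the phenotype within each pool. The abstract does not say how within-pool variation was minimized, so sorting by phenotype and cutting consecutive groups is an assumed heuristic, and the function names and weights are hypothetical.

```python
import random


def make_pools(phenotypes, pool_size, strategy="minimize", seed=0):
    """Return pools as lists of animal indices.

    strategy="minimize": sort by phenotype and cut consecutive groups
    (an assumed heuristic for low within-pool phenotypic variation).
    strategy="random": shuffle animals before grouping.
    """
    ids = list(range(len(phenotypes)))
    if strategy == "minimize":
        ids.sort(key=lambda i: phenotypes[i])
    elif strategy == "random":
        random.Random(seed).shuffle(ids)
    else:
        raise ValueError("strategy must be 'minimize' or 'random'")
    return [ids[start:start + pool_size] for start in range(0, len(ids), pool_size)]


def pool_means(phenotypes, pools):
    """Mean phenotype per pool, the 'observed' phenotype of a pooled record."""
    return [sum(phenotypes[i] for i in pool) / len(pool) for pool in pools]


# Hypothetical weights (kg) for six animals, pooled in groups of three.
weights = [310.0, 250.0, 295.0, 260.0, 305.0, 255.0]
pools = make_pools(weights, pool_size=3, strategy="minimize")
means = pool_means(weights, pools)
```

Here the light animals (indices 1, 5, 3) and the heavy animals (indices 2, 4, 0) end up in separate pools, so each pool mean stays close to its members' individual records.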


Subjects
Cattle/genetics , Genomics/methods , Models, Genetic , Animals , Breeding , Female , Genotype , Linear Models , Male , Pedigree , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
9.
J Anim Sci ; 98(6), 2020 Jun 01.
Article in English | MEDLINE | ID: mdl-32428206

ABSTRACT

In this study, we aimed to assess the value of genotyping DNA pools as a strategy to generate accurate and cost-effective genomic estimated breeding values (GEBV) of sires in multi-sire mating systems. In order to do that, we used phenotypic records of 2,436 Australian Angus cattle from 174 sires, including yearling weight (YWT; N = 1,589 records), coat score (COAT; N = 2,026 records), and Meat Standards Australia marbling score (MARB; N = 1,304 records). Phenotypes were adjusted for fixed effects and age at measurement and pools of 2, 5, 10, 15, 20, and 25 animals were explored. Pools were created either by phenotype or at random. When pools were created at random, 10 replicates were examined to provide a measure of sampling variation. The relative accuracy of each pooling strategy was measured by the Pearson correlation coefficient between the sire's GEBV with pooled progeny and the GEBV using individually genotyped progeny. Random pools allow the computation of sire GEBV that are, on average, moderately correlated (i.e., r > 0.5 at pool sizes [PS] ≤ 10) with those obtained without pooling. However, for pools assigned at random, the difference between the best and the worst relative accuracy obtained out of the 10 replicates was as high as 0.41 for YWT, 0.36 for COAT, and 0.61 for MARB. This uncertainty associated with the relative accuracy of GEBV makes randomly assigning animals to pools an unreliable approach. In contrast, pooling by phenotype allowed the estimation of sires' GEBV with a relative accuracy ≥ 0.9 at PS < 10 for all three phenotypes. Moreover, even with larger PS, the lowest relative accuracy obtained was 0.88 (YWT, PS = 20). In agreement with results using simulated data, we conclude that pooling by phenotype is a robust approach to implementing genomic evaluation using commercial herd data, and PS larger than 10 individuals can be considered.
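
The relative accuracy used in this abstract is a Pearson correlation between two sets of sire GEBV. A minimal sketch of that calculation follows; the GEBV values are invented for illustration and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sire GEBV from individually genotyped progeny vs. pooled progeny.
gebv_individual = [12.1, -3.4, 5.0, 0.8, -7.2]
gebv_pooled = [10.5, -2.0, 6.1, -0.3, -6.0]
relative_accuracy = pearson(gebv_individual, gebv_pooled)
```

For random pooling the abstract examines 10 replicates, so this value would be computed once per replicate and the spread between the best and worst replicate reported.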


Subjects
Breeding , Cattle/genetics , Genome , Genotype , Genotyping Techniques/standards , Animals , Australia , Body Composition/genetics , Cattle/classification , Computer Simulation , Female , Genomics/methods , Male , Reproducibility of Results
10.
Front Plant Sci ; 11: 568699, 2020.
Article in English | MEDLINE | ID: mdl-33488638

ABSTRACT

Genebanks harbor original landraces carrying many original favorable alleles for mitigating biotic and abiotic stresses. Their genetic diversity remains, however, poorly characterized due to their large within-landrace genetic diversity. We developed a high-throughput, cheap and labor-saving DNA bulk approach based on the single-nucleotide polymorphism (SNP) Illumina Infinium HD array to genotype landraces. Samples were gathered for each landrace by mixing equal weights from young leaves, from which DNA was extracted. We then estimated allelic frequencies in each DNA bulk based on the fluorescent intensity ratio (FIR) between two alleles at each SNP using a two-step approach. We first tested whether the DNA bulk was monomorphic or polymorphic according to the two FIR distributions of individuals homozygous for allele A or B, respectively. If the DNA bulk was polymorphic, we estimated its allelic frequency by using a predictive equation calibrated on FIR from DNA bulks with known allelic frequencies. Our approach: (i) gives accurate allelic frequency estimations that are highly reproducible across laboratories, (ii) protects against false detection of allele fixation within landraces. We estimated allelic frequencies of 23,412 SNPs in 156 landraces representing American and European maize diversity. Modified Rogers' genetic distances between 156 landraces estimated from 23,412 SNPs and 17 simple sequence repeats using the same DNA bulks were highly correlated, suggesting that the ascertainment bias is low. Our approach is affordable, easy to implement and does not require specific bioinformatics support and laboratory equipment, and therefore should be highly relevant for large-scale characterization of genebanks for a wide range of species.

11.
J Anim Sci ; 97(12): 4761-4769, 2019 Dec 17.
Article in English | MEDLINE | ID: mdl-31710679

ABSTRACT

The growing concern with the environment is making it important for livestock producers to focus on selection for efficiency-related traits, which is a challenge for commercial cattle herds due to the lack of pedigree information. To explore a cost-effective opportunity for genomic evaluations of commercial herds, this study compared the accuracy of bulls' genomic estimated breeding values (GEBV) using different pooled genotype strategies. We used ten replicates of previously simulated genomic and phenotypic data for one low (t1) and one moderate (t2) heritability trait of 200 sires and 2,200 progeny. Sire's GEBV were calculated using a univariate mixed model, with a hybrid genomic relationship matrix (h-GRM) relating sires to: 1) 1,100 pools of 2 animals; 2) 440 pools of 5 animals; 3) 220 pools of 10 animals; 4) 110 pools of 20 animals; 5) 88 pools of 25 animals; 6) 44 pools of 50 animals; and 7) 22 pools of 100 animals. Pooling criteria were: at random, grouped sorting by t1, grouped sorting by t2, and grouped sorting by a combination of t1 and t2. The same criteria were used to select 110, 220, 440, and 1,100 individual genotypes for GEBV calculation to compare GEBV accuracy using the same number of individual genotypes and pools. Although the best accuracy was achieved for a given trait when pools were grouped based on that same trait (t1: 0.50-0.56, t2: 0.66-0.77), pooling by one trait impacted negatively on the accuracy of GEBV for the other trait (t1: 0.25-0.46, t2: 0.29-0.71). Therefore, the combined measure may be a feasible alternative to use the same pools to calculate GEBVs for both traits (t1: 0.45-0.57, t2: 0.62-0.76). Pools of 10 individuals were identified as representing a good compromise between loss of accuracy (~10%-15%) and cost savings (~90%) from genotype assays.
In addition, we demonstrated that in more than 90% of the simulations, pools present higher sires' GEBV accuracy than individual genotypes when the number of genotype assays is limited (i.e., 110 or 220) and animals are assigned to pools based on phenotype. Pools assigned at random presented the poorest results (t1: 0.07-0.45, t2: 0.14-0.70). In conclusion, pooling by phenotype is the best approach to implementing genomic evaluation using commercial herd data, particularly when pools of 10 individuals are evaluated. While combining phenotypes seems a promising strategy to allow more flexibility to the estimates made using pools, more studies are necessary in this regard.
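
The pool counts and the roughly 90% saving in genotype assays quoted in this abstract follow from simple arithmetic on the 2,200 progeny. The sketch below reproduces that arithmetic; it counts assays only and assumes one assay per pool, ignoring sire genotyping and the labor of forming pools. The function name is hypothetical.

```python
def pooling_plan(n_animals, pool_sizes):
    """Map pool size -> (number of pools, fraction of assays saved).

    Assumption: one genotype assay per pool versus one per animal.
    """
    plan = {}
    for size in pool_sizes:
        n_pools = -(-n_animals // size)  # ceiling division
        plan[size] = (n_pools, 1 - n_pools / n_animals)
    return plan


sizes = [2, 5, 10, 20, 25, 50, 100]
plan = pooling_plan(2200, sizes)
n_pools_by_size = [plan[size][0] for size in sizes]
```

For pools of 10, `plan[10]` gives 220 pools and a 0.9 saving in assays, which matches the 220 pools and the ~90% figure in the abstract.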


Subjects
Cattle/genetics , Genomics/methods , Genotype , Algorithms , Animals , Breeding , Female , Genetic Variation , Male
12.
Animals (Basel) ; 9(9), 2019 Aug 30.
Article in English | MEDLINE | ID: mdl-31480266

ABSTRACT

Bovine tuberculosis (bTB) is a disease of cattle that represents a risk to public health and causes severe economic losses to the livestock industry. Recently, genetic studies, like genome-wide association studies (GWAS), have greatly improved the investigation of complex diseases, identifying thousands of disease-associated genomic variants. Here, we present evidence of genetic variants associated with resistance to TB in Mexican dairy cattle using a case-control approach with a selective DNA pooling experimental design. A total of 154 QTLRs (quantitative trait loci regions) at 10% PFP (proportion of false positives), 42 at 5% PFP and 5 at 1% PFP have been identified, which harbored 172 annotated genes. On BTA13, five new QTLRs were identified in the MACROD2 and KIF16B genes, supporting their involvement in resistance to bTB. Six QTLRs harbor seven annotated genes that have been previously reported as involved in immune response against Mycobacterium spp: BTA (Bos taurus autosome) 1 (CD80), BTA 3 (CTSS), BTA 3 (FCGR1A), BTA 23 (HFE), BTA 25 (IL21R), and BTA 29 (ANO9 and SIGIRR). We identified novel QTLRs harboring genes involved in Mycobacterium spp. immune response. This is a first screening for resistance to TB infection in Mexican dairy cattle based on a dense SNP (Single Nucleotide Polymorphism) chip.

13.
Oncotarget ; 8(55): 93450-93463, 2017 Nov 07.
Article in English | MEDLINE | ID: mdl-29212164

ABSTRACT

The underlying genetic cause of colorectal cancer (CRC) can be identified for 5-10% of all cases, while at least 20% of CRC cases are thought to be due to inherited genetic factors. Screening for highly penetrant mutations in genes associated with Mendelian cancer syndromes using next-generation sequencing (NGS) can be prohibitively expensive for studies requiring large sample sizes. The aim of the study was to identify rare single nucleotide variants and small indels in 40 established or candidate CRC susceptibility genes in 1,046 familial CRC cases (including both MSS and MSI-H tumor subtypes) and 1,006 unrelated controls from the Colon Cancer Family Registry Cohort using a robust and cost-effective DNA pooling NGS strategy. We identified 264 variants in 38 genes that were observed only in cases, comprising either very rare (minor allele frequency <0.001) or not previously reported (n=90, 34%) in reference databases, including six stop-gain, three frameshift, and 255 non-synonymous variants predicted to be damaging. We found novel germline mutations in established CRC genes MLH1, APC, and POLE, and likely pathogenic variants in cancer susceptibility genes BAP1, CDH1, CHEK2, ENG, and MSH3. For the candidate CRC genes, we identified likely pathogenic variants in the helicase domain of POLQ and in the LRIG1, SH2B3, and NOS1 genes and present their clinicopathological characteristics. Using a DNA pooling NGS strategy, we identified novel germline mutations in established CRC susceptibility genes in familial CRC cases. Further studies are required to support the role of POLQ, LRIG1, SH2B3 and NOS1 as CRC susceptibility genes.

14.
Exp Ther Med ; 12(5): 3143-3150, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27882129

ABSTRACT

With the advent of next-generation sequencing technology, the cost of sequencing has significantly decreased. However, sequencing costs remain high for large-scale studies. In the present study, DNA pooling was applied as a cost-effective strategy for sequencing. The sequencing results for 100 healthy individuals obtained via whole-genome resequencing and using DNA pooling are presented in the present study. In order to minimise the likelihood of systematic bias in sampling, paired-end libraries with an insert size of 500 bp were prepared for all samples and then subjected to whole-genome sequencing using four lanes for each library and resulting in at least a 30-fold haploid coverage for each sample. The NCBI human genome build37 (hg19) was used as a reference genome for the present study and the short reads were aligned to the reference genome achieving 99.84% coverage. In addition, the average sequencing depth was 32.76. In total, ~3 million single-nucleotide polymorphisms were identified, of which 99.88% were in the NCBI dbSNP database. Furthermore, ~600,000 small insertion/deletions, 500,000 structural variants, 5,000 copy number variations and 13,000 single nucleotide variants were identified. According to the present study, the whole genome has been sequenced for a small sample of subjects from southern China for the first time. Furthermore, new variation sites were identified by comparing with the reference sequence, and new knowledge of the human genome variation was added to the human genomic databases. Furthermore, the particular distribution regions of variation were illustrated by analyzing various sites of variation, such as single-nucleotide polymorphisms.

15.
Comput Biol Med ; 61: 48-55, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25863000

ABSTRACT

BACKGROUND: The costs associated with developing high density microarray technologies are prohibitive for genotyping animals when there is low economic value associated with a single animal (e.g. prawns). DNA pooling is an attempt to address this issue by combining multiple DNA samples prior to genotyping. Instead of genotyping the DNA samples of the individuals, a mixture of DNA samples (i.e. the pool) from the individuals is genotyped only once. This greatly reduces the cost of genotyping. Pooled samples are subject to greater genotyping inaccuracies than individual samples. Wrong genotyping will lead to wrong biological conclusions. It is thus required to calibrate the resulting genotypes (allele frequencies). METHODS: We present a regression based approach to translate raw array output to allele frequency. During training, a few pools and the individuals that constitute the pools are genotyped. Given the genotypes of individuals that constitute the pool, we compute the true allele frequency. We then train a regression algorithm to produce a mapping from the raw array outputs to the true allele frequency. We test the algorithm using pool samples withheld from the training set. During prediction, we use this map to genotype pools with no prior knowledge of the individuals constituting the pools. RESULTS AND DISCUSSION: After data quality control we have available a dataset comprised of 912 pools. We estimate allele frequency using three approaches: the raw data, a commonly used piecewise linear transformation, and the proposed local-global learner fusion method. The resulting RMS errors for the three approaches are 0.135, 0.120, and 0.080 respectively.
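
The calibration idea in this abstract (learn a mapping from raw array output to true allele frequency on training pools, then score it by RMS error) can be sketched with an ordinary least-squares line. This is deliberately a much simpler stand-in: it is not the local-global learner fusion method the authors propose, nor the piecewise linear transformation they compare against, and all function names and data values are hypothetical.

```python
import math


def fit_line(raw, true):
    """Least-squares fit of true frequency on raw array output.

    Returns (slope, intercept). A plain linear model, used here only as
    an assumed stand-in for the paper's regression method.
    """
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_true = sum(true) / n
    sxx = sum((x - mean_raw) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw values must not all be identical")
    slope = sum((x - mean_raw) * (y - mean_true) for x, y in zip(raw, true)) / sxx
    return slope, mean_true - slope * mean_raw


def predict(model, raw_value):
    """Calibrated allele frequency, clamped to the valid range [0, 1]."""
    slope, intercept = model
    return min(1.0, max(0.0, slope * raw_value + intercept))


def rms_error(predicted, true):
    """Root-mean-square error between predicted and true frequencies."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, true)) / len(true))


# Hypothetical training pools: raw array output vs. frequency from individual genotypes.
raw_train = [0.2, 0.4, 0.6, 0.8]
true_train = [0.10, 0.35, 0.60, 0.85]
model = fit_line(raw_train, true_train)
train_rmse = rms_error([predict(model, x) for x in raw_train], true_train)
```

In practice the error would be measured on pools withheld from training, as the abstract describes, rather than on the training pools themselves.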


Subjects
Alleles , Gene Frequency , Genotyping Techniques/standards , Oligonucleotide Array Sequence Analysis/standards , Single Nucleotide Polymorphism , Animals , Calibration , DNA , Genetic Databases , Humans
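
The calibration workflow described in the abstract of entry 15 (compute the true pool allele frequency from individual genotypes, fit a mapping from raw array output to that frequency, then score held-out pools by RMS error) can be sketched roughly as follows. This is only an illustration under stated assumptions: an ordinary least-squares line stands in for the authors' local-global learner fusion method, genotypes are assumed to be coded as B-allele counts (0, 1, 2), the raw array output is assumed to be a single number per pool, and all function names are hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def true_allele_frequency(genotypes: Sequence[int]) -> float:
    """B-allele frequency of a pool from individual genotypes coded 0, 1 or 2."""
    if not genotypes:
        raise ValueError("pool must contain at least one individual")
    return sum(genotypes) / (2 * len(genotypes))


def fit_line(raw: Sequence[float], freq: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept mapping raw array output to frequency.

    Assumption: a straight line is a stand-in for the regression model in the abstract.
    """
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(freq) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, freq))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(raw_value: float, slope: float, intercept: float) -> float:
    """Predicted allele frequency, clipped to the valid range [0, 1]."""
    return min(1.0, max(0.0, slope * raw_value + intercept))


def rmse(predicted: Sequence[float], truth: Sequence[float]) -> float:
    """Root-mean-square error between predicted and true frequencies."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))


# Training pools: individual genotypes plus one made-up raw array value per pool.
training_pools = [[0, 1, 2, 1], [0, 0, 1, 0], [2, 2, 1, 2]]
training_raw = [0.45, 0.20, 0.70]
training_freq = [true_allele_frequency(pool) for pool in training_pools]
slope, intercept = fit_line(training_raw, training_freq)
```

In practice the fitted mapping would be evaluated with `rmse` on pools withheld from training, as the abstract describes, before being applied to pools whose members were never genotyped individually.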
16.
Mol Ecol Resour ; 15(5): 1145-52, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25703535

ABSTRACT

Massively parallel sequencing of a small proportion of the whole genome at high coverage enables answering a wide range of questions, from molecular evolution and evolutionary biology to animal and plant breeding and forensics. In this study, we describe the development of a restriction-site associated DNA (RAD) sequencing approach for the Ion Torrent PGM platform. Our protocol achieves extreme genome complexity reduction using two rare-cutting restriction enzymes and strict size selection of the library, allowing a relatively small number of genomic fragments to be sequenced at high depth. We applied this approach to a common freshwater fish species, the Eurasian perch (Perca fluviatilis L.), generated over 2.2 MB of novel sequence data consisting of ~17,000 contigs, and identified 1259 single nucleotide polymorphisms (SNPs). We also estimated genetic differentiation between the DNA pools from freshwater (Lake Peipus) and brackish water (the Baltic Sea) populations and identified the SNPs with the strongest signal of differentiation, which could be used for robust individual assignment in the future. This work represents an important step towards developing genomic resources and genetic tools for the Eurasian perch. We expect that our ddRAD sequencing protocol for semiconductor sequencing technology will be a useful alternative to currently available RAD protocols.


Subjects
Genomics/methods , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Animals , Genetic Variation , Population Genetics/methods , Perches/classification , Perches/genetics , Single Nucleotide Polymorphism
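
The abstract of entry 16 mentions estimating genetic differentiation between two DNA pools and picking the SNPs with the strongest signal. One simple way to illustrate the idea is a per-SNP fixation index computed from pooled allele read counts. The abstract does not say which estimator was used, so the heterozygosity-based formula below, the read-count inputs and the function names are all assumptions, and the sketch ignores corrections for pool size and sequencing depth.

```python
from typing import Dict, List, Tuple


def allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Alternative-allele frequency in a pool, estimated from read counts."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("SNP has no reads in this pool")
    return alt_reads / total


def fst_two_pools(p1: float, p2: float) -> float:
    """Per-SNP F_ST = (H_T - H_S) / H_T for two equally weighted pools."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pools combined
    if h_total == 0:
        return 0.0  # SNP is monomorphic across both pools
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def rank_snps(freqs: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """SNP identifiers sorted from most to least differentiated."""
    scored = [(snp, fst_two_pools(p1, p2)) for snp, (p1, p2) in freqs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up frequencies for three hypothetical SNPs (pool 1, pool 2).
example = {"snp_a": (0.5, 0.5), "snp_b": (0.2, 0.8), "snp_c": (0.0, 1.0)}
ranking = rank_snps(example)
```

SNPs at the top of such a ranking would be the natural candidates for an individual-assignment panel, subject to validation on individually genotyped fish.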
17.
Microarrays (Basel) ; 3(1): 52-71, 2014 Feb 28.
Article in English | MEDLINE | ID: mdl-27605030

ABSTRACT

Bipolar disorder is a complex psychiatric disorder with high heritability, but its genetic determinants are still largely unknown. Copy number variation (CNV) is one source that may explain part of the heritability. However, it is a challenge to estimate discrete copy numbers from the continuous signals of a set of markers, and to simultaneously perform association testing between CNVs and phenotypic outcomes. The goal of the present study is to perform a series of data filtering and analysis procedures using a DNA pooling strategy to identify potential CNV regions that are related to bipolar disorder. A total of 200 normal controls and 200 clinically diagnosed bipolar patients were recruited in this study and were randomly divided into eight control and eight case pools. Genome-wide genotyping was performed using the Illumina Human Omni1-Quad array, with approximately one million markers, for CNV calling. We aimed to set a series of criteria to filter out the signal noise of marker data and to reduce the chance of false-positive findings for CNV regions. We first defined CNV regions for each pool. Potential CNV regions were reported based on the different patterns of CNV status between cases and controls. Genes that mapped to the potential CNV regions were examined with association testing and Gene Ontology enrichment analysis, and checked against the existing literature for associations with bipolar disorder. We reported several CNV regions that are related to bipolar disorder. Two CNV regions on chromosomes 11 and 22 showed significant signal differences between cases and controls (p < 0.05). Another five CNV regions on chromosomes 6, 9, and 19 overlapped with results from previous CNV studies. Experimental validation of two CNV regions lent some support to our reported findings. Further experimental and replication studies could be designed for these selected regions.

18.
Mol Ecol Resour ; 13(5): 918-28, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23937576

ABSTRACT

The internal transcribed spacer (ITS) region of nuclear ribosomal DNA is a common marker not only for the molecular identification of different taxa and strains, but also for the analysis of the population structure of wild microparasite communities. Importantly, the multicopy nature of this region allows amplification from low-quantity samples of the target DNA, a common problem in studies on unicellular, unculturable microparasites. We analysed ITS sequences from the protozoan parasite Caullerya mesnili (class Ichthyosporea) infecting waterflea (Daphnia) hosts, across several host population samples. We showed that analysing representative ITS-types [as identified by statistical parsimony network analysis (SPN)] is a suitable method to address relevant polymorphism. The spatial patterns were consistent regardless of whether parasite DNA was extracted from individual hosts or pooled host samples. Remarkably, the efficiency in detecting different sequence types was even higher after sample pooling. As shown by simulations, an easily manageable number of sequences from pooled DNA samples is sufficient to resolve the spatial population structure in this system. In summary, the ITS region analysed from pooled DNA samples can provide valuable insights into the spatial and temporal dynamics of microparasites. Moreover, SPN analysis is a good alternative to the well-established neighbour-joining method (NJ) for the identification of representative ITS-types. SPN can even outperform NJ by joining most of the singleton sequences to representative sequence clusters.


Subjects
Biota , Intergenic DNA/genetics , Parasites/classification , Parasites/genetics , Parasitology/methods , Animals , Intergenic DNA/chemistry , Daphnia/parasitology , Mesomycetozoea/classification , Mesomycetozoea/genetics , Molecular Sequence Data , DNA Sequence Analysis
19.
Article in Chinese | WPRIM | ID: wpr-385270

ABSTRACT

Objective: To map loci associated with bipolar disorder on chromosome 4 using microsatellite markers in pooled DNA samples from bipolar disorder cases and normal controls in Shandong province. Methods: A total of 22 microsatellite markers on chromosome 4, spaced at approximately 10 cM, were selected, and two separate DNA pools, one of 104 bipolar disorder cases and one of 1000 normal controls, were genotyped. Statistical analysis was performed with the chi-square method in the CLUMP software to compare the proportion of each allele at these loci between the two pools. Results: Significant differences between cases and controls were found at D4S1592 and D4S402 on chromosome 4 (P < 0.01; D4S1592: χ² = 15.968, P = 0.006; D4S402: χ² = 31.553, P = 0.002). Conclusion: The loci D4S1592 and D4S402 on chromosome 4 are associated with bipolar disorder in patients from Shandong province; further screening of susceptibility genes around these loci is needed.
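
The allele-proportion comparison described in the abstract of entry 19 can be illustrated with a plain Pearson chi-square statistic on a 2 × k table of allele counts (cases versus controls, one column per microsatellite allele). This is a sketch under assumptions, not a reimplementation of CLUMP: the function names are hypothetical, pooled allele frequencies are assumed to be converted to approximate counts by multiplying by the number of chromosomes in the pool, and no p-value is computed.

```python
from typing import List, Sequence, Tuple


def pooled_allele_counts(frequencies: Sequence[float], n_individuals: int) -> List[float]:
    """Approximate allele counts in a pool (two chromosomes per individual)."""
    return [f * 2 * n_individuals for f in frequencies]


def chi_square_alleles(
    case_counts: Sequence[float], control_counts: Sequence[float]
) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    if len(case_counts) != len(control_counts):
        raise ValueError("both groups must list the same alleles")
    # Drop alleles that were not observed in either group.
    columns = [(a, b) for a, b in zip(case_counts, control_counts) if a + b > 0]
    if len(columns) < 2:
        raise ValueError("need at least two observed alleles")
    case_total = sum(a for a, _ in columns)
    control_total = sum(b for _, b in columns)
    if case_total == 0 or control_total == 0:
        raise ValueError("each group needs at least one observed allele")
    grand_total = case_total + control_total
    statistic = 0.0
    for a, b in columns:
        column_total = a + b
        for observed, row_total in ((a, case_total), (b, control_total)):
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(columns) - 1


# Made-up two-allele example: 30/70 in cases versus 50/50 in controls.
statistic, degrees_of_freedom = chi_square_alleles([30, 70], [50, 50])
```

With many rare microsatellite alleles, the expected counts in some cells become small, which is one reason a dedicated tool is typically preferred over the raw statistic shown here.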
