ABSTRACT
Large sample datasets have been regarded as the primary basis for innovative discoveries and the solution to missing heritability in genome-wide association studies. However, the computational complexity of analyzing them means that not all effects and polygenic backgrounds can be considered, which reduces the effectiveness of large datasets. To address these challenges, we included all effects and polygenic backgrounds in a mixed logistic model for binary traits and compressed four variance components into two. The compressed model was combined with three computational algorithms to develop an innovative method, called FastBiCmrMLM, for large data analysis. These algorithms were tailored to sample size, computational speed, and reduced memory requirements. To mine additional genes, linkage disequilibrium markers were replaced by bin-based haplotypes, which were then analyzed by FastBiCmrMLM; this variant is named FastBiCmrMLM-Hap. Simulation studies highlighted the superiority of FastBiCmrMLM over GMMAT, SAIGE and fastGWA-GLMM in identifying dominant, small α (allele substitution effect), and rare variants. In the UK Biobank-scale dataset, we demonstrated that FastBiCmrMLM could detect variants as small as 0.03% and with α ≈ 0. In re-analyses of seven diseases in the WTCCC datasets, 29 candidate genes with both functional and TWAS evidence, located around 36 variants identified only by the new methods, strongly validated the new methods. These methods offer a new way to decipher the genetic architecture of binary traits and address the challenges outlined above.
Subject(s)
Algorithms , Genome-Wide Association Study , Genome-Wide Association Study/methods , Humans , Logistic Models , Case-Control Studies , Linkage Disequilibrium , Single Nucleotide Polymorphism , Genomics/methods , Computer Simulation , Haplotypes , Genetic Models
ABSTRACT
BACKGROUND: Salt stress significantly reduces soybean yield. To improve salt tolerance in soybean, it is important to mine the genes associated with salt tolerance traits. RESULTS: Salt tolerance traits of 286 soybean accessions were measured four times between 2009 and 2015. The results were associated with 740,754 single nucleotide polymorphisms (SNPs) to identify quantitative trait nucleotides (QTNs) and QTN-by-environment interactions (QEIs) using the three-variance-component multi-locus random-SNP-effect mixed linear model (3VmrMLM). As a result, eight salt tolerance genes (GmCHX1, GsPRX9, Gm5PTase8, GmWRKY, GmCHX20a, GmNHX1, GmSK1, and GmLEA2-1) near 179 significant and 79 suggested QTNs, and two salt tolerance genes (GmWRKY49 and GmSK1) near 45 significant and 14 suggested QEIs, had been reported in previous studies to be associated with salt tolerance index traits. Six candidate genes and three gene-by-environment interactions (GEIs) were predicted to be associated with these index traits. Analysis of four salt tolerance related traits under control and salt treatments revealed six genes associated with salt tolerance (GmHDA13, GmPHO1, GmERF5, GmNAC06, GmbZIP132, and GmHsp90s) around 166 QEIs; these genes were verified in previous studies. Five candidate GEIs were confirmed to be associated with salt stress by at least one haplotype analysis. The elite molecular modules of seven candidate genes with signs of selection were extracted from wild soybean, and these genes could be applied to soybean molecular breeding. Two of these genes, Glyma06g04840 and Glyma07g18150, were confirmed by qRT-PCR and are expected to be key players in responding to salt stress. CONCLUSIONS: Around the QTNs and QEIs identified in this study, 16 known genes, 6 candidate genes, and 8 candidate GEIs were found to be associated with soybean salt tolerance, of which Glyma07g18150 was further confirmed by qRT-PCR.
Subject(s)
Gene-Environment Interaction , Plant Genes , Glycine max , Single Nucleotide Polymorphism , Quantitative Trait Loci , Salt Tolerance , Glycine max/genetics , Glycine max/physiology , Salt Tolerance/genetics , Quantitative Trait Loci/genetics , Phenotype
ABSTRACT
Malus sieversii, commonly known as wild apple, is a Tertiary relict plant species and the progenitor of globally cultivated apple varieties. Unfortunately, wild apple populations are facing significant degradation in localized areas due to a myriad of factors. To gain a comprehensive understanding of the nutrient status and spatiotemporal variations of M. sieversii, green leaves were collected in May and July, and fallen leaves were collected in October. The concentrations of leaf nitrogen (N), phosphorus (P), and potassium (K) were measured, and the stoichiometric ratios as well as nutrient resorption efficiencies were calculated. The study also explored the relative contributions of soil, topographic, and biotic factors to the variation in nutrient traits. The results indicate that as the growing period progressed, the concentrations of N and P in the leaves significantly decreased (P < 0.05), and the concentration of K in October was significantly lower than in May and July. Throughout plant growth, leaf N-P and N-K exhibited hyperallometric relationships, while P-K showed an isometric relationship. Resorption efficiency followed the order of N < P < K (P < 0.05), with all three ratios being less than 1; this indicates that the order of nutrient limitation is K > P > N. The resorption efficiencies were mainly regulated by nutrient concentrations in fallen leaves. A robust spatial dependence was observed in leaf nutrient concentrations during all periods (70.1-97.9% for structural variation), highlighting that structural variation, rather than random factors, dominated the spatial variation. Nutrient resorption efficiencies (NRE, PRE, and KRE) displayed moderate structural variation (30.2-66.8%). The spatial patterns of nutrient traits varied across growth periods, indicating that they are influenced by multiple factors (among which soil properties showed the greatest influence).
In conclusion, wild apples manifested differentiated spatiotemporal variability and influencing factors across various leaf nutrient traits. These results provide crucial insights into the spatiotemporal patterns and influencing factors of leaf nutrient traits of M. sieversii at the permanent plot scale for the first time. This work is of great significance for the ecosystem restoration and sustainable management of degrading wild fruit forests.
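The resorption efficiencies and structural-variation percentages reported above can be illustrated with a short sketch. The abstract does not state the formulas used, so this assumes two common conventions: resorption efficiency as the uncorrected fraction (green − fallen) / green, and structural variation as the partial sill divided by the total sill of a fitted semivariogram, C / (C0 + C). The function names and the example numbers are hypothetical.

```python
def resorption_efficiency(green: float, fallen: float) -> float:
    """Fraction of a nutrient withdrawn from a leaf before it is shed.

    Assumed convention: (green - fallen) / green, with no mass-loss correction.
    """
    if green <= 0:
        raise ValueError("green-leaf concentration must be positive")
    return (green - fallen) / green


def structural_variation(nugget: float, partial_sill: float) -> float:
    """Share of semivariance attributed to spatial structure: C / (C0 + C)."""
    total = nugget + partial_sill
    if total <= 0:
        raise ValueError("nugget + partial sill must be positive")
    return partial_sill / total


# Hypothetical values, not taken from the study.
nre = resorption_efficiency(green=20.0, fallen=12.0)          # leaf N, mg/g
share = structural_variation(nugget=1.0, partial_sill=3.0)
print(f"NRE = {nre:.2f}, structural variation = {share:.0%}")
```

Under this convention, a structural share above roughly 75% is usually read as strong spatial dependence and 25-75% as moderate, which matches the wording of the abstract.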
Subject(s)
Malus , Nitrogen , Phosphorus , Plant Leaves , Potassium , Plant Leaves/metabolism , Malus/metabolism , Malus/growth & development , Malus/physiology , China , Phosphorus/metabolism , Phosphorus/analysis , Nitrogen/metabolism , Potassium/metabolism , Potassium/analysis , Forests , Nutrients/metabolism , Nutrients/analysis , Soil/chemistry , Fruit/growth & development , Fruit/metabolism , Spatio-Temporal Analysis
ABSTRACT
Detecting small and linked quantitative trait loci (QTLs) and QTL-by-environment interactions (QEIs) for complex traits is a difficult issue in immortalized F2 and F2:3 design, especially in the era of global climate change and environmental plasticity research. Here we proposed a compressed variance component mixed model. In this model, a parametric vector of QTL genotype and environment combination effects replaced QTL effects, environmental effects and their interaction effects, whereas the combination effect polygenic background replaced the QTL and QEI polygenic backgrounds. Thus, the number of variance components in the mixed model was greatly reduced. The model was incorporated into our genome-wide composite interval mapping (GCIM) to propose GCIM-QEI-random and GCIM-QEI-fixed, respectively, under random and fixed models of genetic effects. First, potentially associated QTLs and QEIs were selected from genome-wide scanning. Then, significant QTLs and QEIs were identified using empirical Bayes and likelihood ratio test. Finally, known and candidate genes around these significant loci were mined. The new methods were validated by a series of simulation studies and real data analyses. Compared with ICIM, GCIM-QEI-random had 29.77 ± 18.20% and 24.33 ± 10.15% higher average power, respectively, in 0.5-3.0% QTL and QEI detection, 43.44 ± 9.53% and 51.47 ± 15.70% higher average power, respectively, in linked QTL and QEI detection, and identified 30 more known genes for four rice yield traits, because GCIM-QEI-random identified more small genes/loci, being 2.69 ± 2.37% for additional genes. GCIM-QEI-random was slightly better than GCIM-QEI-fixed. In addition, the new methods may be extended into backcross and genome-wide association studies. This study provides effective methods for detecting small-effect and linked QTLs and QEIs.
Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Bayes Theorem , Chromosome Mapping , Gene-Environment Interaction , Phenotype
ABSTRACT
BACKGROUND: Wild apple (Malus sieversii) is under second-class national protection in China and is one of the lineal ancestors of cultivated apples worldwide. In recent decades, the natural habitat of wild apple trees has declined seriously, resulting in a lack of saplings and difficulty in population regeneration. Artificial near-natural breeding is crucial for protecting and restoring wild apple populations, and adding nitrogen (N) and phosphorus (P) is one of the important measures to improve the growth performance of saplings. In this study, field experiments using N (CK, N1, N2, and N3: 0, 10, 20, and 40 g m⁻² yr⁻¹, respectively), P (CK, P1, P2, and P3: 0, 2, 4, and 8 g m⁻² yr⁻¹, respectively), N20Px (CK, N2P1, N2P2, and N2P3: N20P2, N20P4 and N20P8 g m⁻² yr⁻¹, respectively), and NxP4 (CK, N1P2, N2P2, and N3P2: N10P4, N20P4, and N40P4 g m⁻² yr⁻¹, respectively) treatments (totaling 12 levels, including one CK) were conducted in four consecutive years. The twig traits (including four current-year stem, 10 leaf, and three ratio traits) and comprehensive growth performance of wild apple saplings were analyzed under different nutrient treatments. RESULTS: N addition had a significantly positive effect on stem length, basal diameter, leaf area, and leaf dry mass, whereas P addition had a significantly positive effect on stem length and basal diameter only. The combined N and P treatments (NxP4 and N20Px) evidently promoted stem growth at moderate concentrations; however, the N20Px treatment showed a markedly negative effect at low concentrations and a positive effect at moderate and high concentrations. The ratio traits (leaf intensity, leaf area ratio, and leaf to stem mass ratio) decreased with the increase in nutrient concentration under each treatment.
In the plant trait network, basal diameter, stem mass, and twig mass were tightly connected to other traits after nutrient treatments, indicating that stem traits play an important role in twig growth. The membership function revealed that the greatest comprehensive growth performance of saplings was achieved after N addition alone, followed by that under the NxP4 treatment (except for N40P4). CONCLUSIONS: Consequently, artificial nutrient treatments for four years significantly but differentially altered the growth status of wild apple saplings, and the use of appropriate N fertilizer promoted sapling growth. These results can provide scientific basis for the conservation and management of wild apple populations.
Subject(s)
Malus , Malus/genetics , Plant Breeding , Nitrogen , Plant Leaves , Phenotype
ABSTRACT
BACKGROUND: Ferula L. is one of the largest and most taxonomically complicated genera as well as being an important medicinal plant resource in the family Apiaceae. To investigate the plastome features and phylogenetic relationships of Ferula and its neighboring genera Soranthus Ledeb., Schumannia Kuntze., and Talassia Korovin, we sequenced 14 complete plastomes of 12 species. RESULTS: The size of the 14 complete chloroplast genomes ranged from 165,607 to 167,013 base pairs (bp) encoding 132 distinct genes (87 protein-coding, 37 tRNA, and 8 rRNA genes), and showed a typical quadripartite structure with a pair of inverted repeats (IR) regions. Based on comparative analysis, we found that the 14 plastomes were similar in codon usage, repeat sequence, simple sequence repeats (SSRs), and IR borders, and had significant collinearity. Based on our phylogenetic analyses, Soranthus, Schumannia, and Talassia should be considered synonymous with Ferula. Six highly divergent regions (rps16/trnQ-UUG, trnS-UGA/psbZ, psbH/petB, ycf1/ndhF, rpl32, and ycf1) were also detected, which may represent potential molecular markers, and combined with selective pressure analysis, the weak positive selection gene ccsA may be a discriminating DNA barcode for Ferula species. CONCLUSION: Plastids contain abundant informative sites for resolving phylogenetic relationships. Combined with previous studies, we suggest that there is still much room for improvement in the classification of Ferula. Overall, our study provides new insights into the plastome evolution, phylogeny, and taxonomy of this genus.
Subject(s)
Ferula , Chloroplast Genome , Ferula/genetics , Microsatellite Repeats , Phylogeny
ABSTRACT
Ephemeral plants are a crucial vegetation component in the temperate deserts of Central Asia, and play an important role in biogeochemical cycling and biodiversity maintenance in desert ecosystems. However, the nitrogen (N) and phosphorus (P) status and the leaf-root-soil interrelations of ephemeral plants remain unclear. A total of 194 leaf-root-soil samples of eight ephemeral species at 37 sites in the Gurbantunggut Desert, China were collected, and the corresponding N and P concentrations and N:P ratios were measured. Results showed that soil parameters presented no significant difference among the eight species. The total soil N:P was only 0.116 (geometric mean), indicating limited soil N, while the available soil N:P (4.896, geometric mean) was significantly larger than the total N:P. The leaf N (mean 30.995 mg g⁻¹) and P (mean 1.523 mg g⁻¹) concentrations were 2.64-8.46 and 0.93-3.99 times higher than the root N (mean 8.014 mg g⁻¹) and P (mean 0.802 mg g⁻¹) concentrations, respectively. Thus, leaf N:P (mean 21.499) was 1.410-2.957 times higher than root N:P (mean 11.803). Meanwhile, significant interspecific differences existed in plant stoichiometric traits. At the across-species level, N content scaled as the 3/4-power of P content in both leaves and roots. Leaf and root N:P ratios were mainly influenced by P; however, the leaf-to-root N or P ratio was dominated by roots. Leaf and root N and P contents and N:P were generally unrelated to soil nutrients, and the former presented lower variation than the latter, indicating strong stoichiometric homeostasis in ephemerals. These results demonstrate that, regardless of soil nutrient supply capacity in this region, the fast-growing ephemeral plants have formed a specific leaf-root-soil stoichiometric relation and nutrient use strategy adapted to the extreme desert environment.
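The 3/4-power scaling of N against P and the geometric means quoted above can be illustrated with a minimal sketch. It assumes the exponent is estimated as the slope of an ordinary least-squares fit on log10-transformed values; the abstract does not say which regression was used (reduced major axis regression is also common in stoichiometry), and the function names and data are hypothetical.

```python
import math


def scaling_exponent(p_values, n_values):
    """Fit log10(N) = intercept + slope * log10(P) by ordinary least squares.

    Returns (slope, intercept); the slope is the allometric scaling exponent.
    """
    xs = [math.log10(p) for p in p_values]
    ys = [math.log10(n) for n in n_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Synthetic data generated with an exact 3/4-power relationship.
p = [0.5, 1.0, 2.0, 4.0]
n = [10.0 * value ** 0.75 for value in p]
slope, intercept = scaling_exponent(p, n)
print(f"estimated exponent: {slope:.3f}")
```

An exponent below 1 means N increases more slowly than P across samples, which is why N:P declines as P content rises.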
Subject(s)
Ecosystem , Soil , China , Nitrogen/analysis , Phosphorus , Plant Leaves/chemistry , Plants
ABSTRACT
Although the biochemical and genetic basis of lipid metabolism is clear in Arabidopsis, there is limited information concerning the relevant genes in Glycine max (soybean). To address this issue, we constructed three-dimensional genetic networks using six seed oil-related traits, 52 lipid metabolism-related metabolites and 54 294 SNPs in 286 soybean accessions in total. As a result, 284 and 279 candidate genes were found to be significantly associated with seed oil-related traits and metabolites by phenotypic and metabolic genome-wide association studies and multi-omics analyses, respectively. Using minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) analyses, six seed oil-related traits were found to be significantly related to 31 metabolites. Among the above candidate genes, 36 genes were found to be associated with oil synthesis (27 genes), amino acid synthesis (four genes) and the tricarboxylic acid (TCA) cycle (five genes), and four genes (GmFATB1a, GmPDAT, GmPLDα1 and GmDAGAT1) are already known to be related to oil synthesis. Using this information, 133 three-dimensional genetic networks were constructed, 24 of which are known, e.g. pyruvate-GmPDAT-GmFATA2-oil content. Using these networks, GmPDAT, GmAGT and GmACP4 reveal the genetic relationships between pyruvate and the three major nutrients, and GmPDAT, GmZF351 and GmPgs1 reveal the genetic relationships between amino acids and seed oil content. In addition, GmCds1, along with average temperature in July and the rainfall from June to September, influence seed oil content across years. This study provides a new approach for the construction of three-dimensional genetic networks and reveals new information for soybean seed oil improvement and the identification of gene function.
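For readers unfamiliar with the two penalties named above, the following sketch writes out the standard minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) functions for a single coefficient. It shows only the penalty shapes, not the authors' analysis pipeline; the default tuning values (gamma = 3.0, a = 3.7) are commonly used choices and are an assumption here, not values taken from the study.

```python
def mcp_penalty(beta: float, lam: float, gamma: float = 3.0) -> float:
    """Minimax concave penalty for one coefficient (requires gamma > 1)."""
    b = abs(beta)
    if b <= gamma * lam:
        # Lasso-like near zero, with the penalty rate tapering off linearly.
        return lam * b - b * b / (2.0 * gamma)
    # Constant beyond gamma * lam: large effects are not shrunk further.
    return 0.5 * gamma * lam * lam


def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """Smoothly clipped absolute deviation penalty (requires a > 2)."""
    b = abs(beta)
    if b <= lam:
        return lam * b                      # identical to the lasso penalty
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0      # constant for large effects


for effect in (0.5, 1.0, 2.0, 5.0):
    print(effect, round(mcp_penalty(effect, 1.0), 3), round(scad_penalty(effect, 1.0), 3))
```

Both penalties behave like the lasso for small coefficients but level off for large ones, so strong trait-metabolite associations are selected with less shrinkage bias.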
Subject(s)
Gene Regulatory Networks/genetics , Plant Genes/genetics , Glycine max/genetics , Seeds/genetics , Soybean Oil/genetics , Genome-Wide Association Study , Lipid Metabolism/genetics , Protein Interaction Maps/genetics , Quantitative Trait Loci/genetics , Heritable Quantitative Trait , Seeds/metabolism , Soybean Oil/metabolism , Glycine max/metabolism
ABSTRACT
Compared with genomic data of individual markers, haplotype data provide higher resolution for DNA variants, advancing our knowledge in genetics and evolution. Although many computational and experimental phasing methods have been developed for analyzing diploid genomes, it remains challenging to reconstruct chromosome-scale haplotypes at low cost, which constrains the utility of this valuable genetic resource. Gamete cells, the natural packaging of haploid complements, are ideal materials for phasing entire chromosomes because the majority of the haplotypic allele combinations has been preserved. Therefore, compared with the current diploid-based phasing methods, using haploid genomic data of single gametes may substantially reduce the complexity in inferring the donor's chromosomal haplotypes. In this study, we developed the first easy-to-use R package, Hapi, for inferring chromosome-length haplotypes of individual diploid genomes with only a few gametes. Hapi outperformed other phasing methods when analyzing both simulated and real single gamete cell sequencing data sets. The results also suggested that chromosome-scale haplotypes may be inferred by using as few as three gametes, which has pushed the boundary to its possible limit. The single gamete cell sequencing technology allied with the cost-effective Hapi method will make large-scale haplotype-based genetic studies feasible and affordable, promoting the use of haplotype data in a wide range of research.
Subject(s)
Genetic Techniques , Germ Cells , Haplotypes , Software , Chromosomes , Humans , Genetic Recombination , Zea mays
ABSTRACT
In the genetic system that regulates complex traits, metabolites, gene expression levels, RNA editing levels and DNA methylation, a series of small and linked genes exist. To date, however, little is known about how to design an efficient framework for the detection of these kinds of genes. In this article, we propose a genome-wide composite interval mapping (GCIM) in F2. First, controlling polygenic background via selecting markers in the genome scanning of linkage analysis was replaced by estimating polygenic variance in a genome-wide association study. This can control large, middle and minor polygenic backgrounds in genome scanning. Then, additive and dominant effects for each putative quantitative trait locus (QTL) were separately scanned so that a negative logarithm P-value curve against genome position could be separately obtained for each kind of effect. In each curve, all the peaks were identified as potential QTLs. Thus, almost all the small-effect and linked QTLs are included in a multi-locus model. Finally, adaptive least absolute shrinkage and selection operator (adaptive lasso) was used to estimate all the effects in the multi-locus model, and all the nonzero effects were further identified by likelihood ratio test for true QTL identification. This method was used to reanalyze four rice traits. Among 25 known genes detected in this study, 16 small-effect genes were identified only by GCIM. To further demonstrate GCIM, a series of Monte Carlo simulation experiments was performed. As a result, GCIM is demonstrated to be more powerful than the widely used methods for the detection of closely linked and small-effect QTLs.
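The step in which every peak of a negative logarithm P-value curve is taken as a potential QTL can be sketched as a simple local-maximum search. This is an illustration only: the function name, the optional height filter, and the example curve are hypothetical, and the actual GCIM implementation may define peaks differently.

```python
def local_peaks(positions, neg_log_p, min_height=0.0):
    """Return (position, value) pairs at local maxima of a -log10(P) curve.

    A point is a peak if it is strictly higher than its left neighbour and
    at least as high as its right neighbour, so a flat top is reported once.
    """
    peaks = []
    last = len(neg_log_p) - 1
    for i, value in enumerate(neg_log_p):
        left = neg_log_p[i - 1] if i > 0 else float("-inf")
        right = neg_log_p[i + 1] if i < last else float("-inf")
        if value > left and value >= right and value >= min_height:
            peaks.append((positions[i], value))
    return peaks


# Hypothetical scan along one chromosome (positions in cM).
positions = [0, 5, 10, 15, 20, 25]
curve = [0.2, 1.5, 0.4, 0.3, 2.8, 1.0]
print(local_peaks(positions, curve))
```

Keeping all peaks, including low ones, is what lets small-effect and closely linked loci enter the multi-locus model, where shrinkage and the likelihood ratio test then decide which effects are retained.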
Subject(s)
Genetic Models , Quantitative Trait Loci , DNA Methylation , Genetic Linkage , Humans , Monte Carlo Method
ABSTRACT
Flowering time (FT) and plant height (PH) are important agronomic traits in soybean. However, their genetic foundations are not fully understood. Thus, in this study, a total of 106,013 single nucleotide polymorphisms in 286 soybean accessions were used to associate with the first and full FT (FT1 and FT2) and PH in 4 environments and their BLUP values using 6 multi-locus genome-wide association study methods. As a result, 38, 43, and 27 stable quantitative trait nucleotides (QTNs) were identified, respectively, for FT1, FT2, and PH across at least 3 methods and/or environments. Among these QTNs for FT1, FT2, and PH, 31, 36, and 21 were found to have significant phenotype differences across 2 alleles; 22, 18, and 13 were consistent with the corresponding loci in previous studies; 13 and 8 genes, with more than average expression level, around 64 FT and 27 PH QTNs were predicted as their corresponding candidate genes. Among these candidate genes, GmPRR3b, and GmGIa for FT, and GmTFL1b for PH were known, while some were new, e.g., GmPHYA4, GmVRN5, GmFPA, and GmSPA1 for FT, and Glyma.02g300200, GmFPA, and Glyma.13g339800 for PH. All the validated QTNs were used to design the best cross-combinations in 2 FT directions. In each FT direction, the best 5 cross-combinations were predicted, such as Heihe 54 × Qincha 1 for early FT, and Yingdejiadou × Wuhuabayuehuang for late FT. This study provides solid foundations for genetic basis, molecular biology, and breeding by design of soybean FT and PH. Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-021-01230-3.
ABSTRACT
Excess organic carbon is often added to meet denitrification requirements during municipal wastewater treatment, resulting in carbon waste and an increased risk of secondary pollution. In this study, a microbial fuel cell (MFC) was coupled with an up-flow denitrification biofilter (BF), and the long-term denitrification performance and power output were investigated under different carbon source concentrations. With sodium acetate (NaAc) at 600 mg/L and 300 mg/L, favorable denitrification efficiencies were obtained (98.60%) and a stable current output was maintained (0.44-0.48 mA). With NaAc at 150 mg/L, the denitrification efficiency remained high (89.31%) and the current output was maintained at 0.12 mA, whereas the denitrification efficiency dropped to 71.34% without the coupled MFC. Electron balance analysis indicated that both nitrate removal and electron recovery efficiencies were higher in MFC-BF than in BF, verifying the improved denitrification and carbon utilization performance. Coupling the MFC significantly altered the bacterial community structure and composition, and the abundance and distribution of bacterial genera differed among locations. Compared with BF, more exoelectrogenic genera (Desulfobacterium, Trichococcus) and genera with both denitrifying and electrogenic functions (Dechloromonas, Geobacter) were dominant in MFC-BF. In contrast, the dominant genera in BF were Dechloromonas, Desulfomicrobium, and Acidovorax, among others. With the MFC coupled, a more complex and diversified network and closer interactions between the dominant potential functional genera were found. The study provides a feasible approach to effectively improve denitrification efficiency and organic carbon recovery in deep denitrification processes.
Subject(s)
Bioelectric Energy Sources , Water Purification , Bacteria , Bioreactors , Denitrification , Nitrogen/analysis , Wastewater
ABSTRACT
The mixed linear model has been widely used in genome-wide association studies (GWAS), but its application to multi-locus GWAS analysis has not been explored and assessed. Here, we implemented a fast multi-locus random-SNP-effect EMMA (FASTmrEMMA) model for GWAS. The model is built on random single nucleotide polymorphism (SNP) effects and a new algorithm. This algorithm whitens the covariance matrix of the polygenic matrix K and environmental noise, and specifies the number of nonzero eigenvalues as one. The model first chooses all putative quantitative trait nucleotides (QTNs) with P-values ≤ 0.005 and then includes them in a multi-locus model for true QTN detection. Owing to the multi-locus feature, the Bonferroni correction is replaced by a less stringent selection criterion. Results from analyses of both simulated and real data showed that FASTmrEMMA is more powerful in QTN detection and model fit, has less bias in QTN effect estimation and requires less running time than existing single- and multi-locus methods, such as empirical Bayes, settlement of mixed linear model under progressively exclusive relationship (SUPER), efficient mixed model association (EMMA), compressed MLM (CMLM) and enriched CMLM (ECMLM). FASTmrEMMA provides an alternative for multi-locus GWAS.
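The first stage described above, a genome scan that keeps every marker with a P-value of at most 0.005 for the later multi-locus model, can be sketched as follows. This is a deliberately simplified stand-in and not FASTmrEMMA itself: it assumes a plain single-marker linear regression with no polygenic term or whitening, and a normal approximation for the test statistic. The function names and data are hypothetical.

```python
import math


def marker_p_value(genotypes, phenotypes):
    """Two-sided P-value for the slope of phenotype on one marker.

    Simplification: ordinary regression with a normal approximation,
    not the whitened random-SNP-effect model of the paper.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        return 1.0  # monomorphic marker carries no information
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    slope = sxy / sxx
    rss = sum((y - mean_y - slope * (g - mean_g)) ** 2
              for g, y in zip(genotypes, phenotypes))
    if rss == 0:
        return 0.0  # perfect fit
    z = abs(slope) / math.sqrt(rss / (n - 2) / sxx)
    return math.erfc(z / math.sqrt(2.0))


def select_putative_qtns(marker_matrix, phenotypes, threshold=0.005):
    """Indices of markers whose single-marker P-value is <= threshold."""
    return [j for j, genotypes in enumerate(marker_matrix)
            if marker_p_value(genotypes, phenotypes) <= threshold]
```

In use, `marker_matrix` is a list with one genotype list per marker. The returned indices would then be fitted jointly in a multi-locus model, which is where the paper's less stringent final criterion replaces the Bonferroni correction.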
Subject(s)
Algorithms , Arabidopsis Proteins/genetics , Arabidopsis/genetics , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism , Bayes Theorem , Computer Simulation , Linear Models , Genetic Models , Multifactorial Inheritance , Phenotype
ABSTRACT
Seed oil traits in soybean that are of benefit to human nutrition and health have been selected for during crop domestication. However, these domesticated traits have significant differences across various evolutionary types. In this study, we found that the integration of evolutionary population structure (evolutionary types) with genome-wide association studies increased the power of gene detection, and it identified one locus for traits related to seed size and oil content on chromosome 13. This domestication locus, together with another one in a 200-kb region, was confirmed by the GEMMA and EMMAX software. The candidate gene, GmPDAT, had higher expressional levels in high-oil and large-seed accessions than in low-oil and small-seed accessions. Overexpression lines had increased seed size and oil content, whereas RNAi lines had decreased seed size and oil content. The molecular mechanism of GmPDAT was deduced based on results from linkage analysis for triacylglycerols and on histocytological comparisons of transgenic soybean seeds. Our results illustrate a new approach for identifying domestication genes with pleiotropic effects.
Subject(s)
Genome-Wide Association Study , Glycine max , Domestication , Quantitative Trait Loci/genetics , Seeds/genetics , Glycine max/genetics
ABSTRACT
Marker segregation distortion is a natural phenomenon. Severely distorted markers are usually excluded in the construction of linkage maps. We investigated the effect of marker segregation distortion on linkage map construction and quantitative trait locus (QTL) mapping. A total of 519 recombinant inbred lines of soybean from orthogonal and reciprocal crosses between LSZZH and NN493-1 were genotyped by specific length amplified fragment markers, and seed linoleic acid content was measured in three environments. As a result, twenty linkage groups were constructed with 11,846 markers, including 1513 (12.77%) significantly distorted markers, on 20 chromosomes, and the map length was 2475.86 cM with an average marker interval of 0.21 cM. The inclusion of distorted markers in the analysis not only improved the grouping of markers from the same chromosomes and the consistency of linkage maps with the genome, but also increased genome coverage by markers. Combining genotypic data from both orthogonal and reciprocal crosses decreased the proportion of distorted markers and thus improved the quality of linkage maps. The linkage maps were validated by the high collinearity between positions of markers in the soybean reference genome and in linkage maps, and by the high consistency of 24 QTL regions in this study with previously reported QTLs and lipid metabolism related genes. Additionally, linkage maps that include distorted markers could add more information to the outputs from QTL mapping. These results provide important information for linkage mapping, gene cloning and marker-assisted selection in soybean.
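Whether a marker counts as "significantly distorted" is commonly decided with a chi-square goodness-of-fit test against the expected segregation ratio. The sketch below assumes a 1:1 expectation for the two parental genotypes in recombinant inbred lines and a 0.05 significance level; the abstract does not state the test or threshold used, and the genotype counts are hypothetical.

```python
import math


def segregation_chi_square(count_a: int, count_b: int):
    """Chi-square test (1 df) of observed genotype counts against 1:1.

    Returns (chi2, p_value). For one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("no genotyped lines for this marker")
    expected = total / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Two hypothetical markers scored on 519 lines.
for counts in ((260, 259), (300, 219)):
    chi2, p = segregation_chi_square(*counts)
    label = "distorted" if p < 0.05 else "not distorted"
    print(counts, f"chi2 = {chi2:.2f}", label)
```

A test like this only flags markers; the point of the study is that flagged markers can still be kept in the map rather than discarded.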
Subject(s)
Chromosome Mapping , Genetic Linkage , Genotype , Glycine max/genetics , Heritable Quantitative Trait
ABSTRACT
Although the legume-rhizobium symbiosis is a highly important biological process, there is limited knowledge about the protein interaction network between host and symbiont. Using interolog- and domain-based approaches, we constructed an interspecies protein interactome containing 5115 protein-protein interactions between 2291 Glycine max and 290 Bradyrhizobium diazoefficiens USDA 110 proteins. The interactome was further validated by expression pattern analysis in nodules, gene ontology term semantic similarity, co-expression analysis, and luciferase complementation image assay. In the G. max-B. diazoefficiens interactome, bacterial proteins are mainly ion channels and transporters of carbohydrates and cations, while G. max proteins are mainly involved in the processes of metabolism, signal transduction, and transport. We also identified the top 10 highly interacting proteins (hubs) for each species. Kyoto Encyclopedia of Genes and Genomes pathway analysis for each hub showed that a pair of 14-3-3 proteins (SGF14g and SGF14k) and 5 heat shock proteins in G. max are possibly involved in symbiosis, and 10 hubs in B. diazoefficiens may be important symbiotic effectors. Subnetwork analysis showed that 18 symbiosis-related soluble N-ethylmaleimide sensitive factor attachment protein receptor proteins may play roles in regulating bacterial ion channels, and SGF14g and SGF14k possibly regulate the rhizobium dicarboxylate transport protein DctA. The predicted interactome provides a valuable basis for understanding the molecular mechanism of nodulation in soybean.
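The interolog-based approach mentioned above rests on a simple transfer rule: if two proteins interact in a reference dataset, their orthologs in the host and the symbiont are predicted to interact as well. The sketch below shows that rule only; the identifiers and ortholog tables are hypothetical placeholders, and the study's actual pipeline (reference databases, ortholog criteria, domain-based predictions, filtering) is not reproduced.

```python
def predict_interologs(reference_ppis, host_orthologs, symbiont_orthologs):
    """Transfer reference interactions to host-symbiont protein pairs.

    reference_ppis: iterable of (protein_a, protein_b) known interactions.
    host_orthologs / symbiont_orthologs: dicts mapping a reference protein
    to the list of its orthologs in the host or symbiont proteome.
    Returns a set of (host_protein, symbiont_protein) predictions.
    """
    predicted = set()
    for a, b in reference_ppis:
        # Interactions are undirected, so try both orientations.
        for host_ref, symbiont_ref in ((a, b), (b, a)):
            for host_protein in host_orthologs.get(host_ref, ()):
                for symbiont_protein in symbiont_orthologs.get(symbiont_ref, ()):
                    predicted.add((host_protein, symbiont_protein))
    return predicted


# Hypothetical identifiers for illustration only.
reference = [("R1", "R2"), ("R3", "R4")]
host = {"R1": ["Gm_A"], "R4": ["Gm_B"]}
symbiont = {"R2": ["Bd_X", "Bd_Y"], "R3": ["Bd_Z"]}
print(sorted(predict_interologs(reference, host, symbiont)))
```

Predictions made this way are hypotheses, which is why the abstract describes several independent validation steps.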
Subject(s)
Bacterial Proteins/metabolism , Bradyrhizobium/metabolism , Computational Biology/methods , Glycine max/metabolism , Plant Proteins/metabolism , Protein Interaction Maps , 14-3-3 Proteins/genetics , 14-3-3 Proteins/metabolism , Bacterial Proteins/classification , Bacterial Proteins/genetics , Bradyrhizobium/genetics , Dicarboxylic Acid Transporters/genetics , Dicarboxylic Acid Transporters/metabolism , Gene Expression , Gene Ontology , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Ion Channels/genetics , Ion Channels/metabolism , Molecular Sequence Annotation , Nitrogen Fixation/physiology , Plant Proteins/classification , Plant Proteins/genetics , Protein Binding , Protein Interaction Mapping , Plant Root Nodules/genetics , Plant Root Nodules/metabolism , Plant Root Nodules/microbiology , SNARE Proteins/genetics , SNARE Proteins/metabolism , Glycine max/genetics , Glycine max/microbiology , Symbiosis/physiology
ABSTRACT
BACKGROUND: Rapeseed (Brassica napus L.) and soybean (Glycine max L.) seeds are rich in both protein and oil, which are major sources of biofuels and nutrition. Although seed oil content differs between soybean (~20%) and rapeseed (~40%), little is known about the underlying molecular mechanism. RESULTS: An integrated omics analysis was performed in soybean, rapeseed, Arabidopsis (Arabidopsis thaliana L. Heynh), and sesame (Sesamum indicum L.), based on Arabidopsis acyl-lipid metabolism- and carbon metabolism-related genes. As a result, candidate genes and their transcription factors and microRNAs, along with phylogenetic analysis and co-expression network analysis of the PEPC gene family, were found to be largely associated with the difference between the two species. First, three soybean genes (Glyma.13G148600, Glyma.13G207900 and Glyma.12G122900) co-expressed with GmPEPC1 are specifically enriched during seed storage protein accumulation stages, while the expression of BnPEPC1 is putatively inhibited by bna-miR169, and two genes, BnSTKA and BnCKII, are co-expressed with BnPEPC1 and are specifically associated with plant circadian rhythm, which is related to seed oil biosynthesis. Then, in de novo fatty acid synthesis there are rapeseed-specific genes encoding subunits β-CT (BnaC05g37990D) and BCCP1 (BnaA03g06000D) of heterogeneous ACCase, which could interfere with synthesis rate, and β-CT is positively regulated by four transcription factors (BnaA01g37250D, BnaA02g26190D, BnaC01g01040D and BnaC07g21470D). In triglyceride synthesis, GmLPAAT2 is putatively inhibited by three miRNAs (gma-miR171, gma-miR1516 and gma-miR5775). Finally, in rapeseed there was evidence for the expansion of the gene families CALO, OBO and STERO, related to lipid storage, and the contraction of the gene families LOX, LAH and HSI2, related to oil degradation.
CONCLUSIONS: The molecular mechanisms associated with differences in seed oil content provide the basis for future breeding efforts to improve seed oil content.
Subject(s)
Brassica napus/metabolism , Glycine max/metabolism , Rapeseed Oil/analysis , Seeds/chemistry , Soybean Oil/analysis , Arabidopsis/chemistry , Arabidopsis/genetics , Arabidopsis/metabolism , Brassica napus/chemistry , Brassica napus/genetics , Gene Expression Regulation, Plant , Genes, Plant/genetics , Lipids/biosynthesis , Metabolic Networks and Pathways/genetics , MicroRNAs/genetics , Phylogeny , Plant Oils/analysis , Plant Oils/metabolism , Rapeseed Oil/metabolism , Sequence Alignment , Sesamum/chemistry , Sesamum/genetics , Sesamum/metabolism , Soybean Oil/metabolism , Glycine max/chemistry , Glycine max/genetics , Transcription Factors/genetics
ABSTRACT
Although nonparametric methods in genome-wide association studies (GWAS) are robust in quantitative trait nucleotide (QTN) detection, the absence of polygenic background control in single-marker genome-wide scans results in a high false positive rate. To overcome this issue, we proposed an integrated nonparametric method for multi-locus GWAS. First, a new model transformation was used to whiten the covariance matrix of the polygenic matrix K and environmental noise. Using the transformed model, the Kruskal-Wallis test along with least angle regression was then used to select all the markers that were potentially associated with the trait. Finally, all the selected markers were placed into a multi-locus model, their effects were estimated by empirical Bayes, and all the nonzero effects were further tested by a likelihood ratio test for true QTN detection. This method, named pKWmEB, was validated by a series of Monte Carlo simulation studies. As a result, pKWmEB effectively controlled the false positive rate, although a less stringent significance criterion was adopted. More importantly, pKWmEB retained the high power of the Kruskal-Wallis test and provided QTN effect estimates. To further validate pKWmEB, we re-analyzed four flowering time related traits in Arabidopsis thaliana and detected some previously reported genes that were not identified by the other methods.
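The single-marker Kruskal-Wallis screening step named in the abstract above can be illustrated with a short sketch. It computes the standard Kruskal-Wallis H statistic for one marker by grouping phenotype ranks by genotype class. This is an assumption-laden illustration rather than the pKWmEB implementation: the function names are hypothetical, the tie correction is omitted, and the whitening transformation, least angle regression, empirical Bayes and likelihood ratio test steps are not shown.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(phenotypes, genotypes):
    """Kruskal-Wallis H statistic (no tie correction) for one marker.

    phenotypes: list of trait values, one per individual.
    genotypes: list of genotype codes (e.g. 0/1/2), same length.
    """
    n = len(phenotypes)
    ranks = average_ranks(phenotypes)
    rank_sums = defaultdict(float)
    counts = defaultdict(int)
    for rank, genotype in zip(ranks, genotypes):
        rank_sums[genotype] += rank
        counts[genotype] += 1
    between = sum(rank_sums[g] ** 2 / counts[g] for g in counts)
    return 12.0 / (n * (n + 1)) * between - 3.0 * (n + 1)


# Toy data: the first marker separates low and high phenotypes, the second does not.
trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
h_associated = kruskal_wallis_h(trait, [0, 0, 0, 2, 2, 2])
h_unassociated = kruskal_wallis_h(trait, [0, 2, 0, 2, 0, 2])
```

A larger H indicates stronger separation of phenotype ranks across genotype classes; in practice H is compared with a chi-square distribution with (number of genotype classes - 1) degrees of freedom.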
Subject(s)
Genome-Wide Association Study/methods , Models, Genetic , Multifactorial Inheritance , Arabidopsis/genetics , Arabidopsis/physiology , Bayes Theorem , Computer Simulation , Flowers/genetics , Flowers/physiology , Likelihood Functions , Monte Carlo Method
ABSTRACT
A genome-wide association study (GWAS) entails examining a large number of single nucleotide polymorphisms (SNPs) in a limited sample of hundreds of individuals, which implies a variable selection problem in a high-dimensional dataset. Although many single-locus GWAS approaches with polygenic background and population structure controls have been widely used, some significant loci fail to be detected. In this study, we used an iterative modified-sure independence screening (ISIS) approach to reduce the number of SNPs to a moderate size. Expectation-Maximization (EM)-Bayesian least absolute shrinkage and selection operator (BLASSO) was then used to estimate all the selected SNP effects for true quantitative trait nucleotide (QTN) detection. This method is referred to as the ISIS EM-BLASSO algorithm. Monte Carlo simulation studies validated the new method, which had the highest empirical power in QTN detection, the highest accuracy in QTN effect estimation, and the fastest running time, as compared with efficient mixed-model association (EMMA), smoothly clipped absolute deviation (SCAD), fixed and random model circulating probability unification (FarmCPU), and multi-locus random-SNP-effect mixed linear model (mrMLM). To further demonstrate the new method, six flowering time traits in Arabidopsis thaliana were re-analyzed by four methods (the new method, EMMA, FarmCPU, and mrMLM). As a result, the new method identified the most previously reported genes. Therefore, the new method is a good alternative for multi-locus GWAS.
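The screening idea behind the method in the abstract above can be sketched in its basic form: rank every SNP by the absolute marginal correlation between its genotype codes and the phenotype, then keep only the top-ranked SNPs for the later multi-locus model. The sketch below shows a single pass of plain sure independence screening only. The iterative, modified version used in the paper and the EM-BLASSO step are not reproduced, and the function names and the `keep` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sis_screen(snp_columns, phenotype, keep):
    """Return sorted indices of the `keep` SNPs with the largest |correlation|.

    snp_columns: list of per-SNP genotype vectors (one value per individual).
    phenotype: list of trait values, one per individual.
    """
    scores = [(abs(pearson(col, phenotype)), idx)
              for idx, col in enumerate(snp_columns)]
    # Highest score first; ties are broken by the lower SNP index.
    scores.sort(key=lambda item: (-item[0], item[1]))
    return sorted(idx for _, idx in scores[:keep])


# Toy data: SNPs 0 and 2 track the trait (in opposite directions), SNP 3 is monomorphic.
trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
snps = [
    [0, 0, 0, 2, 2, 2],
    [0, 2, 0, 2, 0, 2],
    [2, 2, 2, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
selected = sis_screen(snps, trait, keep=2)
```

Using the absolute value means that SNPs with positive and negative effects are treated alike, and monomorphic markers receive a score of zero instead of raising a division error.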
Subject(s)
Algorithms , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Genetic Association Studies/methods , Polymorphism, Single Nucleotide/genetics , Quantitative Trait, Heritable , Bayes Theorem , Computer Simulation , Data Interpretation, Statistical , Genetic Markers/genetics , Likelihood Functions , Models, Genetic , Models, Statistical
ABSTRACT
BACKGROUND AND AIM: Increasing evidence confirms that potassium channels are essential for lymphocyte activation, suggesting an involvement in the development of hypertension. Moreover, chronic inflammation is regarded as a direct or indirect manifestation of hypertension, highlighting the theoretical mechanisms. In this study, we investigated changes in KCa3.1 potassium channel expression in the blood of hypertensive and healthy Kazakh people in north-west China. METHODS: Flow cytometry was used for T-lymphocyte subtype analysis. Changes in the messenger RNA and protein expression of the KCa3.1 potassium channel in CD4+ T lymphocytes were detected using real-time quantitative polymerase chain reaction and western blots, using CD4+ T-cell samples from hypertensive Kazakh patients divided into candesartan and TRAM-34 treatment groups, and from healthy controls. Peripheral blood CD4+ T lymphocytes were activated and proliferated in vitro and then incubated for 0, 24, and 48 h under various treatment conditions. Changes in CD4+ T-lymphocytic proliferation were determined using Cell Counting Kit-8 and electron microscopy. RESULTS: Expression of KCa3.1 was significantly higher in the hypertensive patients than in the controls (p < 0.05). Compared with the healthy group, Kazakh hypertensive patients had a reduced proportion of CD4+ T lymphocytes (p < 0.05). Candesartan and TRAM-34 intervention for 24 h and 48 h inhibited the expression of Kv1.3 and KCa3.1 at the mRNA and protein levels (p < 0.05). CONCLUSIONS: The increase in functional KCa3.1 channels expressed in CD4+ T lymphocytes of Kazakh patients with hypertension was blocked by candesartan, providing theoretical support for hypertension treatment at the cellular ion channel level. Candesartan may potentially regulate hypertensive inflammatory responses by inhibiting T-lymphocytic proliferation and KCa3.1 potassium channel expression in CD4+ T lymphocytes.