1.
Plant Commun ; : 100985, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38859587

ABSTRACT

Chromatin interactions create spatial proximity between distal regulatory elements and target genes in the genome, which has an important impact on gene expression, transcriptional regulation, and phenotypic traits. To date, several methods have been developed for predicting gene expression. However, existing methods do not take into consideration the impact of chromatin interactions on target gene expression, thus potentially reducing the accuracy of gene expression prediction and of mining important regulatory elements. In this study, a highly accurate deep learning-based gene expression prediction model (DeepCBA) based on maize chromatin interaction data was developed. Compared with existing models, DeepCBA exhibits higher accuracy in expression classification and expression value prediction. The average Pearson correlation coefficients (PCC) for predicting gene expression using gene promoter proximal interactions, proximal-distal interactions, and proximal and distal interactions were 0.818, 0.625, and 0.929, respectively, representing increases of 0.357, 0.16, and 0.469 over the PCC of traditional methods that only use gene proximal sequences. Some important motifs were identified through DeepCBA and were found to be enriched in open chromatin regions and expression quantitative trait loci (eQTL) and to show tissue specificity. Importantly, the experimental results for the maize flowering-related gene ZmRap2.7 and the tillering-related gene ZmTb1 demonstrate the feasibility of using DeepCBA to explore regulatory elements that affect gene expression. Moreover, promoter editing and verification of two reported genes (ZmCLE7, ZmVTE4) demonstrated the potential of DeepCBA for the precise design of gene expression and even future intelligent breeding. DeepCBA is available at http://www.deepcba.com/ or http://124.220.197.196/.
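
The accuracy figures in this abstract are Pearson correlation coefficients between predicted and measured expression values. As a minimal sketch of that metric only (not of DeepCBA itself), the standard formula can be computed as follows; the function name and the toy values in the usage note are illustrative assumptions, not data from the study.

```python
import math


def pearson_correlation(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Covariance and variances, without the 1/n factors (they cancel out).
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)
```

Calling `pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` on perfectly linearly related toy values gives a coefficient of 1 (up to floating-point rounding).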

2.
Sci China Life Sci ; 67(6): 1133-1154, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38568343

ABSTRACT

Detecting genes that affect specific traits (such as human diseases and crop yields) is important for treating complex diseases and improving crop quality. A genome-wide association study (GWAS) provides new insights and directions for understanding complex traits by identifying important single nucleotide polymorphisms. Many GWAS summary statistics data related to various complex traits have been gathered recently. Studies have shown that GWAS risk loci and expression quantitative trait loci (eQTLs) often overlap substantially, which has made gene expression an important intermediary for revealing the regulatory role of GWAS loci. In this review, we survey three types of gene-trait association detection methods that integrate GWAS summary statistics and eQTL data, namely colocalization methods, transcriptome-wide association study-oriented approaches, and Mendelian randomization-related methods. At the theoretical level, we discuss the differences, relationships, advantages, and disadvantages of various algorithms in the three kinds of gene-trait association detection methods. To further discuss the performance of various methods, we summarize the significant gene sets that influence high-density lipoprotein, low-density lipoprotein, total cholesterol, and triglyceride reported in 16 studies. We discuss the performance of various algorithms using the datasets of the four lipid traits. The advantages and limitations of various algorithms are analyzed based on experimental results, and we suggest directions for follow-up studies on detecting gene-trait associations.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Genome-Wide Association Study/methods , Humans , Algorithms , Mendelian Randomization Analysis , Transcriptome/genetics
3.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36917472

ABSTRACT

Identifying the function of DNA sequences accurately is an essential and challenging task in the genomic field. Until now, deep learning has been widely used in the functional analysis of DNA sequences, including DeepSEA, DanQ, DeepATT and TBiNet. However, these methods have the problems of high computational complexity and not fully considering the distant interactions among chromatin features, thus affecting the prediction accuracy. In this work, we propose a hybrid deep neural network model, called DeepFormer, based on convolutional neural network (CNN) and flow-attention mechanism for DNA sequence function prediction. In DeepFormer, the CNN is used to capture the local features of DNA sequences as well as important motifs. Based on the conservation law of flow network, the flow-attention mechanism can capture more distal interactions among sequence features with linear time complexity. We compare DeepFormer with the above four kinds of classical methods using the commonly used dataset of 919 chromatin features of nearly 4.9 million noncoding DNA sequences. Experimental results show that DeepFormer significantly outperforms four kinds of methods, with an average recall rate at least 7.058% higher than other methods. Furthermore, we confirmed the effectiveness of DeepFormer in capturing functional variation using Alzheimer's disease, pathogenic mutations in alpha-thalassemia and modification in CCCTC-binding factor (CTCF) activity. We further predicted the maize chromatin accessibility of five tissues and validated the generalization of DeepFormer. The average recall rate of DeepFormer exceeds the classical methods by at least 1.54%, demonstrating strong robustness.


Subject(s)
Genomics , Neural Networks, Computer , Base Sequence , Genomics/methods , Chromatin/genetics , Genome
4.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36342190

ABSTRACT

MOTIVATION: The question of how to construct gene regulatory networks has long been a focus of biological research. Mutual information can be used to measure nonlinear relationships, and it has been widely used in the construction of gene regulatory networks. However, this method cannot measure indirect regulatory relationships under the influence of multiple genes, which reduces the accuracy of inferring gene regulatory networks. APPROACH: This work proposes a method for constructing gene regulatory networks based on mixed entropy optimizing context-related likelihood mutual information (MEOMI). First, two entropy estimators were combined to calculate the mutual information between genes. Then, distribution optimization was performed using a context-related likelihood algorithm to eliminate some indirect regulatory relationships and obtain the initial gene regulatory network. To obtain the complex interaction between genes and eliminate redundant edges in the network, the initial gene regulatory network was further optimized by calculating the conditional mutual inclusive information (CMI2) between gene pairs under the influence of multiple genes. The network was iteratively updated to reduce the impact of mutual information on the overestimation of the direct regulatory intensity. RESULTS: The experimental results show that the MEOMI method performed better than several other kinds of gene network construction methods on DREAM challenge simulated datasets (DREAM3 and DREAM5), three real Escherichia coli datasets (E.coli SOS pathway network, E.coli SOS DNA repair network and E.coli community network) and two human datasets. AVAILABILITY AND IMPLEMENTATION: Source code and dataset are available at https://github.com/Dalei-Dalei/MEOMI/ and http://122.205.95.139/MEOMI/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
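
The building block of the approach described above is the mutual information between two genes. The sketch below shows only a plain plug-in (frequency-count) estimate for already discretized expression levels. It is an assumption-laden illustration: it is not the mixed entropy estimator, the context-related likelihood step, or the CMI2 step of MEOMI, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi
```

For two identical binary profiles such as `[0, 0, 1, 1]` the estimate is 1 bit, and for the independent pair `[0, 0, 1, 1]` and `[0, 1, 0, 1]` it is 0. Pairwise scores of this kind cannot by themselves separate direct from indirect regulation, which is the limitation the abstract sets out to address.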


Subject(s)
Computational Biology , Gene Regulatory Networks , Humans , Entropy , Computational Biology/methods , Probability , Algorithms , Escherichia coli/genetics
5.
Build Environ ; 218: 109153, 2022 Jun 15.
Article in English | MEDLINE | ID: mdl-35531051

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic has posed substantial challenges to worldwide health systems in quick response to epidemics. The assessment of personal exposure to COVID-19 in enclosed spaces is critical to identifying potential infectees and preventing outbreaks. However, traditional contact tracing methods rely heavily on a manual interview, which is costly and time consuming given the large population involved. With advanced indoor localisation techniques, it is possible to collect people's footprints accurately by locating their smartphones. This study presents a new framework for the assessment of personal exposure to COVID-19 carriers using their fine-grained trajectory data. An integral model was established to quantify the exposure risk, in which the spatial and temporal decay effects are simultaneously considered when modelling the airborne transmission of COVID-19. Regarding the obstacle effect of the indoor layout on airborne transmission, a weight graph based on the space syntax technique was further introduced to constrain the transmission strength between subspaces that are less inter-visible. The proposed framework was demonstrated by a simulation study, in which external comparison and internal analysis were conducted to justify its validity and robustness in different scenarios. Our method is expected to promote the efficient identification of potential infectees and provide an extensible spatial-temporal model to simulate different control measures and examine their effectiveness in a built environment.
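
The abstract does not give the form of its integral exposure model, so the following is only a hypothetical sketch of how spatial and temporal decay might be combined over two trajectories. The exponential kernels, the parameters `space_scale` and `time_scale`, and the function name are all assumptions made for illustration; the space-syntax weight graph for indoor obstacles is not modelled.

```python
import math


def exposure_risk(person_track, carrier_track, dt=1.0,
                  space_scale=2.0, time_scale=10.0):
    """Discretized double sum over two (x, y) tracks sampled at the same times.

    Each carrier position emitted at step s contributes to the person's
    exposure at every later step t, damped by distance and by elapsed time.
    Both decay kernels are assumed exponentials, not the paper's model.
    """
    risk = 0.0
    for t, (px, py) in enumerate(person_track):
        last_emission = min(t, len(carrier_track) - 1)
        for s in range(last_emission + 1):
            cx, cy = carrier_track[s]
            distance = math.hypot(px - cx, py - cy)
            elapsed = (t - s) * dt
            spatial = math.exp(-distance / space_scale)   # assumed spatial decay
            temporal = math.exp(-elapsed / time_scale)    # assumed temporal decay
            risk += spatial * temporal * dt * dt
    return risk
```

Under these assumptions, a person who stays at the carrier's position accumulates more risk than one who stays 10 units away over the same period, which is the qualitative behaviour the abstract describes.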

6.
Transbound Emerg Dis ; 69(4): e845-e858, 2022 Jul.
Article in English | MEDLINE | ID: mdl-34695291

ABSTRACT

Bartonella species are facultative intracellular bacteria and are recognized worldwide as emerging zoonotic pathogens. Bartonella have been isolated or identified by polymerase chain reaction (PCR) in bats and their ectoparasites worldwide, whereas evidence on the association between them is scarce, especially in Asia. In this study, a retrospective analysis with frozen samples was carried out to identify the genetic diversity of Bartonella in bats and their ectoparasites and to investigate the relationships of Bartonella carried by bats and their ectoparasites. Bats and their ectoparasites (bat flies and bat mites) were collected from caves in Hubei Province, Central China, from May 2018 to July 2020. Bartonella were screened by PCR amplification and sequencing of three genes (gltA, rpoB, and ftsZ). Bats, bat flies, and bat mites carried diverse novel Bartonella genotypes with a high prevalence. The sharing of some Bartonella genotypes between bats and bat flies or bat mites indicated a potential role of bat flies and bat mites as vectors of bartonellae, while the higher genetic diversity of Bartonella in bat flies than in bats might be due to the vertical transmission of this bacterium in bat flies. Therefore, bat flies might also act as reservoirs of Bartonella. In addition, human-pathogenic B. mayotimonesis was identified in both bats and their ectoparasites, which expanded our knowledge of the geographic distribution of this bacterium and suggested a potential bat origin, with bat flies and bat mites playing important roles in the maintenance and transmission of Bartonella.


Subject(s)
Bartonella Infections , Bartonella , Chiroptera , Diptera , Animals , Bartonella/genetics , Bartonella Infections/epidemiology , Bartonella Infections/microbiology , Bartonella Infections/veterinary , Genotype , Humans , Phylogeny , Retrospective Studies
7.
IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2654-2671, 2022.
Article in English | MEDLINE | ID: mdl-34181547

ABSTRACT

Proposing a more effective and accurate method for detecting epistatic loci in large-scale genomic data is of important research significance for improving crop quality, disease treatment, etc. Owing to its high accuracy and ability to handle non-linear relationships, the Bayesian network (BN) has been widely used to construct networks of SNPs and phenotypic traits and thus to mine epistatic loci. However, BN learning easily falls into local optima and cannot process large numbers of SNPs. In this work, we transform the problem of learning a Bayesian network into an integer linear programming (ILP) optimization problem. We use branch-and-bound and cutting-plane algorithms to obtain the globally optimal Bayesian network (ILPBN), and thus the epistatic loci influencing specific phenotypic traits. To handle large numbers of SNP loci and further improve efficiency, we optimize the Markov blanket to reduce the number of candidate parent nodes for each node. In addition, we use α-BIC, which is suitable for epistasis mining, to calculate the BN score. We use four properties of decomposable BN scoring functions to further reduce the number of candidate parent sets for each node. Experimental results show that ILPBN can not only perform 2-locus and 3-locus epistasis mining, but also realize multi-locus epistasis detection. Finally, we compare ILPBN with several popular epistasis mining algorithms using simulated and real age-related macular disease (AMD) datasets. Experimental results show that ILPBN achieves better epistasis detection accuracy, F1-score and false positive rate than other methods while maintaining efficiency. Availability: Codes and dataset are available at: http://122.205.95.139/ILPBN/.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Algorithms , Bayes Theorem , Epistasis, Genetic/genetics , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide/genetics , Programming, Linear
8.
Bioelectrochemistry ; 143: 107986, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34735912

ABSTRACT

At present, carcinoembryonic antigen (CEA) is considered a broad-spectrum cancer biomarker, and its accurate analysis in clinical samples can assist early cancer diagnosis and treatment. Herein, a novel electrochemical aptasensor has been proposed for CEA detection based on exonuclease III and hybrid chain reaction. The target CEA binds specifically and strongly to the aptamer region in hairpin probe 1 (defined as H1), which leads the rest of H1 to trigger catalytic hairpin assembly, forming a large quantity of double-chain complexes of H1 and hairpin probe 2 (defined as H2), denoted H1@H2. Subsequently, exonuclease III digests the H1@H2 complex and liberates H1 to induce the first signal amplification. Simultaneously, a large number of generated trigger chains initiate a hybrid chain reaction and produce a second signal amplification. The proposed sensor exhibited excellent analytical performance for the detection of CEA, with a wide linear range from 10 pg·mL⁻¹ to 100 ng·mL⁻¹ and a low limit of detection of 0.84 pg·mL⁻¹. Additionally, the biosensing strategy was successfully verified for direct measurement of CEA in human serum. Therefore, this sensor provides a new, simple method for detecting CEA and shows great promise in the early screening of cancer.


Subject(s)
Carcinoembryonic Antigen
9.
Exp Eye Res ; 208: 108595, 2021 07.
Article in English | MEDLINE | ID: mdl-34000276

ABSTRACT

This study aimed to explore the effects of N-acetylserotonin (NAS) on the expression of interleukin-1β (IL-1β) in the retina of retinal ischemia-reperfusion injury (RIRI) rats via the toll-like receptor 4 (TLR4)/nuclear factor-kappa B (NF-κB)/nod-like receptor pyrin domain containing 3 (NLRP3) signaling pathway. In this study, adult male Sprague Dawley rats were randomly divided into the sham, RIRI, RIRI + NAS and RIRI + TAK-242 + NAS groups. The rats in the RIRI + NAS and RIRI + TAK-242 + NAS groups were intraperitoneally injected with NAS 30 min before and after modeling. TAK-242, a selective TLR4 inhibitor, was administered by intraperitoneal injection in the RIRI + TAK-242 + NAS group. The RIRI rat model was established by elevating the intraocular pressure to 110 mmHg for 60 min. The retinal structure and edema were assessed by H&E staining. The expression levels of TLR4, phosphorylated NF-κB (p-NF-κB), NLRP3, cleaved Caspase-1, and IL-1β in the retina of each group were detected using immunohistochemistry and Western blot. The correlations of the differences of TLR4+ and cleaved Caspase-1+ with IL-1β+ cells (between the NAS and the RIRI groups) were analyzed using linear regression in the RIRI + NAS group. Results showed that a thinner retina, more RGCs, and fewer TLR4+, p-NF-κB+, NLRP3+, cleaved Caspase-1+, and IL-1β+ cells in the retina were observed in the RIRI + NAS and RIRI + TAK-242 + NAS groups compared with the RIRI group 12 h after RIRI (all P < 0.01). Western blot analysis showed that the expression of IL-1β in the RIRI + NAS group began to increase 6 h after RIRI, reached a high level 12 h after RIRI, and then decreased. It was significantly lower at each time point in the RIRI + NAS group than in the RIRI group (all P < 0.01). Besides, the expression levels of TLR4, p-NF-κB, NLRP3, and cleaved Caspase-1 proteins in the RIRI + NAS and RIRI + TAK-242 + NAS groups decreased 12 h after RIRI compared with those in the RIRI group (all P < 0.01). The difference in IL-1β+ cells was significantly correlated with those of TLR4+ and cleaved Caspase-1+ cells in the RIRI + NAS group (r² = 0.9054 or 0.7431, P < 0.01). In conclusion, NAS could attenuate the expression of IL-1β by inhibiting the TLR4/NF-κB/NLRP3 signaling pathway, reduce retinal edema, and promote the survival of RGCs, thereby alleviating the retinal injury and exerting its neuroprotective effect.


Subject(s)
Interleukin-18/biosynthesis , NLR Family, Pyrin Domain-Containing 3 Protein/biosynthesis , Reperfusion Injury/metabolism , Retinal Diseases/metabolism , Serotonin/analogs & derivatives , Toll-Like Receptor 4/biosynthesis , Animals , Disease Models, Animal , Immunohistochemistry , Inflammasomes/metabolism , Male , Rats , Rats, Sprague-Dawley , Reperfusion Injury/drug therapy , Reperfusion Injury/pathology , Retinal Diseases/drug therapy , Retinal Diseases/pathology , Serotonin/pharmacology , Signal Transduction/drug effects
10.
PLoS Negl Trop Dis ; 15(3): e0009113, 2021 03.
Article in English | MEDLINE | ID: mdl-33735240

ABSTRACT

Bats can harbor zoonotic pathogens causing emerging infectious diseases, but knowledge of their status as hosts for bacteria is limited. We aimed to investigate the distribution, prevalence and genetic diversity of Borrelia in bats and bat ticks in Hubei Province, China, which will give us a better understanding of the risk of Borrelia infection posed by bats and their ticks. During 2018-2020, 403 bats were captured from caves in Hubei Province, China, and 2 bats were PCR-positive for Borrelia. Sequence analysis of the rrs, flaB and glpQ genes of positive samples showed 99.55%-100% similarity to Candidatus Borrelia fainii, a novel human-pathogenic relapsing fever Borrelia species recently reported in Zambia, Africa and Eastern China, which clustered together with relapsing fever Borrelia species traditionally reported only in the New World. Multilocus sequence typing (MLST) and pairwise genetic distances further confirmed the Borrelia species in the bats from Central China as Candidatus Borrelia fainii. No Borrelia DNA was detected in ticks collected from bats. The detection of this human-pathogenic relapsing fever Borrelia in bats suggests a wide distribution of this novel relapsing fever Borrelia species in China, which may pose a threat to public health in China.


Subject(s)
Borrelia/classification , Chiroptera/microbiology , Relapsing Fever/epidemiology , Ticks/microbiology , Animals , Borrelia/genetics , Borrelia/isolation & purification , China/epidemiology , DNA, Bacterial/genetics , Disease Vectors , Multilocus Sequence Typing , Phylogeny , Polymerase Chain Reaction
11.
IEEE/ACM Trans Comput Biol Bioinform ; 18(4): 1369-1383, 2021.
Article in English | MEDLINE | ID: mdl-31670676

ABSTRACT

Mining the interactions between SNPs (namely epistasis) efficiently and accurately is necessary for tackling the complexity of underlying biological mechanisms. To overcome the defects of low learning efficiency and convergence to local optima, this work proposes an epistasis mining method using an artificial fish swarm optimizing Bayesian network (AFSBN). This method exploits the global optimization, good robustness and fast convergence of the artificial fish swarm algorithm, and applies the algorithm in the heuristic search strategy of the Bayesian network. The initial network structure is evolved through foraging behavior, clustering behavior, tail-chasing behavior and random behavior. The algorithm chooses different behaviors to modify the network state according to changes in the surrounding environment and the states of partners. It realizes the interaction between each artificial fish and its neighboring environment, and finally finds the optimal network in the population. We compared AFSBN with other existing algorithms on both simulated and real datasets. The experimental results demonstrate that our method outperforms the others in epistasis detection accuracy across different datasets without substantially affecting efficiency.


Subject(s)
Algorithms , Bayes Theorem , Computational Biology/methods , Epistasis, Genetic/genetics , Polymorphism, Single Nucleotide/genetics , Cluster Analysis , Humans , Macular Degeneration/genetics , Models, Biological
12.
Sci China Life Sci ; 63(12): 1860-1878, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33051704

ABSTRACT

In recent years, deep learning has been widely used in diverse fields of research, such as speech recognition, image classification, autonomous driving and natural language processing. Deep learning has showcased dramatically improved performance in complex classification and regression problems, where the intricate structure in the high-dimensional data is difficult to discover using conventional machine learning algorithms. In biology, applications of deep learning are gaining increasing popularity in predicting the structure and function of genomic elements, such as promoters, enhancers, or gene expression levels. In this review paper, we described the basic concepts in machine learning and artificial neural network, followed by elaboration on the workflow of using convolutional neural network in genomics. Then we provided a concise introduction of deep learning applications in genomics and synthetic biology at the levels of DNA, RNA and protein. Finally, we discussed the current challenges and future perspectives of deep learning in genomics.


Subject(s)
Deep Learning , Genomics , Algorithms , Animals , DNA/chemistry , DNA/genetics , Humans , Machine Learning , Neural Networks, Computer , Proteins/chemistry , Proteins/metabolism , RNA/chemistry , RNA/genetics
13.
Article in English | MEDLINE | ID: mdl-32992905

ABSTRACT

Carbon labeling describes carbon dioxide emissions across food lifecycles, contributing to enhancing consumers' low-carbon awareness and promoting low-carbon consumption behaviors. In a departure from the existing literature on carbon labeling that heavily relies on interviews or questionnaire surveys, this study forms a hybrid of an auction experiment and a consumption experiment to observe university students' purchase intention and willingness to pay for a carbon-labeled food product. In this study, students from a university in a city (Chengdu) of China, the largest carbon emitter, are taken as the experimental group, and cow's milk is selected as the experimental food product. The main findings of this study are summarized as follows: (1) the purchase of carbon-labeled milk products is primarily influenced by price; (2) the willingness to pay for carbon-labeled milk products primarily depends on the premium; and (3) the students are willing to accept a maximum price premium of 3.2%. This study further offers suggestions to promote the formation of China's carbon product-labeling system and the marketization of carbon-labeled products and consequently facilitate low-carbon consumption in China.


Subject(s)
Consumer Behavior , Food Labeling , Students , Animals , Cattle , China , Female , Humans , Intention , Universities
14.
BMC Bioinformatics ; 21(1): 414, 2020 Sep 22.
Article in English | MEDLINE | ID: mdl-32962627

ABSTRACT

BACKGROUND: Gene selection refers to finding a small subset of discriminant genes from gene expression profiles. How to effectively select genes that affect specific phenotypic traits is an important research problem in biology. Neural networks fit nonlinear data well and can capture features automatically and flexibly. In this work, we propose an embedded gene selection method using a neural network. The important genes can be obtained by calculating the weight coefficients after training is completed. To address the black-box nature of neural networks and make the training results interpretable, we use the idea of knockoffs to construct knockoff feature genes for the original feature genes. This method not only makes the feature genes compete with each other, but also makes each feature gene compete with its knockoff feature gene. This approach can help to select the key genes that affect the decision-making of neural networks. RESULTS: We use maize carotenoids, tocopherol methyltransferase, raffinose family oligosaccharides and human breast cancer datasets for verification and analysis. CONCLUSIONS: The experimental results demonstrate that the knockoffs optimizing neural network method detects key genes better than other existing algorithms, especially for nonlinear gene expression and phenotype data.
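
The competition between each gene and its knockoff copy can be illustrated with the generic knockoff filter selection rule. This is a sketch under stated assumptions: the abstract does not say which importance statistic or threshold the authors use, so the difference of absolute importances and the "knockoff+" threshold below come from the general knockoff framework, and the function name and parameters are hypothetical.

```python
def knockoff_select(original_importance, knockoff_importance, fdr=0.2):
    """Return indices of features that beat their knockoffs at a target FDR.

    W_j = |importance of feature j| - |importance of its knockoff|.
    A large positive W_j means the real gene won the competition.
    """
    w = [abs(o) - abs(k)
         for o, k in zip(original_importance, knockoff_importance)]
    # Candidate thresholds are the distinct non-zero magnitudes of W.
    for t in sorted({abs(v) for v in w if v != 0}):
        # Knockoffs that "win" by at least t estimate the false discoveries.
        estimated_false = 1 + sum(1 for v in w if v <= -t)
        selected = sum(1 for v in w if v >= t)
        if estimated_false / max(1, selected) <= fdr:
            return [j for j, v in enumerate(w) if v >= t]
    return []  # no threshold meets the requested FDR level
```

In the neural-network setting described above, the two importance lists would presumably be derived from the trained weights of each gene and of its knockoff; how those weights are aggregated is not specified in the abstract.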


Subject(s)
Data Mining/methods , Neural Networks, Computer , Transcriptome , Breast Neoplasms/genetics , Computational Biology/methods , Female , Gene Expression Regulation , Humans , Zea mays/enzymology , Zea mays/genetics , Zea mays/metabolism
15.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32548639

ABSTRACT

MaizeCUBIC is a free database that describes genomic variations, gene expression, phenotypes and quantitative trait loci (QTLs) for a maize CUBIC population (24 founders and 1404 inbred offspring). The database not only includes information for over 14M single nucleotide polymorphisms (SNPs) and 43K indels previously identified but also contains 660K structural variations (SVs) and 600M novel sequences newly identified in the present study, which represents a comprehensive high-density variant map for a diverse population. Based on these genomic variations, the database displays the mosaic structure of each progeny, reflecting a high-resolution reshuffle across parental genomes. A total of 23 agronomic traits measured on parents and progeny in five locations, which are representative of the main maize-growing regions in China, were also included in the database. To further explore the genotype-phenotype relationships, two different methods of genome-wide association studies (GWAS) were employed for dissecting the genetic architecture of the 23 agronomic traits. Additionally, the Basic Local Alignment Search Tool and primer design tools were developed to promote follow-up analysis and experimental verification. All the original data and corresponding analytical results can be accessed through user-friendly online queries and web interface dynamic visualization, as well as downloadable files. These data and tools provide valuable resources for genetic and genomic studies of maize and other crops.


Subject(s)
Databases, Genetic , Genome, Plant/genetics , Zea mays/genetics , Genomics , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Software
16.
Plant Biotechnol J ; 18(11): 2345-2353, 2020 11.
Article in English | MEDLINE | ID: mdl-32367649

ABSTRACT

Rapeseed is the second most important oil crop species and is widely cultivated worldwide. However, overcoming the 'phenotyping bottleneck' has remained a significant challenge. A clear goal of high-throughput phenotyping is to bridge the gap between genomics and phenomics. In addition, it is important to explore the dynamic genetic architecture underlying rapeseed plant growth and its contribution to final yield. In this work, a high-throughput phenotyping facility was used to dynamically screen a rapeseed intervarietal substitution line population during two growing seasons. We developed an automatic image analysis pipeline to quantify 43 dynamic traits across multiple developmental stages, with 12 time points. The time-resolved i-traits could be extracted to reflect shoot growth and predict the final yield of rapeseed. Broad phenotypic variation and high heritability were observed for these i-traits across all developmental stages. A total of 337 and 599 QTLs were identified, with 33.5% and 36.1% consistent QTLs for each trait across all 12 time points in the two growing seasons, respectively. Moreover, the QTLs responsible for yield indicators colocalized with those of final yield, potentially providing a new mechanism of yield regulation. Our results indicate that high-throughput phenotyping can provide novel insights into the dynamic genetic architecture of rapeseed growth and final yield, which would be useful for future genetic improvements in rapeseed.


Subject(s)
Brassica napus , Brassica rapa , Brassica napus/genetics , Brassica rapa/genetics , Chromosome Mapping , Phenotype , Quantitative Trait Loci/genetics
17.
Article in English | MEDLINE | ID: mdl-30281476

ABSTRACT

How to mine gene regulatory relationships and construct gene regulatory networks (GRNs) is of utmost interest to the whole biological community, yet it has remained a challenging problem because of the tremendous complexity of cellular systems. In the present work, we construct gene regulatory networks using an improved three-phase dependency analysis (TPDA) Bayesian network learning method, which includes the steps of Drafting, Thickening, and Thinning. To address the unreliability of learning results caused by high-order conditional independence tests, we use an entropy estimation approach based on a Gaussian kernel probability density estimator to calculate the (conditional) mutual information between genes. Experiments on public benchmark data sets show that the improved method outperforms nine other Bayesian network learning methods when processing data with a large sample size, a small number of discrete values, and roughly equal frequencies of the different discrete values. In addition, the improved TPDA method was further applied to a large real RNA-seq gene expression data set from a global collection of 368 elite maize inbred lines. Experimental results show that it performs significantly better than the original TPDA method and the nine other Bayesian network learning algorithms.
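
Dependency-analysis methods of this kind rely on conditional independence tests, typically via conditional mutual information I(X;Y|Z): an edge between X and Y can be dropped when I(X;Y|Z) is close to zero for some conditioning set Z. The sketch below is a plain frequency-count estimate for discrete data, offered only as an illustration of the quantity. It is not the Gaussian kernel density estimator the abstract describes, and the function name is hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for three discrete sequences."""
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("need three non-empty sequences of equal length")
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    cmi = 0.0
    for (a, b, c), k in count_xyz.items():
        # p(a,b,c) * log2( p(c) p(a,b,c) / (p(a,c) p(b,c)) ), using raw counts.
        ratio = k * count_z[c] / (count_xz[(a, c)] * count_yz[(b, c)])
        cmi += (k / n) * math.log2(ratio)
    return cmi
```

With `x = y = [0, 0, 1, 1]`, conditioning on a constant `z` leaves the full 1 bit of dependence, while conditioning on `z = x` reduces it to 0, which is the pattern a thinning step would treat as an indirect relationship explained by Z.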


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , Machine Learning , Algorithms , Bayes Theorem , Data Mining , Zea mays/genetics
18.
BMC Bioinformatics ; 20(1): 444, 2019 Aug 28.
Article in English | MEDLINE | ID: mdl-31455207

ABSTRACT

BACKGROUND: Mining epistatic loci that affect specific phenotypic traits is an important research issue in biology. A Bayesian network (BN) is a graphical model that can express the relationship between genetic loci and phenotype. It has been widely used for epistasis mining in many studies. However, this method has two disadvantages: low learning efficiency and a tendency to fall into local optima. Genetic algorithms offer rapid global search and avoid falling into local optima. They are scalable and easy to integrate with other algorithms. This work proposes an epistasis mining approach based on a genetic tabu algorithm and Bayesian network (Epi-GTBN). It applies a genetic algorithm in the heuristic search strategy of the Bayesian network. The individual structure is evolved through the genetic operations of selection, crossover and mutation. This helps to find the optimal network structure and, in turn, to mine epistatic loci effectively. To enhance the diversity of the population and obtain a more effective global optimal solution, we apply the tabu search strategy in the crossover and mutation operations of the genetic algorithm. This helps to accelerate the convergence of the algorithm. RESULTS: We compared Epi-GTBN with other recent algorithms using both simulated and real datasets. The experimental results demonstrate that our method has much better epistasis detection accuracy across different datasets without affecting efficiency. CONCLUSIONS: The presented methodology (Epi-GTBN) is an effective method for epistasis detection, and it can be seen as an interesting addition to the arsenal used in complex traits analyses.


Subject(s)
Algorithms , Data Mining , Epistasis, Genetic , Bayes Theorem , Gene Regulatory Networks , Genetic Loci , Humans , Macular Degeneration/genetics , Models, Genetic , Polymorphism, Single Nucleotide/genetics
19.
BMC Genomics ; 20(1): 443, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31159731

ABSTRACT

BACKGROUND: Trait ontology (TO) analysis is a powerful system for functional annotation and enrichment analysis of genes. However, given the complexity of the molecular mechanisms underlying phenomes, only a few hundred gene-to-TO relationships in plants have been elucidated to date, limiting the pace of research in this "big data" era. RESULTS: Here, we curated all the available trait-associated sites (TAS) information from 79 association mapping studies of maize (Zea mays L.) and rice (Oryza sativa L.) lines with diverse genetic backgrounds and built a large-scale TAS-derived TO system for functional annotation of genes in various crops. Our TO system contains information, including gene-to-TO relationships, for up to 18,042 genes (6,345 in maize at the 25 k level and 11,697 in rice at the 50 k level), covering over one fifth of the annotated gene sets for maize and rice. A comparison of Gene Ontology (GO) vs. TO analysis demonstrated that the TAS-derived TO system is an efficient alternative tool for gene functional annotation and enrichment analysis. We therefore combined information from the TO, GO, metabolic pathway, and co-expression network databases and constructed the TAS system, which is publicly available at http://tas.hzau.edu.cn. TAS provides a user-friendly interface for functional annotation of genes, enrichment analysis, genome-wide extraction of trait-associated genes, and crosschecking of different functional annotation databases. CONCLUSIONS: TAS bridges the gap between genomic and phenomic information in crops. This easy-to-use tool will be useful for geneticists, biologists, and breeders in the agricultural community, as it facilitates the dissection of molecular mechanisms conferring agronomic traits in an easy, genome-wide manner.
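Enrichment analysis of a gene list against an ontology term is commonly done with a hypergeometric (over-representation) test. The abstract does not state which test TAS uses, so the following is a generic sketch under that assumption; the function name `hypergeom_enrichment_pvalue` and its parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_background, n_annotated, n_query, n_overlap):
    """Upper-tail hypergeometric p-value for term enrichment.

    n_background: genes in the annotated background set
    n_annotated:  background genes carrying the term
    n_query:      genes in the user's list
    n_overlap:    query genes carrying the term
    Returns P(X >= n_overlap) when n_query genes are drawn without replacement.
    """
    max_overlap = min(n_annotated, n_query)
    total = comb(n_background, n_query)
    tail = sum(comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
               for k in range(n_overlap, max_overlap + 1))
    return tail / total
```

A small p-value suggests the term is over-represented in the query list; in practice the p-values for all tested terms would also be corrected for multiple testing.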


Subject(s)
Genome, Plant , Genomics/methods , Oryza/genetics , Plant Proteins/genetics , Zea mays/genetics , Crops, Agricultural/genetics , Genome-Wide Association Study , Oryza/physiology , Phenotype , Quantitative Trait Loci , Zea mays/physiology
20.
J Exp Bot ; 70(2): 545-561, 2019 Jan 07.
Article in English | MEDLINE | ID: mdl-30380099

ABSTRACT

Manual phenotyping of rice tillers is time consuming and labor intensive, and lags behind the rapid development of rice functional genomics. Thus, automated, non-destructive methods of phenotyping rice tiller traits at a high spatial resolution and high throughput for large-scale assessment of rice accessions are urgently needed. In this study, we developed a high-throughput micro-CT-RGB imaging system to non-destructively extract 739 traits from 234 rice accessions at nine time points. We could explain 30% of the grain yield variance from two tiller traits assessed in the early growth stages. A total of 402 significantly associated loci were identified by genome-wide association study, and dynamic and static genetic components were found across the nine time points. A major locus associated with tiller angle was detected at time point 9, which contained a major gene, TAC1. Significant variants associated with tiller angle were enriched in the 3'-untranslated region of TAC1. Three haplotypes for the gene were found, and rice accessions containing haplotype H3 displayed much smaller tiller angles. Further, we found two loci containing associations with both vigor-related traits identified by high-throughput micro-CT-RGB imaging and yield. The superior alleles would be beneficial for breeding for high yield and dense planting.


Subject(s)
Oryza/growth & development , Oryza/genetics , Biomass , Droughts , Edible Grain/growth & development , Genome, Plant , Genome-Wide Association Study , X-Ray Microtomography