1.
medRxiv ; 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38076942

ABSTRACT

Background: Large-scale genomics projects have identified driver alterations for most childhood cancers that provide reliable biomarkers for clinical diagnosis and disease monitoring using targeted sequencing. However, there is a lack of a comprehensive panel that matches the list of known driver genes. Here we fill this gap by developing SJPedPanel for childhood cancers. Results: SJPedPanel covers 5,275 coding exons of 357 driver genes, 297 introns frequently involved in rearrangements that generate fusion oncoproteins, commonly amplified/deleted regions (e.g., MYCN for neuroblastoma, CDKN2A and PAX5 for B-/T-ALL, and SMARCB1 for AT/RT), and 7,590 polymorphism sites for interrogating tumors with aneuploidy, such as hyperdiploid and hypodiploid B-ALL or 17q gain neuroblastoma. We used driver alterations reported from an established real-time clinical genomics cohort (n=253) to validate this gene panel. Among the 485 pathogenic variants reported, our panel covered 417 variants (86%). For 90 rearrangements responsible for oncogenic fusions, our panel covered 74 events (82%). We re-sequenced 113 previously characterized clinical specimens at an average depth of 2,500X using SJPedPanel and recovered 354 (91%) of the 389 reported pathogenic variants. We then investigated the power of this panel in detecting mutations from specimens with low tumor purity (as low as 0.1%) using cell line-based dilution experiments and discovered that this gene panel enabled us to detect ∼80% of variants with an allele fraction of 0.2%, while the detection rate decreases to ∼50% when the allele fraction is 0.1%. We finally demonstrate its utility in disease monitoring on clinical specimens collected from AML patients in morphologic remission. Conclusions: SJPedPanel enables the detection of clinically relevant genetic alterations, including rearrangements responsible for subtype-defining fusions for childhood cancers, by targeted sequencing of ∼0.15% of the human genome. It will enhance the analysis of specimens with low tumor burdens for cancer monitoring and early detection.
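
The coverage figures quoted in the abstract above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the dictionary labels and function names are made up here, and the expected-read calculation assumes, purely for simplicity, a uniform depth equal to the reported 2,500X average.

```python
# Counts copied from the abstract; labels and function names are illustrative
# and do not come from the authors' code.
reported = {
    "pathogenic variants covered by the panel design": (417, 485),
    "fusion-causing rearrangements covered": (74, 90),
    "variants recovered on re-sequencing": (354, 389),
}


def percent(numerator, denominator):
    """Return a fraction as a whole-number percentage."""
    return round(100 * numerator / denominator)


coverage = {label: percent(n, d) for label, (n, d) in reported.items()}
for label, pct in coverage.items():
    print(f"{label}: {pct}%")

# Assumption: every site is sequenced at exactly the reported average depth.
MEAN_DEPTH = 2500


def expected_alt_reads(allele_fraction, depth=MEAN_DEPTH):
    """Mean number of reads expected to carry the variant allele."""
    return depth * allele_fraction


for fraction in (0.002, 0.001):
    print(f"allele fraction {fraction:.1%}: ~{expected_alt_reads(fraction)} reads")
```

Under that simplifying assumption, an allele fraction of 0.2% corresponds to about five variant-supporting reads and 0.1% to two or three, which gives some intuition for why the detection rate falls between those two dilutions.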

2.
Methods Mol Biol ; 2698: 361-379, 2023.
Article in English | MEDLINE | ID: mdl-37682485

ABSTRACT

Leveraging existing resources in studied species to predict gene functions has the potential to rapidly expand understanding of annotated genes in other, less well-studied, species with assembled genomes. However, orthology is not a reliable predictor for the transcriptional responses of genes to stress. Machine learning methods can quantitatively estimate expression patterns and gene functions using known annotations and collections of features describing each gene. In this chapter, we describe a supervised machine learning framework to predict stress-responsive genes across species using only features derived from nucleotide sequences, using the example of cold stress-responsive genes in different Panicoid grass species.


Subject(s)
Machine Learning , Supervised Machine Learning , Cold-Shock Response , Poaceae/genetics
3.
Plant Physiol ; 193(4): 2622-2639, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37587696

ABSTRACT

Common purslane (Portulaca oleracea) integrates both C4 and crassulacean acid metabolism (CAM) photosynthesis pathways and is a promising model plant to explore C4-CAM plasticity. Here, we report a high-quality chromosome-level genome of nicotinamide adenine dinucleotide (NAD)-malic enzyme (ME) subtype common purslane that provides evidence for 2 rounds of whole-genome duplication (WGD) with an ancient WGD (P-β) in the common ancestor to Portulacaceae and Cactaceae around 66.30 million years ago (Mya) and another (Po-α) specific to common purslane lineage around 7.74 Mya. A larger number of gene copies encoding key enzymes/transporters involved in C4 and CAM pathways were detected in common purslane than in related species. Phylogeny, conserved functional site, and collinearity analyses revealed that the Po-α WGD produced the phosphoenolpyruvate carboxylase-encoded gene copies used for photosynthesis in common purslane, while the P-β WGD event produced 2 ancestral genes of functionally differentiated (C4- and CAM-specific) beta carbonic anhydrases involved in the C4 + CAM pathways. Additionally, cis-element enrichment analysis in the promoters showed that CAM-specific genes have recruited both evening and midnight circadian elements as well as the abscisic acid (ABA)-independent regulatory module mediated by ethylene-response factor cis-elements. Overall, this study provides insights into the origin and evolutionary process of C4 and CAM pathways in common purslane, as well as potential targets for engineering crops by integrating C4 or CAM metabolism.


Subject(s)
Portulaca , Portulaca/genetics , Portulaca/metabolism , Gene Duplication , Crassulacean Acid Metabolism , Biological Evolution , Phylogeny , Photosynthesis/genetics
4.
J Exp Bot ; 74(17): 5405-5417, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37357909

ABSTRACT

Severe cold, defined as a damaging cold beyond acclimation temperatures, has unique responses, but the signaling and evolution of these responses are not well understood. Production of oligogalactolipids, which is triggered by cytosolic acidification in Arabidopsis (Arabidopsis thaliana), contributes to survival in severe cold. Here, we investigated oligogalactolipid production in species from bryophytes to angiosperms. Production of oligogalactolipids differed within each clade, suggesting multiple evolutionary origins of severe cold tolerance. We also observed greater oligogalactolipid production in control samples than in temperature-challenged samples of some species. Further examination of representative species revealed a tight association between temperature, damage, and oligogalactolipid production that scaled with the cold tolerance of each species. Based on oligogalactolipid production and transcript changes, multiple angiosperm species share a signal of oligogalactolipid production initially described in Arabidopsis, namely cytosolic acidification. Together, these data suggest that oligogalactolipid production is a severe cold response that originated from an ancestral damage response that remains in many land plant lineages and that cytosolic acidification may be a common signaling mechanism for its activation.


Subject(s)
Arabidopsis Proteins , Arabidopsis , Magnoliopsida , Arabidopsis/metabolism , Cold Temperature , Arabidopsis Proteins/metabolism , Temperature , Magnoliopsida/metabolism , Acclimatization/physiology , Gene Expression Regulation, Plant
5.
Plant Direct ; 7(4): e489, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37124872

ABSTRACT

The Heat Shock Factor (HSF) transcription factor family is a central and required component of plant heat stress responses and acquired thermotolerance. The HSF family has dramatically expanded in plant lineages, often including a repertoire of 20 or more genes. Here we assess and compare the composition, heat responsiveness, and chromatin profiles of the HSF families in maize and Setaria viridis (Setaria), two model C4 panicoid grasses. Both species encode a similar number of HSFs, and examples of both conserved and variable expression responses to a heat stress event were observed between the two species. Chromatin accessibility and genome-wide DNA-binding profiles were generated to assess the chromatin of HSF family members with distinct responses to heat stress. We observed significant variability for both chromatin accessibility and promoter occupancy within similarly regulated sets of HSFs between Setaria and maize, as well as between syntenic pairs of maize HSFs retained following its most recent genome duplication event. Additionally, we observed the widespread presence of TF binding at HSF promoters in control conditions, even at HSFs that are only expressed in response to heat stress. TF-binding peaks were typically near putative HSF-binding sites in HSFs upregulated in response to heat stress, but not in stable or not expressed HSFs. These observations collectively support a complex scenario of expansion and subfunctionalization within this transcription factor family and suggest that within-family HSF transcriptional regulation is a conserved, defining feature of the family.

6.
ACS Omega ; 8(18): 15951-15959, 2023 May 09.
Article in English | MEDLINE | ID: mdl-37179632

ABSTRACT

In this study, a sintering test of high-alumina limonite from Indonesia, matched with an appropriate magnetite concentration, is performed. The sintering yield and quality index are effectively improved by optimizing the ore matching and regulating the basicity. For the optimal coke dosage of 5.8% and basicity of 1.8, the tumbling index of the ore blend is found to be 61.5% and the productivity is 1.2 t/(h·m²). The main liquid phase in the sinter is the silico-ferrite of calcium and aluminum (SFCA), followed by a mutual solution, both of which maintain the sintering strength. However, when the basicity is increased from 1.8 to 2.0, the production of SFCA is found to increase gradually, whereas the mutual solution content decreases dramatically. A metallurgical performance test of the optimal sinter sample demonstrates that the sinter can meet the requirements of small- and medium-sized blast furnace smelting, even for high-alumina limonite ratios of 60.0-65.0%, thereby greatly reducing the sintering production costs. The results of this study are expected to provide theoretical guidance for the practical high-proportion sintering of high-alumina limonite.

7.
Genome Biol ; 23(1): 234, 2022 11 07.
Article in English | MEDLINE | ID: mdl-36345007

ABSTRACT

BACKGROUND: Many plant species exhibit genetic variation for coping with environmental stress. However, there are still limited approaches to effectively uncover the genomic region that regulates distinct responsive patterns of the gene across multiple varieties within the same species under abiotic stress. RESULTS: By analyzing the transcriptomes of more than 100 maize inbreds, we reveal many cis- and trans-acting eQTLs that influence the expression response to heat stress. The cis-acting eQTLs in response to heat stress are identified in genes with differential responses to heat stress between genotypes as well as genes that are only expressed under heat stress. The cis-acting variants for heat stress-responsive expression likely result from distinct promoter activities, and the differential heat responses of the alleles are confirmed for selected genes using transient expression assays. Global footprinting of transcription factor binding is performed in control and heat stress conditions to document regions with heat-enriched transcription factor binding occupancies. CONCLUSIONS: Footprints enriched near proximal regions of characterized heat-responsive genes in a large association panel can be utilized for prioritizing functional genomic regions that regulate genotype-specific responses under heat stress.


Subject(s)
Gene Expression Regulation, Plant , Zea mays , Zea mays/genetics , Heat-Shock Response/genetics , Stress, Physiological/genetics , Genomics , Transcription Factors/genetics
8.
ACS Omega ; 7(37): 33167-33185, 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36157731

ABSTRACT

Understanding how the porosity, permeability, and other physical properties of shale reservoirs vary under different stress conditions plays an important role in guiding shale gas production. With the shale of the Wufeng-Longmaxi Formation in the south of the Sichuan Basin as the research object, stress-dependent porosity and permeability tests, high-pressure mercury injection, and scanning electron microscope tests were performed in this study to thoroughly analyze the variation in physical properties of different shale lithofacies with effective stress. In addition, the stress sensitivity of different lithofacies reservoirs was evaluated by using parameters such as the pore compressibility coefficient (PCC) and porosity sensitivity exponent (PSE), while an optimized support vector machine (SVM) algorithm was adopted to predict the coefficient of reservoir porosity sensitivity. According to the research results, the porosity and permeability of shale reservoirs decline as a negative exponential function. When the effective stress falls below 15 MPa, the damage rate of permeability/porosity increases rapidly with the rise of effective stress. By contrast, the permeability curvature of the shale reservoirs plunges with the rise of effective stress. It was discovered that a higher siliceous content results in a higher permeability curvature of shale, indicating the greater stress sensitivity of the reservoir. The ratio of matrix porosity to microfracture porosity determines the PSE, which is relatively low, and low aspect ratio pores contribute to high porosity compressibility and stress sensitivity. Young's modulus shows a negative correlation with pore compressibility and a positive correlation with Poisson's ratio. Shales with high clay mineral content have a large number of low aspect ratio pores and a low elastic modulus, which leads to both high PCC and low PSE. Based on principal component analysis, a multiclassification SVM model was established to predict the PSE, revealing that the accuracy of the sigmoid, radial basis function (RBF), and linear kernel functions is consistently above 70%. According to error analysis, the accuracy can exceed 80% with the RBF kernel function and an appropriate penalty factor. The research results serve to advance the research on the parameters related to overburden pressure, porosity, and permeability. Moreover, the optimized SVM algorithm is applied to make a classification prediction, which provides a reference for shale reservoir exploration and development both in theory and practice.
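
The abstract above states that porosity declines with effective stress as a negative exponential function. One common way to write such a relation is φ(σ) = φ0 · exp(−c·σ), which becomes a straight line after taking logarithms. The sketch below fits that form by ordinary least squares; the function name, the decay constant `c`, and the data are all hypothetical (the values are synthetic, not measurements from the study), and `c` is not claimed to be the authors' PCC or PSE.

```python
import math


def fit_negative_exponential(stress_mpa, porosity):
    """Least-squares fit of ln(phi) = ln(phi0) - c * sigma.

    Returns (phi0, c), where phi0 is the zero-stress porosity and c is the
    decay constant per MPa.
    """
    n = len(stress_mpa)
    logs = [math.log(p) for p in porosity]
    mean_x = sum(stress_mpa) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in stress_mpa)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stress_mpa, logs))
    slope = sxy / sxx
    phi0 = math.exp(mean_y - slope * mean_x)
    return phi0, -slope


# Synthetic, noise-free data generated from assumed parameters.
stress = [5, 10, 15, 20, 30, 40]  # effective stress, MPa
true_phi0, true_c = 0.06, 0.012
porosity = [true_phi0 * math.exp(-true_c * s) for s in stress]

phi0, c = fit_negative_exponential(stress, porosity)
print(f"phi0 = {phi0:.4f}, c = {c:.4f} per MPa")
```

With real core measurements the fit would not be exact, and the residuals would indicate how well a single exponential describes a given lithofacies.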

9.
Med Phys ; 49(8): 5451-5463, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35543109

ABSTRACT

PURPOSE: Compared to the pencil-beam algorithm, the Monte Carlo (MC) algorithm is more accurate for dose calculation but time-consuming in proton therapy. To solve this problem, this study uses deep learning to provide fast 3D dose prediction for prostate cancer patients treated with intensity-modulated proton therapy (IMPT). METHODS: A novel recurrent U-net (RU-net) architecture was trained to predict the 3D dose distribution. Doses, CT images, and beam spot information from IMPT plans were used to train the RU-net with a five-fold cross-validation. However, predicting the complicated dose properties of the IMPT plan is difficult for neural networks. Instead of the peak-monitor unit (MU) model, this work develops the multi-MU model, which adopted more comprehensive inputs and was trained with a combinational loss function. The dose difference between the predicted dose and the MC dose was evaluated with gamma analysis, dice similarity coefficient (DSC), and dose-volume histogram (DVH) metrics. MC dropout was also added to the network to quantify the uncertainty of the model. RESULTS: Compared to the peak-MU model, the multi-MU model led to smaller mean absolute errors (3.03% vs. 2.05%, p = 0.005), higher gamma-passing rate (2 mm, 3%: 97.42% vs. 93.69%, p = 0.005), higher dice similarity coefficient, and smaller relative DVH metrics error (clinical target volume (CTV) D98%: 3.03% vs. 6.08%, p = 0.017; in Bladder V30: 3.08% vs. 5.28%, p = 0.028; and in Bladder V20: 3.02% vs. 4.42%, p = 0.017). Considering more prior knowledge, the multi-MU model had better prediction accuracy with a prediction time of less than half a second for each fold. The mean uncertainty value of the multi-MU model is 0.46%, with a dropout rate of 10%. CONCLUSION: This method was a nearly real-time IMPT dose prediction algorithm with accuracy comparable to the pencil-beam (PB) analytical algorithms used in prostate cancer. This RU-net might be used in plan robustness optimization and robustness evaluation in the future.
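
Of the metrics named in the abstract above, the dice similarity coefficient is the simplest to illustrate: DSC = 2|A ∩ B| / (|A| + |B|) for two binary volumes A and B. The sketch below assumes, as is common for dose comparison but not confirmed by the abstract, that each volume is the set of voxels receiving at least a chosen dose threshold. The function names and the six-voxel dose arrays are invented for illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A and B| / (|A| + |B|) for two equal-length boolean masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(map(bool, mask_a)) + sum(map(bool, mask_b))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * overlap / total


def isodose_mask(dose, threshold):
    """Mark voxels whose dose is at or above the threshold."""
    return [d >= threshold for d in dose]


# Hypothetical flattened dose grids (arbitrary units), not data from the study.
predicted = [0.2, 1.1, 1.9, 2.0, 1.4, 0.3]
reference = [0.1, 0.9, 2.0, 2.0, 1.6, 0.4]

dsc = dice_coefficient(isodose_mask(predicted, 1.0), isodose_mask(reference, 1.0))
print(f"DSC at the 1.0 isodose level: {dsc:.3f}")
```

A value of 1 means the two isodose volumes coincide exactly; values fall toward 0 as the predicted and reference volumes diverge.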


Subject(s)
Prostatic Neoplasms , Proton Therapy , Radiotherapy, Intensity-Modulated , Feasibility Studies , Humans , Male , Neural Networks, Computer , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/radiotherapy , Proton Therapy/methods , Radiotherapy Dosage , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy, Intensity-Modulated/methods
10.
Plant J ; 111(1): 103-116, 2022 07.
Article in English | MEDLINE | ID: mdl-35436373

ABSTRACT

The DOMAINS REARRANGED METHYLTRANSFERASEs (DRMs) are crucial for RNA-directed DNA methylation (RdDM) in plant species. Setaria viridis is a model monocot species with a relatively compact genome that has limited transposable element (TE) content. CRISPR-based genome editing approaches were used to create loss-of-function alleles for the two putative functional DRM genes in S. viridis to probe the role of RdDM. Double mutant (drm1ab) plants exhibit some morphological abnormalities but are fully viable. Whole-genome methylation profiling provided evidence for the widespread loss of methylation in CHH sequence contexts, particularly in regions with high CHH methylation in wild-type plants. Evidence was also found for the locus-specific loss of CG and CHG methylation, even in some regions that lack CHH methylation. Transcriptome profiling identified genes with altered expression in the drm1ab mutants. However, the majority of genes with high levels of CHH methylation directly surrounding the transcription start site or in nearby promoter regions in wild-type plants do not have altered expression in the drm1ab mutant, even when this methylation is lost, suggesting limited regulation of gene expression by RdDM. Detailed analysis of the expression of TEs identified several transposons that are transcriptionally activated in drm1ab mutants. These transposons are likely to require active RdDM for the maintenance of transcriptional repression.


Subject(s)
Setaria Plant , DNA Methylation/genetics , Gene Expression Regulation, Plant/genetics , Methyltransferases/genetics , Setaria Plant/genetics , Transcriptome
11.
Plant Cell ; 34(1): 514-534, 2022 01 20.
Article in English | MEDLINE | ID: mdl-34735005

ABSTRACT

Changes in gene expression are important for responses to abiotic stress. Transcriptome profiling of heat- or cold-stressed maize genotypes identifies many changes in transcript abundance. We used comparisons of expression responses in multiple genotypes to identify alleles with variable responses to heat or cold stress and to distinguish examples of cis- or trans-regulatory variation for stress-responsive expression changes. We used motifs enriched near the transcription start sites (TSSs) for thermal stress-responsive genes to develop predictive models of gene expression responses. Prediction accuracies can be improved by focusing only on motifs within unmethylated regions near the TSS and vary for genes with different dynamic responses to stress. Models trained on expression responses in a single genotype and promoter sequences provided lower performance when applied to other genotypes but this could be improved by using models trained on data from all three genotypes tested. The analysis of genes with cis-regulatory variation provides evidence for structural variants that result in presence/absence of transcription factor binding sites in creating variable responses. This study provides insights into cis-regulatory motifs for heat- and cold-responsive gene expression and defines a framework for developing models to predict expression responses across multiple genotypes.


Subject(s)
Cold-Shock Response/genetics , Gene Expression Regulation, Plant/physiology , Genes, Plant , Heat-Shock Response/genetics , Transcriptome , Zea mays/physiology , Gene Expression Profiling , Zea mays/genetics
12.
G3 (Bethesda) ; 11(8)2021 08 07.
Article in English | MEDLINE | ID: mdl-34849810

ABSTRACT

Accessible chromatin and unmethylated DNA are associated with many genes and cis-regulatory elements. Attempts to understand natural variation for accessible chromatin regions (ACRs) and unmethylated regions (UMRs) often rely upon alignments to a single reference genome. This limits the ability to assess regions that are absent in the reference genome assembly and monitor how nearby structural variants influence variation in chromatin state. In this study, de novo genome assemblies for four maize inbreds (B73, Mo17, Oh43, and W22) are utilized to assess chromatin accessibility and DNA methylation patterns in a pan-genome context. A more complete set of UMRs and ACRs can be identified when chromatin data are aligned to the matched genome rather than a single reference genome. While there are UMRs and ACRs present within genomic regions that are not shared between genotypes, these features are 6- to 12-fold enriched within regions shared between genomes. Characterization of UMRs present within shared genomic regions reveals that most UMRs maintain the unmethylated state in other genotypes, with only ∼5% being polymorphic between genotypes. However, the majority (71%) of UMRs that are shared between genotypes only exhibit partial overlaps, suggesting that the boundaries between methylated and unmethylated DNA are dynamic. This instability is not solely due to sequence variation, as these partially overlapping UMRs are frequently found within genomic regions that lack sequence variation. The ability to compare chromatin properties among individuals with structural variation enables pan-epigenome analyses to study the sources of variation for accessible chromatin and unmethylated DNA.


Subject(s)
DNA Methylation , Zea mays , Chromatin/genetics , Gene Expression Regulation, Plant , Genome, Plant , Humans , Zea mays/genetics
13.
G3 (Bethesda) ; 11(10)2021 09 27.
Article in English | MEDLINE | ID: mdl-34568911

ABSTRACT

Intact transposable elements (TEs) account for 65% of the maize genome and can impact gene function and regulation. Although TEs comprise the majority of the maize genome and affect important phenotypes, genome-wide patterns of TE polymorphisms in maize have only been studied in a handful of maize genotypes, due to the challenging nature of assessing highly repetitive sequences. We implemented a method to use short-read sequencing data from 509 diverse inbred lines to classify the presence/absence of 445,418 nonredundant TEs that were previously annotated in four genome assemblies including B73, Mo17, PH207, and W22. Different orders of TEs (i.e., LTRs, Helitrons, and TIRs) had different frequency distributions within the population. LTRs with lower LTR similarity were generally more frequent in the population than LTRs with higher LTR similarity, though high-frequency insertions with very high LTR similarity were observed. LTR similarity and frequency estimates of nested elements and the outer elements in which they insert revealed that most nesting events occurred very near the timing of the outer element insertion. TEs within genes were at higher frequency than those that were outside of genes, and this is particularly true for those not inserted into introns. Many TE insertional polymorphisms observed in this population were tagged by SNP markers. However, 19.9% of the TE polymorphisms were not well tagged by SNPs (R² < 0.5) and potentially represent information that has not been well captured in previous SNP-based marker-trait association studies. This study provides a population-scale genome-wide assessment of TE variation in maize and provides valuable insight on variation in TEs in maize and factors that contribute to this variation.
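
The "not well tagged by SNPs (R² < 0.5)" criterion in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: TE presence/absence and SNP genotypes are both coded 0/1 (plausible for inbred lines, but an assumption here), R² is taken as the squared Pearson correlation, and the function names, SNP identifiers, and eight-line toy data are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker cannot tag anything
    return sxy ** 2 / (sxx * syy)


def best_tag(te_calls, snps):
    """Return (snp_id, r2) for the SNP most correlated with the TE calls."""
    scored = ((snp_id, r_squared(te_calls, geno)) for snp_id, geno in snps.items())
    return max(scored, key=lambda pair: pair[1])


# Toy data: 1 = TE present / alternate allele, 0 = TE absent / reference allele.
te_calls = [1, 1, 0, 0, 1, 0, 1, 0]
snps = {
    "snp_a": [1, 1, 0, 0, 1, 0, 1, 0],  # perfectly tracks the TE
    "snp_b": [1, 0, 0, 1, 1, 0, 0, 0],  # only weakly correlated
}

snp_id, r2 = best_tag(te_calls, snps)
print(f"best tag: {snp_id} (R2 = {r2:.2f}); well tagged: {r2 >= 0.5}")
```

A TE whose best nearby SNP stays below 0.5 would fall into the poorly tagged class described in the abstract.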


Subject(s)
DNA Transposable Elements , Zea mays , DNA Transposable Elements/genetics , Genotype , Introns , Terminal Repeat Sequences , Zea mays/genetics
14.
Proc Natl Acad Sci U S A ; 118(10)2021 03 09.
Article in English | MEDLINE | ID: mdl-33658387

ABSTRACT

Although genome-sequence assemblies are available for a growing number of plant species, gene-expression responses to stimuli have been cataloged for only a subset of these species. Many genes show altered transcription patterns in response to abiotic stresses. However, orthologous genes in related species often exhibit different responses to a given stress. Accordingly, data on the regulation of gene expression in one species are not reliable predictors of orthologous gene responses in a related species. Here, we trained a supervised classification model to identify genes that transcriptionally respond to cold stress. A model trained with only features calculated directly from genome assemblies exhibited only modest decreases in performance relative to models trained by using genomic, chromatin, and evolution/diversity features. Models trained with data from one species successfully predicted which genes would respond to cold stress in other related species. Cross-species predictions remained accurate when training was performed in cold-sensitive species and predictions were performed in cold-tolerant species and vice versa. Models trained with data on gene expression in multiple species provided at least equivalent performance to models trained and tested in a single species and outperformed single-species models in cross-species prediction. These results suggest that classifiers trained on stress data from well-studied species may suffice for predicting gene-expression patterns in related, less-studied species with sequenced genomes.


Subject(s)
Cold-Shock Response , Gene Expression Profiling , Gene Expression Regulation, Plant , Models, Genetic , Poaceae , Transcription, Genetic , Poaceae/genetics , Poaceae/metabolism , Species Specificity
15.
Plant Physiol ; 186(1): 420-433, 2021 05 27.
Article in English | MEDLINE | ID: mdl-33591319

ABSTRACT

Transposable elements (TEs) pervade most eukaryotic genomes. The repetitive nature of TEs complicates the analysis of their expression. Evaluation of the expression of both TE families (using unique and multi-mapping reads) and specific elements (using uniquely mapping reads) in leaf tissue of three maize (Zea mays) inbred lines subjected to heat or cold stress reveals no evidence for genome-wide activation of TEs; however, some specific TE families generate transcripts only in stress conditions. There is substantial variation for which TE families exhibit stress-responsive expression in the different genotypes. In order to understand the factors that drive expression of TEs, we focused on a subset of families in which we could monitor expression of individual elements. The stress-responsive activation of a TE family can often be attributed to a small number of elements in the family that contain regions lacking DNA methylation. Comparisons of the expression of TEs in different genotypes revealed both genetic and epigenetic variation. Many of the specific TEs that are activated in stress in one inbred are not present in the other inbred, explaining the lack of activation. Among the elements that are shared in both genomes but only expressed in one genotype, we found that many exhibit differences in DNA methylation such that the genotype without expression is fully methylated. This study provides insights into the regulation of expression of TEs in normal and stress conditions and highlights the role of chromatin variation between elements in a family or between genotypes for contributing to expression variation.

The highly repetitive nature of many TEs complicates the analysis of their expression. Although most TEs are not expressed, some exhibit expression in certain tissues or conditions. We monitored the expression of both TE families (using unique and multi-mapping reads) and specific elements (using uniquely mapping reads) in leaf tissue of three maize (Zea mays) inbred lines subjected to heat or cold stress. While genome-wide activation of TEs did not occur, some TE families generated transcripts only in stress conditions with variation by genotype. To better understand the factors that drive expression of TEs, we focused on a subset of families in which we could monitor expression of individual elements. In most cases, stress-responsive activation of a TE family was attributed to a small number of elements in the family. The elements that contained small regions lacking DNA methylation showed enriched expression, while fully methylated elements were rarely expressed in control or stress conditions. The varied expression in the different genotypes was due to both genetic and epigenetic variation. Many specific TEs activated by stress in one inbred were not present in the other inbred. Among the elements shared in both genomes, full methylation inhibited expression in one of the genotypes. This study provides insights into the regulation of TE expression in normal and stress conditions and highlights the role of chromatin variation between elements in a family or between genotypes for contributing to expression.


Subject(s)
DNA Transposable Elements/genetics , Epigenesis, Genetic , Gene Expression , Genetic Variation , Stress, Physiological/genetics , Zea mays/physiology , Zea mays/genetics
16.
Plant Phenomics ; 2020: 7481687, 2020.
Article in English | MEDLINE | ID: mdl-33313562

ABSTRACT

High-throughput phenotyping systems have become increasingly popular in plant science research. The data analysis for such a system typically involves two steps: plant feature extraction through image processing and statistical analysis of the extracted features. The current approach is to perform those two steps on different platforms. We develop the package "implant" in R for both robust feature extraction and functional data analysis. For image processing, the "implant" package provides methods including thresholding, a hidden Markov random field model, and morphological operations. For statistical analysis, this package can produce nonparametric curve fitting with its confidence region for plant growth. A functional ANOVA model to test for the treatment and genotype effects on the plant growth dynamics is also provided.

17.
Genes (Basel) ; 11(11)2020 10 29.
Article in English | MEDLINE | ID: mdl-33138126

ABSTRACT

In genome-wide association studies, linear mixed models (LMMs) have been widely used to explore the molecular mechanism of complex traits. However, typical association approaches suffer from several important drawbacks: estimation of variance components in LMMs with large numbers of individuals is computationally slow, and single-locus models handle complex confounding poorly and lose statistical power. To address these issues, we propose an efficient two-stage method based on a hybrid of restricted and penalized maximum likelihood, named HRePML. Firstly, we performed restricted maximum likelihood (REML) on a single-locus LMM to remove unrelated markers, where spectral decomposition of the covariance matrix was used to rapidly estimate variance components. Secondly, we carried out penalized maximum likelihood (PML) on a multi-locus LMM for markers with reasonably large effects. To validate the effectiveness of HRePML, we conducted a series of simulation studies and real data analyses. As a result, our method always had the highest average statistical power compared with the multi-locus mixed-model (MLMM), fixed and random model circulating probability unification (FarmCPU), and genome-wide efficient mixed model association (GEMMA). More importantly, HRePML can provide more accurate estimation of marker effects. HRePML also identifies 41 previously reported genes associated with development traits in Arabidopsis, which is more than were detected by the other methods.


Subject(s)
Genome-Wide Association Study/methods , Algorithms , Arabidopsis/genetics , Arabidopsis/growth & development , Computer Simulation , Databases, Genetic , Genetic Markers , Genome-Wide Association Study/statistics & numerical data , Humans , Likelihood Functions , Linear Models , Models, Genetic , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Software
18.
Plant Genome ; 13(2): e20015, 2020 07.
Article in English | MEDLINE | ID: mdl-33016608

ABSTRACT

Advances in genome sequencing and annotation have eased the difficulty of identifying new gene sequences. Predicting the functions of these newly identified genes remains challenging. Genes descended from a common ancestral sequence are likely to have common functions. As a result, homology is widely used for gene function prediction. This means functional annotation errors also propagate from one species to another. Several approaches based on machine learning classification algorithms were evaluated for their ability to accurately predict gene function from non-homology gene features. Among the eight supervised classification algorithms evaluated, random-forest-based prediction consistently provided the most accurate gene function prediction. Non-homology-based functional annotation provides complementary strengths to homology-based annotation, with higher average performance in Biological Process GO terms, the domain where homology-based functional annotation performs the worst, and weaker performance in Molecular Function GO terms, the domain where the accuracy of homology-based functional annotation is highest. GO prediction models trained with homology-based annotations were able to successfully predict annotations from a manually curated "gold standard" GO annotation set. Non-homology-based functional annotation based on machine learning may ultimately prove useful both as a method to assign predicted functions to orphan genes which lack functionally characterized homologs, and to identify and correct functional annotation errors which were propagated through homology-based functional annotations.


Subject(s)
Computational Biology , Zea mays , Algorithms , Chromosome Mapping , Machine Learning , Zea mays/genetics
19.
Phys Med ; 73: 43-47, 2020 May.
Article in English | MEDLINE | ID: mdl-32311653

ABSTRACT

PURPOSE: Proton therapy is a precise radiation cancer treatment with low side effects. To reduce the cost and footprint of the facility, a superconducting gantry with large momentum acceptance is a potential solution. With this feature, beam delivery time depends largely on the energy-switching process, and a short switching time helps increase the number of volume repaintings. METHODS: This note introduces an energy degrader with lightweight moving parts and a new hybrid structure (wedge-block-block). The total energies are separated into three stages and are degraded at fixed rates in two boron carbide blocks. As only one pair of graphite wedges is used for energy modulation, energy switching at each step takes on the order of 10 ms. RESULTS: The transport process in the degrader was simulated in TOPAS. After the degradation, the maximum energy spread (1σ) was approximately 5.5%, and the distance between successive energy layers can be increased for treating non-sensitive tissues. Six configurations of the hybrid degrader achieved distinctly higher transmission efficiencies than the usual graphite multi-wedge degrader. Finally, the configuration that maximized the beam transmission in the lower-energy range (namely, the W-B1-B2 configuration) was chosen as the degrader. CONCLUSIONS: This new degrader not only improved the transmission efficiency but also reduced the energy-switching time by virtue of its light and compact structure.


Subject(s)
Equipment Design , Proton Therapy/instrumentation , Graphite
20.
Mol Plant ; 13(6): 907-922, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32171733

ABSTRACT

Linking natural genetic variation to trait variation can help determine the functional roles of different genes. Variations of one or several traits are often assessed separately. High-throughput phenotyping and data mining can capture dozens or hundreds of traits from the same individuals. Here, we test the association between markers within a gene and many traits simultaneously. This genome-phenome wide association study (GPWAS) is both a multi-marker and multi-trait test. Genes identified using GPWAS with 260 phenotypic traits in maize were enriched for genes independently linked to phenotypic variation. Traits associated with classical mutants were consistent with reported phenotypes for mutant alleles. Genes linked to phenomic variation in maize using GPWAS shared molecular, population genetic, and evolutionary features with classical mutants in maize. Genes linked to phenomic variation in Arabidopsis using GPWAS are significantly enriched in genes with known loss-of-function phenotypes. GPWAS may be an effective strategy to identify genes in which loss-of-function alleles produce mutant phenotypes. The shared signatures present in classical mutants and genes identified using GPWAS may be markers for genes with a role in specifying plant phenotypes generally or pleiotropy specifically.


Subject(s)
Arabidopsis/genetics , Evolution, Molecular , Genome, Plant , Genome-Wide Association Study , Phenomics , Zea mays/genetics , Algorithms , Gene Knockout Techniques , Genes, Plant , Genetic Pleiotropy , Genetic Variation , Models, Genetic , Mutation/genetics , Phenotype , Reproducibility of Results