ABSTRACT
With high-frequency data on nitrate (NO3-N) concentrations in waters becoming increasingly important for understanding watershed system behavior and for ecosystem management, the accurate and economical acquisition of high-frequency NO3-N concentration data has become a key issue. This study used coupled deep learning neural networks and routinely monitored data to predict hourly NO3-N concentrations in a river. The hourly NO3-N concentration at the outlet of the Oyster River watershed in New Hampshire, USA, was predicted with a hybrid model architecture coupling Convolutional Neural Networks and the Long Short-Term Memory model (CNN-LSTM). The routinely monitored data used for model training (river depth, water temperature, air temperature, precipitation, specific conductivity, pH and dissolved oxygen concentration) were collected from a nested high-frequency monitoring network, while the high-frequency NO3-N concentration data obtained at the outlet were not included as inputs. The whole dataset was separated into training, validation, and testing subsets in a ratio of 5:3:2, respectively. The hybrid CNN-LSTM model with different input lengths (1d, 3d, 7d, 15d, 30d) displayed performance comparable to or even better than that reported in other studies at lower frequencies, with mean Nash-Sutcliffe Efficiency values of 0.60-0.83. Models with shorter input lengths demonstrated both higher modeling accuracy and higher stability. The water level, water temperature and pH values at the monitoring sites were the main controlling factors for forecasting performance. This study provides new insight into using deep learning networks with a coupled architecture and routinely monitored data for high-frequency riverine NO3-N concentration forecasting, along with suggestions on strategies for variable and input length selection during preprocessing of input data.
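The 5:3:2 split and the Nash-Sutcliffe Efficiency (NSE) mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the split is chronological (the abstract does not say how the data were partitioned) and uses the standard NSE definition; the function names are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], ratios: Tuple[int, int, int] = (5, 3, 2)
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered series into training, validation and testing parts.

    Assumption: the split keeps time order; the abstract only gives the ratio.
    """
    total = sum(ratios)
    n = len(series)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = list(series[:n_train])
    val = list(series[n_train:n_train + n_val])
    test = list(series[n_train + n_val:])  # remainder goes to the test part
    return train, val, test


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - residual / variance
```

An NSE of 1 indicates a perfect match, while a value of 0 means the model does no better than predicting the observed mean, which gives some context for the reported range of 0.60-0.83.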
Subject(s)
Deep Learning , Neural Networks, Computer , Nitrates , Rivers , Nitrates/analysis , Rivers/chemistry , Environmental Monitoring/methods , Water Pollutants, Chemical/analysis , New Hampshire
ABSTRACT
LISE is a web server for a novel method for predicting small molecule binding sites on proteins. It differs from a number of servers currently available for such predictions in two aspects. First, rather than relying on knowledge of similar protein structures, identification of surface cavities or estimation of binding energy, LISE computes a score by counting geometric motifs extracted from sub-structures of interaction networks connecting protein and ligand atoms. These network motifs take into account spatial and physicochemical properties of ligand-interacting protein surface atoms. Second, LISE has now been more thoroughly tested, as, in addition to the evaluation we previously reported using two commonly used small benchmark test sets and targets of two community-based experiments on ligand-binding site predictions, we now report an evaluation using a large non-redundant data set containing >2000 protein-ligand complexes. This unprecedented test, the largest ever reported to our knowledge, demonstrates LISE's overall accuracy and robustness. Furthermore, we have identified some hard to predict protein classes and provided an estimate of the performance that can be expected from a state-of-the-art binding site prediction server, such as LISE, on a proteome scale. The server is freely available at http://lise.ibms.sinica.edu.tw.
Subject(s)
Proteins/chemistry , Software , Binding Sites , Internet , Ligands , Phosphotransferases/chemistry , Protein Conformation , Proteins/metabolism
ABSTRACT
VarioWatch (http://genepipe.ncgm.sinica.edu.tw/variowatch/) has been vastly improved since its predecessor, GenoWatch, was published in the 2008 Web Server Issue. It is now at least 10 000 times faster in annotating a variant. This drastic speed increase, achieved through a complete redesign of its working mechanism, makes VarioWatch capable of annotating millions of human genomic variants generated from next generation sequencing in minutes, if not seconds. While using the MegaQuery function of VarioWatch to quickly annotate variants, users can apply various filters to retrieve a subgroup of variants that satisfy their requirements according to risk level, region of interest, etc. In addition to the performance leap, many new features have been added, such as annotation of novel variants, functional analyses of splice sites and in/dels, detailed variant information in tabulated form, and a risk level decision tree for the analyzed variant. Up to 1000 target variants can be visualized with our carefully designed Genome View, Gene View, Transcript View and Variation View. Two commonly used reference versions, NCBI build 36.3 and NCBI build 37.2, are supported. VarioWatch is unique in its ability to annotate millions of variants online comprehensively and efficiently, deliver the results in real time, and visualize up to 1000 annotated variants.
Subject(s)
Genetic Variation , Genome, Human , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Software , Humans , Internet , Sequence Analysis, DNA
ABSTRACT
Dissolved oxygen (DO) depletion is a severe threat to aquatic ecosystems. Hence, using easily available routine hydrometeorological variables, without DO as an input, to predict the daily minimum DO concentration in rivers has great practical significance for watershed management. The daily minimum DO concentrations at the outlet of the Oyster River watershed in New Hampshire, USA, were predicted by a set of deep learning neural networks using meteorological data and high-frequency water level, water temperature, and specific conductance (CTD) data measured within the watershed. The dependent variable, DO concentration, was measured at the outlet. The dataset, spanning April 2013 to March 2018, was separated into training, validation, and test portions with a ratio of 5:3:3. A Long Short-Term Memory (LSTM) model and a hybrid Convolutional Neural Networks (CNN-LSTM) model were trained and evaluated for predicting the daily minimum DO concentration. The hybrid CNN-LSTM model exhibited better predictive stability but comparable accuracy (mean R2 value = 0.865) compared with the pure LSTM model (mean R2 value = 0.839). The model performance (both stability and accuracy) was improved by aggregating the input data frequency from 15 min of raw data to 24 h. However, the modeling performance did not benefit from including 'forecasted' meteorological data at the predicted time step in the input dataset. This study provides an efficient and low-cost approach to predicting water quality in rivers and streams in support of science-based watershed management.
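The aggregation step described above (15 min raw data reduced to 24 h values, with a daily minimum as the prediction target) can be sketched in plain Python. This is only an illustration: the abstract does not state which statistic was used to aggregate the inputs, so the daily mean here is an assumption, and `aggregate_daily` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def aggregate_daily(
    records: Iterable[Tuple[datetime, float]]
) -> Dict[date, Tuple[float, float]]:
    """Group timestamped readings by calendar day.

    Returns {day: (daily_mean, daily_minimum)}. The mean stands in for an
    aggregated input variable (assumed statistic); the minimum mirrors the
    'daily minimum DO' target described in the abstract.
    """
    by_day: Dict[date, List[float]] = defaultdict(list)
    for timestamp, value in records:
        by_day[timestamp.date()].append(value)
    return {
        day: (sum(values) / len(values), min(values))
        for day, values in sorted(by_day.items())
    }


# Illustrative input: two days of synthetic 15 min readings (not real data).
start = datetime(2013, 4, 1)
synthetic = [(start + timedelta(minutes=15 * i), float(i)) for i in range(192)]
daily = aggregate_daily(synthetic)
```

Each day of 15 min data contributes 96 readings, so the aggregation shortens the input sequence by roughly two orders of magnitude.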
ABSTRACT
It has been well documented that agricultural activities lead to significant alterations in surface water dissolved organic matter (DOM), yet their impacts on groundwater DOM remain poorly constrained. The quantity, source, and composition of DOM play a pivotal role in a range of groundwater ecosystem services that are of important ecological and societal value. We assessed the impact of irrigation on the source and compositional characteristics of groundwater DOM in a large river basin supporting intensive agriculture in arid northwestern China. We sampled five water types along a river reach of approximately 40 km, including groundwater, river water, irrigation canal water, hyporheic water, and soil leachates. The excitation-emission matrix (EEM) measurements coupled with parallel factor analysis (PARAFAC) identified two terrestrial-derived, humic-like fluorescent components (C1 and C2) and one protein-like autochthonous component (C3). DOM composition and dissolved organic carbon (DOC) concentration varied as a function of water type, with subsurface waters showing relatively lower DOC and terrestrial humic fluorescence than surface waters. Combining nitrate, electrical conductivity, dissolved inorganic carbon (DIC), and δ13C-DIC, irrigation-influenced samples were identified, and the influence of irrigation on groundwater DOM appeared only in shallow aquifers (<50 m). Irrigation-influenced groundwater exhibited higher DOC and terrestrial fluorescence than unimpacted groundwater, suggesting that irrigation return flows accelerated the downward movement of terrestrial humic compounds and led to their accumulation in aquifers. This effect was propagated via surface water-groundwater interactions to upwelling hyporheic water, which also showed enrichment in terrestrial fluorescence. Our findings demonstrate that irrigation can accelerate the biogeochemical cycling of organic compounds via a subsurface pathway from the soil to the aquifer to the hyporheic zone. The enrichment of soil-derived compounds in subsurface waters may have important ecological consequences, such as altering the transport of nutrients and pollutants and changing carbon and energy flows across the surface-subsurface boundary.
ABSTRACT
A human gene association study often involves several genomic markers such as single nucleotide polymorphisms (SNPs) or short tandem repeat polymorphisms, and many statistically significant markers may be identified during the study. GenoWatch can efficiently extract up-to-date information about multiple markers and their associated genes in batch mode from many relevant biological databases in real time. The comprehensive gene information retrieved includes gene ontology, function, pathway, disease, related articles in PubMed and so on. Subsequent SNP functional impact analysis and primer design of a target gene for re-sequencing can also be done in a few clicks. The presentation of results has been carefully designed to be as intuitive as possible to all users. GenoWatch is available at http://genepipe.ngc.sinica.edu.tw/genowatch.
Subject(s)
Genes , Genetic Diseases, Inborn/genetics , Polymorphism, Genetic , Software , Chromosome Mapping , Databases, Genetic , Genetic Markers , Genome, Human , Humans , Internet , Microsatellite Repeats , Polymorphism, Single Nucleotide , PubMed , User-Computer Interface
ABSTRACT
BACKGROUND: With the flood of information generated by the new generation of sequencing technologies, more efficient bioinformatics tools are needed for in-depth impact analysis of novel genomic variations. FANS (Functional Analysis of Novel SNPs) was developed to streamline comprehensive but tedious functional analysis steps into a few clicks and to offer a carefully designed presentation of results so researchers can focus more on thinking instead of typing and calculating. RESULTS: FANS (http://fans.ngc.sinica.edu.tw/) harnesses the power of public information databases and powerful tools from six well-established websites to enhance the efficiency of analysis of novel variations. FANS can process any point change in any coding region or GT-AG splice site to provide a clear picture of the disease risk of a prioritized variation by classifying splicing and functional alterations into one of nine risk subtypes with five risk levels. CONCLUSION: FANS significantly simplifies the analysis operations to a four-step procedure while still covering all major areas of interest to researchers. FANS offers a convenient way to prioritize the variations and select the ones with the most functional impact for validation. Additionally, the program offers a distinct improvement in efficiency over manual operations in our benchmark test.
Subject(s)
Computational Biology/methods , Mutation , Polymorphism, Single Nucleotide , Animals , Automation , Genetic Variation , Genome , Genome, Human , Genomics , Humans , Mice , Programming Languages , Risk , Sequence Analysis, DNA/methods , Software
ABSTRACT
Single nucleotide polymorphism (SNP) prioritization based on phenotypic risk is essential for association studies. Assessment of the risk requires access to a variety of heterogeneous biological databases and analytical tools. FASTSNP (function analysis and selection tool for single nucleotide polymorphisms) is a web server that allows users to efficiently identify and prioritize high-risk SNPs according to their phenotypic risks and putative functional effects. A unique feature of FASTSNP is that the functional effect information used for SNP prioritization is always up-to-date, because FASTSNP extracts the information from 11 external web servers at query time using a team of web wrapper agents. Moreover, FASTSNP is extendable by simply deploying more web wrapper agents. To validate the results of our prioritization, we analyzed 1569 SNPs from the SNP500Cancer database. The results show that SNPs with a high predicted risk exhibit low allele frequencies for the minor alleles, consistent with a well-known finding that a strong selective pressure exists for functional polymorphisms. We have been using FASTSNP for 2 years, and it has enabled us to discover a novel promoter polymorphism. FASTSNP is available at http://fastsnp.ibms.sinica.edu.tw.
Subject(s)
Polymorphism, Single Nucleotide , Software , Gene Frequency , Genetic Predisposition to Disease , Internet , Phenotype , Proteins/genetics , Risk , User-Computer Interface
ABSTRACT
The Fenton process was employed to treat synthetic dye wastewater, with Fe(II) supplied by electrolytic generation from iron-containing sludge that was recycled and reused throughout the study. The treated water quality and the properties of the iron sludge after repeated use are reported and discussed. Experimental results showed that COD was mainly removed by oxidation rather than coagulation. Although the process was quite effective for COD and color removal, the conductivity of the treated water was very high. Meanwhile, repeated use of the iron-containing sludge resulted in accumulation of organic materials embedded in the sludge, as indicated by an increasing volatile suspended solid (VSS)/TSS ratio and a decreasing zeta potential.
Subject(s)
Coloring Agents/chemistry , Ferrous Compounds/chemistry , Hydrogen Peroxide/chemistry , Sewage , Water Pollutants, Chemical/chemistry , Electrolysis , Waste Disposal, Fluid/methods , Water Purification/methods
ABSTRACT
A nonionic surfactant, polyoxyethylene octyl phenyl ether (Triton-X), is added to a micellar-enhanced ultrafiltration process to lower the critical micellar concentration (CMC) of an anionic surfactant, sodium dodecyl sulfate (SDS). The effects of adding Triton-X on the copper removal efficiency, the permeate SDS concentration, the copper binding capacity of SDS micelles, and membrane fouling are investigated. Our results show that the addition of Triton-X at concentrations greater than its CMC can reduce the SDS dosage required for effective Cu removal and, at the same time, minimize the permeate SDS concentration. Although no adverse effect on the copper binding capacity of SDS micelles is observed with the addition of Triton-X, membrane fouling is worsened. Cleaning the membrane with DI water restored the membrane flux, indicating that the fouling caused by Triton-X was reversible.
Subject(s)
Copper/isolation & purification , Polyethylene Glycols/chemistry , Sodium Dodecyl Sulfate/chemistry , Surface-Active Agents/chemistry , Water Pollutants, Chemical/isolation & purification , Copper/chemistry , Industrial Waste , Micelles , Ultrafiltration , Waste Disposal, Fluid , Water Purification/methods
ABSTRACT
Traditional methods for studying surface water and groundwater interactions have usually been limited to point measurements, such as geochemical sampling and seepage measurement. A new methodology is presented for quantifying groundwater discharge to a river by using river surface temperature data obtained from airborne thermal infrared remote sensing. The Hot Spot Analysis toolkit in ArcGIS was used to calculate the percentage of groundwater discharge to a river relative to the total flow of the river. This methodology was evaluated in the midstream of the Heihe River in arid and semiarid northwest China. The results show that the percentage of groundwater discharge relative to the total streamflow was as high as 28%, which is in good agreement with the results from previous geochemical studies. The data analysis methodology used in this study is based on the assumption that the river water is fully mixed except in areas of extremely low flow velocity, which could lead to underestimation of the amount of groundwater discharge. Despite this limitation, this remote sensing-based approach provides an efficient means of quantifying surface water and groundwater interactions on a regional scale.
Subject(s)
Groundwater , Remote Sensing Technology , China , Fresh Water , Rivers
ABSTRACT
The effects of the type and concentration of ligands on the removal of Cu by micellar-enhanced ultrafiltration (MEUF) with the help of either anionic or cationic surfactants were investigated. The removal efficiency of copper by anionic surfactant (SDS) MEUF depends on the ligand-to-Cu ratio and the ligand-to-Cu complexation constant. At a fixed ligand-to-Cu ratio, the Cu removal efficiency decreases in the order citric acid > NTA > EDTA, which is the reverse of the order of the Cu-ligand complexation constants for these ligands. Increasing the SDS-to-ligand ratio from 12 to 60 at a fixed ligand concentration did not improve copper removal efficiency. The cationic surfactant, CPC, enhances Cu removal efficiency in systems with ligand-to-copper ratios higher than 1.0, where Cu removal is not very efficient using the SDS-MEUF process. The Cu removal efficiency with CPC-MEUF depends on both the ligand-to-Cu ratio and the type of ligand.
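The abstract does not define removal efficiency. A commonly used definition in membrane filtration compares the permeate concentration with the feed concentration, and the Python sketch below assumes that definition. The function names and the example concentrations are hypothetical and are not values from the study.

```python
def removal_efficiency(feed_conc: float, permeate_conc: float) -> float:
    """Percent removal, assuming R = (1 - C_permeate / C_feed) * 100.

    Both concentrations must be in the same units (for example mg/L).
    """
    if feed_conc <= 0:
        raise ValueError("feed concentration must be positive")
    if permeate_conc < 0:
        raise ValueError("permeate concentration cannot be negative")
    return (1.0 - permeate_conc / feed_conc) * 100.0


def molar_ratio(numerator_molar: float, denominator_molar: float) -> float:
    """Molar ratio, e.g. ligand-to-Cu, from two molar concentrations."""
    if denominator_molar <= 0:
        raise ValueError("denominator concentration must be positive")
    return numerator_molar / denominator_molar


# Hypothetical numbers for illustration only.
example_removal = removal_efficiency(feed_conc=10.0, permeate_conc=2.5)
example_ratio = molar_ratio(numerator_molar=0.5, denominator_molar=0.5)
```

Under this definition, a ligand that holds Cu in a complex that passes through the membrane raises the permeate concentration and therefore lowers the computed efficiency, which is consistent with the trend the abstract reports for strongly complexing ligands.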