Results 1 - 18 of 18
1.
J Chromatogr A ; 1725: 464949, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38688054

ABSTRACT

This study introduces an innovative needle trap device (NTD) featuring a molecularly imprinted polymer (MIP) surface-modified Zeolite Y. The developed NTD was integrated with gas chromatography-flame ionization detector (GC-FID) and employed for the analysis of fuel ether oxygenates (methyl tert-butyl ether, MTBE; ethyl tert-butyl ether, ETBE; and tert-butyl formate, TBF) in urine samples. To optimize the key experimental variables, including extraction temperature, extraction time, salt concentration, and stirring speed, a central composite design-response surface methodology (CCD-RSM) was employed. The optimal extraction conditions were found to be an extraction temperature of 51.2 °C, an extraction time of 46.2 min, a salt concentration of 27%, and a stirring speed of 620 rpm. Under the optimized conditions, the calibration curves demonstrated excellent linearity within the range of 0.1-100 µg L⁻¹, with correlation coefficients (R²) exceeding 0.99. The limits of detection (LODs) for MTBE, ETBE, and TBF were 0.06, 0.08, and 0.09 µg L⁻¹, respectively. Moreover, the limits of quantification (LOQs) for MTBE, ETBE, and TBF were 0.18, 0.24, and 0.27 µg L⁻¹, respectively. The enrichment factor was in the range of 98-129. The NTD-GC-FID procedure demonstrated a high extraction efficiency, making it a promising tool for urinary biomonitoring of fuel ether oxygenates with improved sensitivity and selectivity compared to current methods.


Subject(s)
Limit of Detection , Methyl Ethers , Zeolites , Zeolites/chemistry , Humans , Methyl Ethers/urine , Methyl Ethers/chemistry , Molecularly Imprinted Polymers/chemistry , Biological Monitoring/methods , Chromatography, Gas/methods , Ethyl Ethers/urine , Ethyl Ethers/chemistry
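
For readers unfamiliar with central composite designs, the following is a minimal sketch of how coded design points for the four extraction variables named in the abstract could be generated. The rotatable axial distance, the number of centre points, and the function name are assumptions made for illustration; the abstract does not report them.

```python
from itertools import product

def central_composite_design(n_factors, n_center=1):
    """Return coded design points for a rotatable central composite design.

    Assumption: a rotatable design, so alpha = (2 ** n_factors) ** 0.25.
    The abstract does not state alpha or the number of centre points.
    """
    alpha = (2 ** n_factors) ** 0.25
    # Full two-level factorial corners at coded levels -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial (star) points: one factor at -alpha or +alpha, the others at 0.
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(point)
    # Replicated centre points, all factors at coded level 0.
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center

factors = ["temperature", "time", "salt", "stirring"]
design = central_composite_design(len(factors), n_center=6)  # 6 is hypothetical
print(len(design))  # 16 factorial + 8 axial + 6 centre runs
```

Each coded point would then be mapped to real units (for example °C or rpm) before running the extraction, and a quadratic response surface fitted to the measured responses.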
3.
Food Chem ; 442: 138455, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38271905

ABSTRACT

The study was performed in two phases. First, the polymerization was carried out upon three magnetized surfaces of silica aerogel, zeolite Y, and MIL-101(Cr). Then, the optimal molecularly imprinted polymer and optimal extraction conditions were determined by the central composite design-response surface method. Subsequently, the validation parameters of dispersive solid-phase extraction based on the optimal molecularly imprinted polymer were examined for the extraction of the fuel ether oxygenates. The optimal conditions were: adsorbent type, zeolite-magnetic molecularly imprinted polymer; adsorbent amount, 40 mg; pH, 7.7; and absorption time, 24.8 min, selected with a desirability of 0.996. The calibration graphs were linear between 1 and 100 µg L⁻¹, with good correlation coefficients. The limits of detection were found to be 0.64, 0.4, and 0.34 µg L⁻¹ for methyl tert-butyl ether, ethyl tert-butyl ether, and tert-butyl formate, respectively. The method proved reliable for analyzing fuel ether oxygenates in drinking water.


Subject(s)
Drinking Water , Metal-Organic Frameworks , Molecular Imprinting , Zeolites , Molecularly Imprinted Polymers , Silicon Dioxide , Ether , Polymers , Solid Phase Extraction , Ethers , Magnetic Phenomena , Molecular Imprinting/methods
4.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subject(s)
Autism Spectrum Disorder , Female , Pregnancy , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Pregnancy Trimester, First , Ultrasonography, Prenatal , Chromosome Mapping , Exome
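
The fold differences quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sketch based on the rounded percentages in the abstract; the final consistency check assumes one proband per ASD family and one fetus per prenatal family, which the abstract does not state.

```python
# Diagnostic yields (%) in ASD probands, as quoted in the abstract.
asd_yield = {"GS": 7.8, "CMA": 4.3, "ES": 2.7, "ES+CNV": 7.4}

def fold_over(test):
    """Ratio of the GS yield to the yield of another test."""
    return asd_yield["GS"] / asd_yield[test]

for test in ("CMA", "ES", "ES+CNV"):
    print(f"GS vs {test}: {fold_over(test):.2f}-fold")

# Assumption: 1,612 ASD probands and 295 fetuses (one per family).
# Added yield of GS beyond all three standard tests: 0.4% and 0.8%.
extra_diagnoses = 0.004 * 1612 + 0.008 * 295
print(round(extra_diagnoses))  # compare with the nine GS-unique variants reported
```

The ratios come out near 1.8 and 2.9, in line with the "almost 2-fold" and "3-fold" wording, and the rounded count of additional diagnoses is consistent with the nine GS-unique variants.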
5.
BMC Bioinformatics ; 24(1): 240, 2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37286963

ABSTRACT

BACKGROUND: Protein-DNA binding sites of ChIP-seq experiments are identified where the binding affinity is significant based on a given threshold. The choice of the threshold is a trade-off between conservative region identification and discarding weak, but true binding sites. RESULTS: We rescue weak binding sites using MSPC, which efficiently exploits replicates to lower the threshold required to identify a site while keeping a low false-positive rate, and we compare it to IDR, a widely used post-processing method for identifying highly reproducible peaks across replicates. We observe several master transcription regulators (e.g., SP1 and GATA3) and HDAC2-GATA1 regulatory networks on rescued regions in K562 cell line. CONCLUSIONS: We argue the biological relevance of weak binding sites and the information they add when rescued by MSPC. An implementation of the proposed extended MSPC methodology and the scripts to reproduce the performed analysis are freely available at https://genometric.github.io/MSPC/ ; MSPC is distributed as a command-line application and an R package available from Bioconductor ( https://doi.org/doi:10.18129/B9.bioc.rmspc ).


Subject(s)
Chromatin Immunoprecipitation Sequencing , Software , Sequence Analysis, DNA/methods , Consensus , Binding Sites
6.
Crit Rev Anal Chem ; 53(3): 463-482, 2023.
Article in English | MEDLINE | ID: mdl-34414831

ABSTRACT

Per- and polyfluoroalkyl substances (PFAS) are fluorocarbon compounds in which hydrogen atoms have been partly or entirely replaced by fluorine. They have a very wide range of applications, but they are persistent in the environment and exhibit bioaccumulative and toxic properties. Neither chemical nor biological mechanisms can decompose PFAS due to their strong C-F bonds. PFAS have shown adverse effects on various organisms, even at trace levels. Accordingly, highly sensitive and selective analytical methods are required for their tracing in biological and environmental matrices. The physicochemical properties of PFAS, such as surfactant characteristics and high water solubility, are unique and different from those of other known pollutants. Accordingly, the number of articles on the analysis of PFAS is smaller than for other well-known contaminants. The routine PFAS sample preparation methods (such as solvent extraction) coupled with chromatographic systems face challenges such as high limits of detection, the need for laborious derivatization, limited selectivity, and expensive instrumentation. Recent efforts to address these limitations have drawn considerable attention to the development of microextraction techniques, which are consistent with the principles of green chemistry and can be made easily portable and automated. Moreover, these methods have shown sufficient sensitivity and selectivity for the analysis of different analytes (including PFAS) in a wide range of samples with different matrices. This research aims to review the microextraction methods and detection techniques applied for the sample pretreatment of PFAS in various matrices, along with a critical discussion of the challenges and potential future trends.


Subject(s)
Environmental Pollutants , Fluorocarbons , Hydrocarbons, Fluorinated , Environmental Pollutants/analysis , Fluorocarbons/analysis , Fluorocarbons/chemistry , Hydrocarbons, Fluorinated/analysis , Hydrocarbons, Fluorinated/chemistry , Humans
7.
PLoS Comput Biol ; 17(6): e1009014, 2021 06.
Article in English | MEDLINE | ID: mdl-34061826

ABSTRACT

Supervised machine learning is an essential but difficult-to-use approach in biomedical data analysis. The Galaxy-ML toolkit (https://galaxyproject.org/community/machine-learning/) makes supervised machine learning more accessible to biomedical scientists by enabling them to perform end-to-end reproducible machine learning analyses at large scale using only a web browser. Galaxy-ML extends Galaxy (https://galaxyproject.org), a biomedical computational workbench used by tens of thousands of scientists across the world, with a suite of tools for all aspects of supervised machine learning.


Subject(s)
Computational Biology/methods , Machine Learning , Reproducibility of Results , Software
9.
Nucleic Acids Res ; 48(W1): W395-W402, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32479607

ABSTRACT

Galaxy (https://galaxyproject.org) is a web-based computational workbench used by tens of thousands of scientists across the world to analyze large biomedical datasets. Since 2005, the Galaxy project has fostered a global community focused on achieving accessible, reproducible, and collaborative research. Together, this community develops the Galaxy software framework, integrates analysis tools and visualizations into the framework, runs public servers that make Galaxy available via a web browser, performs and publishes analyses using Galaxy, leads bioinformatics workshops that introduce and use Galaxy, and develops interactive training materials for Galaxy. Over the last two years, all aspects of the Galaxy project have grown: code contributions, tools integrated, users, and training materials. Key advances in Galaxy's user interface include enhancements for analyzing large dataset collections as well as interactive tools for exploratory data analysis. Extensions to Galaxy's framework include support for federated identity and access management and increased ability to distribute analysis jobs to remote resources. New community resources include large public servers in Europe and Australia, an increasing number of regional and local Galaxy communities, and substantial growth in the Galaxy Training Network.


Subject(s)
Software , Biomedical Research , Data Analysis , Datasets as Topic , Metabolomics/methods , Metagenomics/methods , Proteomics/methods , Reproducibility of Results , Single-Cell Analysis/methods
10.
Cell ; 181(1): 92-101, 2020 04 02.
Article in English | MEDLINE | ID: mdl-32243801

ABSTRACT

This Perspective explores the application of machine learning toward improved diagnosis and treatment. We outline a vision for how machine learning can transform three broad areas of biomedicine: clinical diagnostics, precision treatments, and health monitoring, where the goal is to maintain health through a range of diseases and the normal aging process. For each area, early instances of successful machine learning applications are discussed, as well as opportunities and challenges for machine learning. When these challenges are met, machine learning promises a future of rigorous, outcomes-based medicine with detection, diagnosis, and treatment strategies that are continuously adapted to individual and environmental differences.


Subject(s)
Machine Learning , Precision Medicine , Humans
11.
Bioinformatics ; 36(1): 1-9, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31197310

ABSTRACT

MOTIVATION: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use for even technologically sophisticated users. RESULTS: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g. username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use. 
AVAILABILITY AND IMPLEMENTATION: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.


Subject(s)
Cloud Computing , Computational Biology , Computer Security , Computational Biology/standards , Computer Security/trends , Software
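
The idea of relying on temporary tokens that refresh as needed, rather than on permanent credentials, can be illustrated with a small cache. This is a hedged sketch only: `Token`, `TemporaryTokenCache`, and the `fetch` callable are hypothetical names standing in for an exchange with an identity provider, and are not part of Galaxy or CloudAuthZ.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Token:
    value: str
    expires_at: float  # seconds since the epoch

class TemporaryTokenCache:
    """Hand out a short-lived token, fetching a new one when it nears expiry.

    `fetch` is a hypothetical callable that obtains a fresh token from an
    identity provider; it is an assumption, not a Galaxy or CloudAuthZ API.
    """

    def __init__(self, fetch: Callable[[], Token], margin: float = 30.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._margin = margin  # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[Token] = None

    def get(self) -> str:
        # Refresh if there is no token yet or it expires within the margin.
        if (self._token is None
                or self._clock() >= self._token.expires_at - self._margin):
            self._token = self._fetch()
        return self._token.value

# Usage with a fake provider and a controllable clock.
now = [0.0]
issued = []

def fake_fetch() -> Token:
    issued.append(now[0])
    return Token(value=f"token-{len(issued)}", expires_at=now[0] + 100.0)

cache = TemporaryTokenCache(fake_fetch, margin=10.0, clock=lambda: now[0])
first = cache.get()   # no token yet, so one is fetched
now[0] = 50.0
second = cache.get()  # still valid, served from the cache
now[0] = 95.0
third = cache.get()   # within 10 s of expiry, so a new token is fetched
```

Because the caller only ever sees the current token value, the long-lived secret stays with the component that performs the fetch.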
12.
Nucleic Acids Res ; 46(W1): W537-W544, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29790989

ABSTRACT

Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.


Subject(s)
Genomics/statistics & numerical data , Metabolomics/statistics & numerical data , Molecular Imaging/statistics & numerical data , Proteomics/statistics & numerical data , User-Computer Interface , Datasets as Topic , Humans , Information Dissemination , International Cooperation , Internet , Reproducibility of Results
14.
Article in English | MEDLINE | ID: mdl-34386295

ABSTRACT

Biomedical data exploration requires integrative analyses of large datasets using a diverse ecosystem of tools. For more than a decade, the Galaxy project (https://galaxyproject.org) has provided researchers with a web-based, user-friendly, scalable data analysis framework complemented by a rich ecosystem of tools (https://usegalaxy.org/toolshed) used to perform genomic, proteomic, metabolomic, and imaging experiments. Galaxy can be deployed on the cloud (https://launch.usegalaxy.org), institutional computing clusters, and personal computers, or readily used on a number of public servers (e.g., https://usegalaxy.org). In this paper, we present our plan and progress towards creating Galaxy-as-a-Service: a federation of distributed data and computing resources into a panoptic analysis platform. Users can leverage a pool of public and institutional resources, in addition to plugging in their private resources, helping answer the challenge of resource divergence across various Galaxy instances and enabling seamless analysis of biomedical data.

15.
BMC Bioinformatics ; 18(1): 536, 2017 Dec 04.
Article in English | MEDLINE | ID: mdl-29202689

ABSTRACT

BACKGROUND: With the spread of public repositories of NGS processed data, the availability of user-friendly and effective tools for data exploration, analysis and visualization is becoming very relevant. These tools enable interactive analytics, an exploratory approach for the seamless "sense-making" of data through on-the-fly integration of analysis and visualization phases, suggested not only for evaluating processing results, but also for designing and adapting NGS data analysis pipelines. RESULTS: This paper presents abstractions for supporting the early analysis of NGS processed data and their implementation in an associated tool, named GenoMetric Space Explorer (GeMSE). This tool serves the needs of the GenoMetric Query Language, an innovative cloud-based system for computing complex queries over heterogeneous processed data. It can also be used starting from any text files in standard BED, BroadPeak, NarrowPeak, GTF, or general tab-delimited format, containing numerical features of genomic regions; metadata can be provided as text files in tab-delimited attribute-value format. GeMSE allows interactive analytics, consisting of on-the-fly cycling among steps of data exploration, analysis and visualization that help biologists and bioinformaticians in making sense of heterogeneous genomic datasets. By means of an explorative interaction support, users can trace past activities and quickly recover their results, seamlessly going backward and forward in the analysis steps and comparative visualizations of heatmaps. CONCLUSIONS: The effective application and practical usefulness of GeMSE are demonstrated through significant use cases of biological interest. GeMSE is available at http://www.bioinformatics.deib.polimi.it/GeMSE/ , and its source code is available at https://github.com/Genometric/GeMSE under the GPLv3 open-source license.


Subject(s)
Databases, Genetic , Genomics/methods , Metadata , A549 Cells , Dexamethasone/pharmacology , Ethanol/pharmacology , Humans , Models, Theoretical , Pattern Recognition, Automated , Protein Interaction Mapping , Software
16.
Brief Bioinform ; 18(3): 367-381, 2017 05 01.
Article in English | MEDLINE | ID: mdl-27013647

ABSTRACT

Enriched region (ER) identification is a fundamental step in several next-generation sequencing (NGS) experiment types. Yet, although NGS experimental protocols recommend producing replicate samples for each evaluated condition and their consistency is usually assessed, pipelines for ER identification typically do not consider available NGS replicates. This may alter genome-wide descriptions of ERs, hinder the significance of subsequent analyses on detected ERs and eventually preclude biological discoveries that evidence in replicates could support. MuSERA is a broadly useful stand-alone tool for both interactive and batch analysis of combined evidence from ERs in multiple ChIP-seq or DNase-seq replicates. Besides rigorously combining sample replicates to increase the statistical significance of detected ERs, it also provides quantitative evaluations and graphical features to assess the biological relevance of each determined ER set within its genomic context; they include genomic annotation of determined ERs, nearest ER distance distribution, global correlation assessment of ERs and an integrated genome browser. We review the MuSERA rationale and implementation, and illustrate how sets of significant ERs are expanded by applying MuSERA to replicates for several types of NGS data, including ChIP-seq of transcription factors or histone marks and DNase-seq hypersensitive sites. We show that MuSERA can determine a new, enhanced set of ERs for each sample by locally combining evidence on replicates, and demonstrate how the easy-to-use interactive graphical displays and quantitative evaluations that MuSERA provides effectively support thorough inspection of obtained results and evaluation of their biological content, facilitating their understanding and biological interpretation. MuSERA is freely available at http://www.bioinformatics.deib.polimi.it/MuSERA/.


Subject(s)
High-Throughput Nucleotide Sequencing , Chromatin Immunoprecipitation , Genome , Genomics , Software
17.
Bioinformatics ; 31(17): 2761-9, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25957351

ABSTRACT

MOTIVATION: Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) detects genome-wide DNA-protein interactions and chromatin modifications, returning enriched regions (ERs), usually associated with a significance score. Moderately significant interactions can correspond to true, weak interactions, or to false positives; replicates of a ChIP-seq experiment can provide co-localised evidence to decide between the two cases. We designed a general methodological framework to rigorously combine the evidence of ERs in ChIP-seq replicates, with the option to set a significance threshold on the repeated evidence and a minimum number of samples bearing this evidence. RESULTS: We applied our method to Myc transcription factor ChIP-seq datasets in K562 cells available in the ENCODE project. Using replicates, we could extend the ER number up to 3 times with respect to single-sample analysis with an equivalent significance threshold. We validated the 'rescued' ERs by checking for the overlap with open chromatin regions and for the enrichment of the motif that Myc binds with strongest affinity; we compared our results with alternative methods (IDR and jMOSAiCS), obtaining more validated peaks than the former and fewer peaks than the latter, but with a better validation. AVAILABILITY AND IMPLEMENTATION: An implementation of the proposed method and its source code under GPLv3 license are freely available at http://www.bioinformatics.deib.polimi.it/MSPC/ and http://mspc.codeplex.com/, respectively. CONTACT: marco.morelli@iit.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin Immunoprecipitation/methods , Chromatin/metabolism , Genome, Human , High-Throughput Nucleotide Sequencing , Transcription Factors/metabolism , Ubiquitin-Protein Ligases/metabolism , Algorithms , Chromatin/genetics , Computational Biology/methods , Data Interpretation, Statistical , Gene Expression Regulation , Humans , K562 Cells , Nucleotide Motifs/genetics , Protein Binding , Protein Structure, Tertiary , Proto-Oncogene Proteins c-myc/genetics , Proto-Oncogene Proteins c-myc/metabolism , Quality Control , Reproducibility of Results , Sequence Analysis, DNA , Software , Ubiquitin-Protein Ligases/genetics
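
To make the idea of combining evidence across replicates concrete, the sketch below uses Fisher's combined probability test, a standard way to merge independent p-values. The abstract above does not name the statistical test, so the choice of Fisher's method here is an assumption, and the function names, threshold, and replicate minimum are hypothetical values chosen for illustration.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper tail has the closed form
    exp(-X/2) * sum_{i<k} (X/2)**i / i!, which is used below.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    tail = sum(half_x ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * tail

def rescue(p_values, combined_threshold=1e-8, min_replicates=2):
    """Keep a region if enough replicates report it and the combined
    evidence passes the threshold (both defaults are hypothetical)."""
    return (len(p_values) >= min_replicates
            and fisher_combined_p(p_values) <= combined_threshold)

# Two moderately significant overlapping regions, one per replicate.
weak_pair = [1e-5, 1e-6]
print(fisher_combined_p(weak_pair))
print(rescue(weak_pair))   # combined evidence passes the threshold
print(rescue([1e-5]))      # seen in only one replicate, so not rescued
```

Neither p-value in the example would pass a threshold of 1e-8 on its own, but together they do, which is the kind of "rescue" of weak but reproducible sites the abstract describes.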
18.
Bioinformatics ; 31(12): 1881-8, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25649616

ABSTRACT

MOTIVATION: Improvement of sequencing technologies and data processing pipelines is rapidly providing sequencing data, with associated high-level features, of many individual genomes in multiple biological and clinical conditions. They allow for data-driven genomic, transcriptomic and epigenomic characterizations, but require state-of-the-art 'big data' computing strategies, with abstraction levels beyond available tool capabilities. RESULTS: We propose a high-level, declarative GenoMetric Query Language (GMQL) and a toolkit for its use. GMQL operates downstream of raw data preprocessing pipelines and supports queries over thousands of heterogeneous datasets and samples; as such it is key to genomic 'big data' analysis. GMQL leverages a simple data model that provides both abstractions of genomic region data and associated experimental, biological and clinical metadata and interoperability between many data formats. Based on the Hadoop framework and the Apache Pig platform, GMQL ensures high scalability, expressivity, flexibility and simplicity of use, as demonstrated by several biological query examples on ENCODE and TCGA datasets. AVAILABILITY AND IMPLEMENTATION: The GMQL toolkit is freely available for non-commercial use at http://www.bioinformatics.deib.polimi.it/GMQL/.


Subject(s)
Abstracting and Indexing , Computational Biology/methods , Databases, Factual , Genomics/methods , High-Throughput Screening Assays/methods , Software , Chromatin Immunoprecipitation , Epigenomics , Histones/metabolism , Humans , Sequence Analysis, DNA/methods , Transcription Factors/metabolism