ABSTRACT
The number of publicly available microbiome samples is continually growing. As data set size increases, bottlenecks arise in standard analytical pipelines. Faith's phylogenetic diversity (Faith's PD) is a highly utilized phylogenetic alpha diversity metric that has thus far failed to effectively scale to trees with millions of vertices. Stacked Faith's phylogenetic diversity (SFPhD) enables calculation of this widely adopted diversity metric at a much larger scale by implementing a computationally efficient algorithm. The algorithm reduces the amount of computational resources required, resulting in more accessible software with a reduced carbon footprint, as compared to previous approaches. The new algorithm produces identical results to the previous method. We further demonstrate that the phylogenetic aspect of Faith's PD provides increased power in detecting diversity differences between younger and older populations in the FINRISK study's metagenomic data.
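For orientation, Faith's PD for one sample is the total length of the tree branches that lead to at least one feature observed in that sample. The sketch below is a naive per-sample traversal on a hypothetical toy tree; it is not the SFPhD algorithm described in the abstract, and the `TREE` layout and the `faith_pd` name are assumptions made only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy tree (an assumption, not data from the paper):
# child -> (parent, branch length to parent). The root has no entry.
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}


def faith_pd(tree: Dict[str, Tuple[str, float]], observed_tips: Iterable[str]) -> float:
    """Sum the lengths of all branches on a path from an observed tip to the root."""
    visited: Set[str] = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk toward the root, counting each branch only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


if __name__ == "__main__":
    print(faith_pd(TREE, {"A", "B"}))  # branches A, B and n1
    print(faith_pd(TREE, {"A", "C"}))  # branches A, n1 and C
```

Each sample walks the tree independently here, so the cost grows with both the number of samples and the tree depth; that per-sample cost is the kind of bottleneck the abstract refers to.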
Subjects
Microbiota , Microbiota/genetics , Phylogeny
ABSTRACT
The intrinsic dynamics of DNA play a crucial role in DNA-protein interactions and have been emphasized as a possible key component of in vivo chromatin organization. We prepared an entangled DNA microtube above the overlap concentration by exploiting the complementary cohesive ends of λ-phage DNA, as confirmed by atomic force microscopy and agarose gel electrophoresis. Photon correlation spectroscopy further showed that the entangled solutions exhibit the classical hydrodynamics of a single chain segment on length scales smaller than the hydrodynamic length scale of a single λ-phage DNA molecule. We also observed that, in the 41.6% (g water/g DNA) hydrated state, λ-phage DNA exhibits a dynamic transition temperature (T_dt) at 187 K and a crossover temperature (T_c) at 246 K. Computational analysis indicates that the structure and dynamics of entangled λ-phage DNA are distinctly different from those of the corresponding unentangled DNA with open cohesive ends, consistent with our experimental observations.
Subjects
Bacteriophage lambda/chemistry , DNA, Bacterial/chemistry , Hydrodynamics , Nucleic Acid Conformation , Water/chemistry
ABSTRACT
UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another (beta diversity). Striped UniFrac recently added the ability to split the problem into many independent subproblems, exhibiting nearly linear scaling but suffering from memory contention. Here, we adapt UniFrac to graphics processing units using OpenACC, enabling greater than 1,000× computational improvement, and apply it to 307,237 samples, the largest 16S rRNA V4 uniformly preprocessed microbiome data set analyzed to date. IMPORTANCE UniFrac is an important tool in microbiome research that is used for phylogenetically comparing microbiome profiles to one another. Here, we adapt UniFrac to operate on graphics processing units, enabling a 1,000× computational improvement. To highlight this advance, we perform what may be the largest microbiome analysis to date, applying UniFrac to 307,237 16S rRNA V4 microbiome samples preprocessed with Deblur. These scaling improvements turn UniFrac into a real-time tool for common data sets and unlock new research questions as more microbiome data are collected.
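As a reminder of what is being computed, unweighted UniFrac between two samples is the fraction of observed branch length that is unique to one sample or the other. The sketch below is a naive pairwise version on a hypothetical toy tree; it is neither Striped UniFrac nor the OpenACC/GPU implementation described in the abstract, and `TREE`, `covered_branches` and `unweighted_unifrac` are illustrative names, not part of any library.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy tree (an assumption, not data from the paper):
# child -> (parent, branch length to parent). The root has no entry.
TREE: Dict[str, Tuple[str, float]] = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}


def covered_branches(tree: Dict[str, Tuple[str, float]], tips: Iterable[str]) -> Set[str]:
    """Return the child-node IDs of every branch between an observed tip and the root."""
    covered: Set[str] = set()
    for tip in tips:
        node = tip
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(tree: Dict[str, Tuple[str, float]],
                       tips_a: Iterable[str], tips_b: Iterable[str]) -> float:
    """Unique branch length divided by total observed branch length."""
    a = covered_branches(tree, tips_a)
    b = covered_branches(tree, tips_b)
    total = sum(tree[n][1] for n in a | b)
    if total == 0.0:
        return 0.0  # both samples empty: define the distance as zero
    unique = sum(tree[n][1] for n in a ^ b)
    return unique / total


if __name__ == "__main__":
    print(unweighted_unifrac(TREE, {"A", "B"}, {"B", "C"}))
```

Every pair of samples is computed independently in this form, so the work grows quadratically with the number of samples, which is why large collections call for parallel hardware.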
Subjects
Bacteria , Microbiota , RNA, Ribosomal, 16S/genetics , Bacteria/genetics , Microbiota/genetics
ABSTRACT
Dimensionality reduction techniques are a key component of most microbiome studies, providing both the ability to tractably visualize complex microbiome datasets and the starting point for additional, more formal, statistical analyses. In this review, we discuss the motivation for applying dimensionality reduction techniques, the special characteristics of microbiome data such as sparsity and compositionality that make this difficult, the different categories of strategies that are available for dimensionality reduction, and examples from the literature of how they have been successfully applied (together with pitfalls to avoid). We conclude by describing the need for further development in the field, in particular combining the power of phylogenetic analysis with the ability to handle sparsity, compositionality, and non-normality, as well as discussing current techniques that should be applied more widely in future analyses.
ABSTRACT
Microbiome data have several specific characteristics (sparsity and compositionality) that introduce challenges in data analysis. The integration of prior information regarding the data structure, such as phylogenetic structure and repeated-measure study designs, into analysis, is an effective approach for revealing robust patterns in microbiome data. Past methods have addressed some but not all of these challenges and features: for example, robust principal-component analysis (RPCA) addresses sparsity and compositionality; compositional tensor factorization (CTF) addresses sparsity, compositionality, and repeated measure study designs; and UniFrac incorporates phylogenetic information. Here we introduce a strategy of incorporating phylogenetic information into RPCA and CTF. The resulting methods, phylo-RPCA, and phylo-CTF, provide substantial improvements over state-of-the-art methods in terms of discriminatory power of underlying clustering ranging from the mode of delivery to adult human lifestyle. We demonstrate quantitatively that the addition of phylogenetic information improves effect size and classification accuracy in both data-driven simulated data and real microbiome data. IMPORTANCE Microbiome data analysis can be difficult because of particular data features, some unavoidable and some due to technical limitations of DNA sequencing instruments. The first step in many analyses that ultimately reveals patterns of similarities and differences among sets of samples (e.g., separating samples from sick and healthy people or samples from seawater versus soil) is calculating the difference between each pair of samples. 
We introduce two new methods to calculate these differences that combine features of past methods, specifically being able to take into account the principles that most types of microbes are not in most samples (sparsity), that abundances are relative rather than absolute (compositionality), and that all microbes have a shared evolutionary history (phylogeny). We show using simulated and real data that our new methods provide improved classification accuracy of ordinal sample clusters and increased effect size between sample groups on beta-diversity distances.
Subjects
Microbiota , Humans , Phylogeny , Microbiota/genetics , Sequence Analysis, DNA , Research Design , Phenotype
ABSTRACT
Increasing data volumes on high-throughput sequencing instruments such as the NovaSeq 6000 lead to long computational bottlenecks for common metagenomics data preprocessing tasks such as adaptor and primer trimming and host removal. Here, we test whether faster recently developed computational tools (Fastp and Minimap2) can replace widely used choices (Atropos and Bowtie2), obtaining dramatic accelerations with additional sensitivity and minimal loss of specificity for these tasks. Furthermore, the taxonomic tables resulting from downstream processing provide biologically comparable results. However, we demonstrate that for taxonomic assignment, Bowtie2's specificity is still required. We suggest that periodic reevaluation of pipeline components, together with improvements to standardized APIs to chain them together, will greatly enhance the efficiency of common bioinformatics tasks while also facilitating incorporation of further optimized steps running on GPUs, FPGAs, or other architectures. We also note that a detailed exploration of available algorithms and pipeline components is an important step that should be taken before optimization of less efficient algorithms on advanced or nonstandard hardware. IMPORTANCE In shotgun metagenomics studies that seek to relate changes in microbial DNA across samples, processing the data on a computer often takes longer than obtaining the data from the sequencing instrument. Recently developed software packages that perform individual steps in the pipeline of data processing in principle offer speed advantages, but in practice they may contain pitfalls that prevent their use; for example, they may make approximations that introduce unacceptable errors in the data. Here, we show that differences in choices of these components can speed up overall data processing by 5-fold or more on the same hardware while maintaining a high degree of correctness, greatly reducing the time taken to interpret results. 
This is an important step for using the data in clinical settings, where the time taken to obtain the results may be critical for guiding treatment.
Subjects
Metagenomics , Software , Metagenomics/methods , Algorithms , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods
ABSTRACT
We introduce the operational genomic unit (OGU) method, a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent of taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary analytical protocols for community ecology, differential abundance, and supervised learning while supporting phylogenetic methods, such as UniFrac and phylofactorization, that are seldom applied to shotgun metagenomics despite being prevalent in 16S rRNA gene amplicon studies. As demonstrated in two real-world case studies, the OGU method produces biologically meaningful patterns from microbiome data sets. Such patterns further remain detectable at very low metagenomic sequencing depths. Compared with taxonomic unit-based analyses implemented in currently adopted metagenomics tools, and the analysis of 16S rRNA gene amplicon sequence variants, this method shows superiority in informing biologically relevant insights, including stronger correlation with body environment and host sex on the Human Microbiome Project data set and more accurate prediction of human age by the gut microbiomes of Finnish individuals included in the FINRISK 2002 cohort. We provide Woltka, a bioinformatics tool to implement this method, with full integration with the QIIME 2 package and the Qiita web platform, to facilitate adoption of the OGU method in future metagenomics studies. IMPORTANCE Shotgun metagenomics is a powerful, yet computationally challenging, technique compared to 16S rRNA gene amplicon sequencing for decoding the composition and structure of microbial communities. 
Current analyses of metagenomic data are primarily based on taxonomic classification, which is limited in feature resolution. To solve these challenges, we introduce operational genomic units (OGUs), which are the individual reference genomes derived from sequence alignment results, without further assigning them taxonomy. The OGU method advances current read-based metagenomics in two dimensions: (i) providing maximal resolution of community composition and (ii) permitting use of phylogeny-aware tools. Our analysis of real-world data sets shows that it is advantageous over currently adopted metagenomic analysis methods and the finest-grained 16S rRNA analysis methods in predicting biological traits. We thus propose the adoption of OGUs as an effective practice in metagenomic studies.
Subjects
Metagenome , Microbiota , Humans , Phylogeny , RNA, Ribosomal, 16S/genetics , Ecology
ABSTRACT
Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by principal-coordinate analysis (PCoA). Uniform Manifold Approximation and Projection (UMAP) is an alternative method that can reduce the dimensionality of beta diversity distance matrices. Here, we demonstrate the benefits and limitations of using UMAP for dimensionality reduction on microbiome data. Using real data, we demonstrate that UMAP can improve the representation of clusters, especially when the clusters are composed of multiple subgroups. Additionally, we show that UMAP provides improved correlation of biological variation along a gradient with a reduced number of coordinates of the resulting embedding. Finally, we provide parameter recommendations that emphasize the preservation of global geometry. We therefore conclude that UMAP should be routinely used as a complementary visualization method for microbiome beta diversity studies. IMPORTANCE UMAP provides an additional method to visualize microbiome data. The method is extensible to any beta diversity metric used with PCoA, and our results demonstrate that UMAP can indeed improve visualization quality and correspondence with biological and technical variables of interest. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/knightlab-analyses/umap-microbiome-benchmarking; additionally, we have provided a QIIME 2 plugin for UMAP at https://github.com/biocore/q2-umap.
ABSTRACT
The translational power of human microbiome studies is limited by high interindividual variation. We describe a dimensionality reduction tool, compositional tensor factorization (CTF), that incorporates information from the same host across multiple samples to reveal patterns driving differences in microbial composition across phenotypes. CTF identifies robust patterns in sparse compositional datasets, allowing for the detection of microbial changes associated with specific phenotypes that are reproducible across datasets.
Subjects
Algorithms , Gastrointestinal Microbiome , Humans , Infant
ABSTRACT
Standard workflows for analyzing microbiomes often include the creation and curation of phylogenetic trees. Here we present EMPress, an interactive web tool for visualizing trees in the context of microbiome, metabolome, and other community data, scalable to trees with well over 500,000 nodes. EMPress provides novel functionality, including ordination integration and animations, alongside many standard tree visualization features and thus simplifies exploratory analyses of many forms of 'omic data. IMPORTANCE Phylogenetic trees are integral data structures for the analysis of microbial communities. Recent work has also shown the utility of trees constructed from certain metabolomic data sets, further highlighting their importance in microbiome research. The ever-growing scale of modern microbiome surveys has led to numerous challenges in visualizing these data. In this paper we used five diverse data sets to showcase the versatility and scalability of EMPress, an interactive web visualization tool. EMPress addresses the growing need for exploratory analysis tools that can accommodate large, complex multi-omic data sets.
ABSTRACT
BACKGROUND: SARS-CoV-2 is an RNA virus responsible for the coronavirus disease 2019 (COVID-19) pandemic. Viruses exist in complex microbial environments, and recent studies have revealed both synergistic and antagonistic effects of specific bacterial taxa on viral prevalence and infectivity. We set out to test whether specific bacterial communities predict SARS-CoV-2 occurrence in a hospital setting. METHODS: We collected 972 samples from hospitalized patients with COVID-19, their health care providers, and hospital surfaces before, during, and after admission. We screened for SARS-CoV-2 using RT-qPCR, characterized microbial communities using 16S rRNA gene amplicon sequencing, and used these bacterial profiles to classify SARS-CoV-2 RNA detection with a random forest model. RESULTS: Sixteen percent of surfaces from COVID-19 patient rooms had detectable SARS-CoV-2 RNA, although infectivity was not assessed. The highest prevalence was in floor samples next to patient beds (39%) and directly outside their rooms (29%). Although bed rail samples more closely resembled the patient microbiome compared to floor samples, SARS-CoV-2 RNA was detected less often in bed rail samples (11%). SARS-CoV-2 positive samples had higher bacterial phylogenetic diversity in both human and surface samples and higher biomass in floor samples. 16S microbial community profiles enabled high classifier accuracy for SARS-CoV-2 status in not only nares, but also forehead, stool, and floor samples. Across these distinct microbial profiles, a single amplicon sequence variant from the genus Rothia strongly predicted SARS-CoV-2 presence across sample types, with greater prevalence in positive surface and human samples, even when compared to samples from patients in other intensive care units prior to the COVID-19 pandemic. 
CONCLUSIONS: These results contextualize the vast diversity of microbial niches where SARS-CoV-2 RNA is detected and identify specific bacterial taxa that associate with the viral RNA prevalence both in the host and hospital environment.
Subjects
COVID-19 , SARS-CoV-2 , Hospitals , Humans , Pandemics , Phylogeny , RNA, Ribosomal, 16S/genetics , RNA, Viral/genetics
ABSTRACT
Synergistic effects of bacteria on viral stability and transmission are widely documented but remain unclear in the context of SARS-CoV-2. We collected 972 samples from hospitalized ICU patients with coronavirus disease 2019 (COVID-19), their health care providers, and hospital surfaces before, during, and after admission. We screened for SARS-CoV-2 using RT-qPCR, characterized microbial communities using 16S rRNA gene amplicon sequencing, and contextualized the massive microbial diversity in this dataset in a meta-analysis of over 20,000 samples. Sixteen percent of surfaces from COVID-19 patient rooms were positive, with the highest prevalence in floor samples next to patient beds (39%) and directly outside their rooms (29%). Although bed rail samples increasingly resembled the patient microbiome throughout their stay, SARS-CoV-2 was less frequently detected there (11%). Despite surface contamination in almost all patient rooms, no health care workers providing COVID-19 patient care contracted the disease. SARS-CoV-2 positive samples had higher bacterial phylogenetic diversity across human and surface samples, and higher biomass in floor samples. 16S microbial community profiles allowed for high classifier accuracy for SARS-CoV-2 status in not only nares, but also forehead, stool and floor samples. Across these distinct microbial profiles, a single amplicon sequence variant from the genus Rothia was highly predictive of SARS-CoV-2 across sample types, and had higher prevalence in positive surface and human samples, even when comparing to samples from patients in another intensive care unit prior to the COVID-19 pandemic. These results suggest that bacterial communities contribute to viral prevalence both in the host and hospital environment.
ABSTRACT
Aniline, heterocyclic aromatic amines, and arylamines are known carcinogens. Recently, aniline mustard has come into prominence as a novel anticancer agent. In this project, microwave irradiation was used to synthesize an optically active alkylated aniline, namely 2,6-dimethyl-4-(1-(p-tolyl)ethyl)aniline (abbreviated DMPA). The presence of quartet and doublet peaks in NMR and a single peak in the HPLC chromatogram verified that the final product, DMPA, had no major impurities. Using a Lux chiral column in HPLC, two peaks were detected in the chromatogram, corresponding to the two enantiomers of the chiral aniline derivative. Fluorescence spectroscopic measurements on DMPA indicated a conspicuous dependence of its emission behavior on the polarity (in terms of the empirical polarity parameter ET(30)) of the homogeneous solvents used, a property important for an optical sensor. The nature of the emission profiles, along with the wavelength at the emission maximum (λ_em^max), is used to infer the distribution, binding, and microenvironment of the DMPA molecules in human serum albumin (HSA). DMPA is weakly fluorescent in aqueous buffer medium, with a dramatic enhancement in the fluorescence emission in the presence of HSA. Molecular modeling studies have been carried out on the two enantiomers (R and S) of DMPA with HSA. The implications of these findings are examined in relation to the potential of DMPA as a novel fluorescence sensor for biological systems.
Subjects
Aniline Compounds/chemistry , Fluorescent Dyes/chemistry , Spectrometry, Fluorescence/methods , Alkylation , Aniline Compounds/analysis , Aniline Compounds/metabolism , Fluorescent Dyes/analysis , Fluorescent Dyes/metabolism , Humans , Models, Molecular , Serum Albumin, Human/analysis , Serum Albumin, Human/chemistry , Serum Albumin, Human/metabolism , Stereoisomerism
ABSTRACT
OBJECTIVE: To describe the use of oral antidiabetic drugs for management of type 2 diabetes in the U.S. from 1990 through 2001. RESEARCH DESIGN AND METHODS: Data on oral antidiabetic drugs were derived from two pharmaceutical marketing databases from IMS Health, the National Prescription Audit Plus and the National Disease and Therapeutic Index. RESULTS: In 1990, 23.4 million outpatient prescriptions of oral antidiabetic agents were dispensed. By 2001, this number had increased 3.9-fold, to 91.8 million prescriptions. Glipizide and glyburide, two sulfonylurea medications, accounted for approximately 77% of prescriptions of oral antidiabetic drugs in 1990 and 35.5% of prescriptions in 2001. By 2001, the biguanide metformin (approved in 1995) had captured approximately 33% of prescriptions, and the thiazolidinedione insulin sensitizers (rosiglitazone and pioglitazone marketed beginning in 1999) accounted for approximately 17% of market share. Compared with patients treated in 1990, those in 2001 were proportionately younger and they more often used oral antidiabetic drugs and insulin in combination. Internists and general and family practitioners were the primary prescribers of this class of drugs. CONCLUSIONS: Consistent with the reported increase in the prevalence of type 2 diabetes, the number of dispensed outpatient prescriptions of oral antidiabetic drugs increased rapidly between 1990 and 2001. This period was marked by an increase in the treatment of younger people and the use of oral antidiabetic drugs in combination. With the approval in the last decade of several new types of oral antidiabetic medications with different mechanisms of action, options for management of type 2 diabetes have expanded.
Subjects
Diabetes Mellitus, Type 2/drug therapy , Drug Prescriptions/statistics & numerical data , Hypoglycemic Agents/therapeutic use , Administration, Oral , Adult , Aged , Aged, 80 and over , Databases, Factual , Diabetes Mellitus, Type 2/epidemiology , Female , Humans , Hypoglycemic Agents/administration & dosage , Hypoglycemic Agents/classification , Male , Middle Aged , Sulfonylurea Compounds/classification , Sulfonylurea Compounds/therapeutic use , Time Factors , United States/epidemiology
ABSTRACT
The most commonly used assays designed to detect either skin or systemic immune-based hypersensitivity reactions are those using guinea pigs (GP). We obtained data from various FDA records to evaluate the correlation between GP assay results and reported post-marketing systemic hypersensitivity reactions. We examined the new drug application (NDA) reviews of approved drugs for the results of GP assays. Post-marketing human data were extracted from the FDA adverse event reporting system (AERS). Drug usage data were obtained from a commercial database maintained by IMS Health Inc. We found 83 (21%) of 396 drugs approved between 1978 and 1998 had reported GP test results. Among these 83 drugs, 14 (17%) were found to have positive results in at least one GP assay. Simple reporting index (RI) values for systemic hypersensitivity reactions were calculated from AERS data and usage to produce the index of adverse event reports per million shipping units of drug. A variety of definitions of positive human response were examined. A statistically significant association was seen for rash between post-marketing and clinical trials adverse event reports. No statistically significant associations between human data and GP test results were observed. These data suggest that standard GP assays have limited ability to predict human systemic hypersensitivity potential for pharmaceuticals.
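The reporting index described above is a simple rate: adverse event reports per million shipping units. A minimal sketch of that arithmetic follows, together with a check of the two percentages quoted in the abstract. The function names and the example counts passed to `reporting_index` are hypothetical; the abstract gives no per-drug figures.

```python
def reporting_index(n_reports: int, shipping_units: float) -> float:
    """Adverse event reports per million shipping units of drug."""
    if shipping_units <= 0:
        raise ValueError("shipping_units must be positive")
    return n_reports * 1_000_000 / shipping_units


def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Hypothetical drug: 50 reports against 2 million shipping units.
    print(reporting_index(50, 2_000_000))
    # Figures quoted in the abstract: 83 of 396 drugs, and 14 of those 83.
    print(round(percent(83, 396)), round(percent(14, 83)))
```

Normalizing by usage keeps heavily prescribed drugs from looking more hazardous simply because more people took them.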
Subjects
Drug Hypersensitivity , United States Food and Drug Administration , Adverse Drug Reaction Reporting Systems , Animals , Databases, Factual , Drug Approval , Guinea Pigs , Humans , Product Surveillance, Postmarketing , United States
ABSTRACT
A disruptive physician can alienate staff, drive away patients, and even land your organization in a lawsuit. Consider some practical advice on how to identify and deal with disruptive physicians.
Subjects
Aggression , Employee Discipline , Institutional Practice/standards , Interprofessional Relations , Physician Impairment/psychology , Physician-Patient Relations , Communication , Documentation , Humans , Liability, Legal , Morale , Negotiating , Organizational Policy , United States
ABSTRACT
The energy of combination of crystalline boron in gaseous fluorine was measured in a bomb calorimeter. The experimental data combined with reasonable estimates of all known errors may be expressed by the equation:
$$\mathrm{B(c)} + \tfrac{3}{2}\,\mathrm{F_2(g)} = \mathrm{BF_3(g)}, \qquad \Delta H_{f\,298}^{\circ} = -271.03 \pm 0.51\ \mathrm{kcal\ mol^{-1}}$$
This result is compared with other recent work on and related to the heat of formation of boron trifluoride.
ABSTRACT
The energies of combustion of AlB₂ and α-AlB₁₂ were measured in a bomb calorimeter using fluorine as the oxidant. Major problems of this investigation were the assessment of the state and distribution of impurities in the samples and the establishment of the stoichiometry of the aluminum boride phase. We obtain −16 ± 3 kcal mol⁻¹ and −48 ± 10 kcal mol⁻¹ for the heats of formation of AlB₂ and α-AlB₁₂, respectively. The uncertainties cited are the overall experimental errors. Their magnitudes are chiefly due to uncertainties in the impurity correction applied and the uncertainties in the heats of formation of the combustion products.
ABSTRACT
The heats of the following reactions were measured directly in an electrically calibrated flame calorimeter operated at 1 atm pressure and 303 K.
$$\mathrm{OF_2(g) + 2\,H_2(g) + 99\,H_2O(l) \rightarrow 2\,[HF \cdot 50\,H_2O](l)}$$
$$\mathrm{F_2(g) + H_2(g) + 100\,H_2O(l) \rightarrow 2\,[HF \cdot 50\,H_2O](l)}$$
$$\mathrm{\tfrac{1}{2}\,O_2(g) + H_2(g) \rightarrow H_2O(l)}$$
The reactants and products were analyzed for each of the reactions. From these heats we calculated the corresponding heats of formation, as follows:
$$\mathrm{OF_2(g)}: \quad \Delta H_{f\,298.15}^{\circ} = +24.52 \pm 1.59\ \mathrm{kJ\ mol^{-1}} \quad (+5.86 \pm 0.38\ \mathrm{kcal\ mol^{-1}})$$
$$\mathrm{HF \cdot 50\,H_2O(l)}: \quad \Delta H_{f\,298.15}^{\circ} = -320.83 \pm 0.38\ \mathrm{kJ\ mol^{-1}} \quad (-76.68 \pm 0.09\ \mathrm{kcal\ mol^{-1}})$$
$$\mathrm{H_2O(l)}: \quad \Delta H_{f\,298.15}^{\circ} = -285.85 \pm 0.33\ \mathrm{kJ\ mol^{-1}} \quad (-68.32 \pm 0.08\ \mathrm{kcal\ mol^{-1}})$$
The uncertainties indicated are the estimates of the overall experimental errors. The value of the average O-F bond energy in OF₂ was calculated to be 191.29 kJ mol⁻¹ (45.72 kcal mol⁻¹).
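The step from the three measured reaction heats to the three heats of formation follows from Hess's law. The derivation below is a sketch under stated assumptions: the symbols $\Delta H_1$, $\Delta H_2$ and $\Delta H_3$ (for the first, second and third reactions listed above) are introduced here for illustration, and the correction from the 303 K operating temperature to 298.15 K, which the authors would have had to apply, is ignored.

```latex
% Assumed notation: \Delta H_1, \Delta H_2, \Delta H_3 are the heats of the
% three reactions in the order listed; temperature corrections are ignored.
\begin{align*}
\text{(2)} + \text{(3)} - \text{(1)}:\quad
  & \mathrm{F_2(g)} + \tfrac{1}{2}\,\mathrm{O_2(g)} \rightarrow \mathrm{OF_2(g)}
  && \Delta H_f^{\circ}[\mathrm{OF_2(g)}] = \Delta H_2 + \Delta H_3 - \Delta H_1 \\
\tfrac{1}{2}\,\text{(2)}:\quad
  & \tfrac{1}{2}\,\mathrm{H_2(g)} + \tfrac{1}{2}\,\mathrm{F_2(g)} + 50\,\mathrm{H_2O(l)}
    \rightarrow \mathrm{HF \cdot 50\,H_2O(l)}
  && \Delta H_f^{\circ}[\mathrm{HF \cdot 50\,H_2O(l)}] = \tfrac{1}{2}\,\Delta H_2 \\
\text{(3)}:\quad
  & \mathrm{H_2(g)} + \tfrac{1}{2}\,\mathrm{O_2(g)} \rightarrow \mathrm{H_2O(l)}
  && \Delta H_f^{\circ}[\mathrm{H_2O(l)}] = \Delta H_3
\end{align*}
```

In the first combination the hydrogen, the 100 mol of liquid water and the HF solution cancel on both sides, leaving only the formation of OF₂ from its elements. The kcal values quoted in parentheses are consistent with the conversion 1 kcal = 4.184 kJ; for example, 24.52 / 4.184 ≈ 5.86.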