1.
Nucleic Acids Res ; 52(D1): D791-D797, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37953409

ABSTRACT

UNITE (https://unite.ut.ee) is a web-based database and sequence management environment for molecular identification of eukaryotes. It targets the nuclear ribosomal internal transcribed spacer (ITS) region and offers nearly 10 million such sequences for reference. These are clustered into ∼2.4M species hypotheses (SHs), each assigned a unique digital object identifier (DOI) to promote unambiguous referencing across studies. UNITE users have contributed over 600 000 third-party sequence annotations, which are shared with a range of databases and other community resources. Recent improvements facilitate the detection of cross-kingdom biological associations and the integration of undescribed groups of organisms into everyday biological pursuits. Serving as a digital twin for eukaryotic biodiversity and communities worldwide, the latest release of UNITE offers improved avenues for biodiversity discovery, precise taxonomic communication and integration of biological knowledge across platforms.


Subject(s)
Databases, Nucleic Acid , Fungi , DNA, Ribosomal Spacer , Fungi/genetics , Biodiversity , DNA, Fungal , Phylogeny
2.
Bioinformatics ; 38(6): 1727-1728, 2022 03 04.
Article in English | MEDLINE | ID: mdl-34951622

ABSTRACT

SUMMARY: Comparing genomic loci of a given bacterial gene across strains and species can provide insights into their evolution, including information on e.g. acquired mobility, the degree of conservation between different taxa or indications of horizontal gene transfer events. While thousands of bacterial genomes are available to date, there is no software that facilitates comparisons of individual gene loci for a large number of genomes. GEnView (Genetic Environment View) is a Python-based pipeline for the comparative analysis of gene loci in a large number of bacterial genomes. It provides users with automated, taxon-selective access to the >800,000 genomes and plasmids currently available in the NCBI Assembly and RefSeq databases, and can also process local genomes that are not deposited at NCBI. Users can search for genomic sequences and analyze their genetic environments through the interactive visualization and extensive metadata files created by GEnView. AVAILABILITY AND IMPLEMENTATION: GEnView is implemented in Python 3. Instructions for download and usage can be found at https://github.com/EbmeyerSt/GEnView under GPL3. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Phylogeny , Genome, Bacterial , Plasmids/genetics
3.
Proc Natl Acad Sci U S A ; 117(35): 21403-21412, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32817418

ABSTRACT

The early steps of DNA double-strand break (DSB) repair in human cells involve the MRE11-RAD50-NBS1 (MRN) complex and its cofactor, phosphorylated CtIP. The roles of these proteins in nucleolytic DSB resection are well characterized, but their role in bridging the DNA ends for efficient and correct repair is much less explored. Here we study the binding of phosphorylated CtIP, which promotes the endonuclease activity of MRN, to single long (∼50 kb) DNA molecules using nanofluidic channels and compare it to the yeast homolog Sae2. CtIP bridges DNA in a manner that depends on the oligomeric state of the protein, and truncated mutants demonstrate that the bridging depends on CtIP regions distinct from those that stimulate the nuclease activity of MRN. Sae2 is a much smaller protein than CtIP, and its bridging is significantly less efficient. Our results demonstrate that the nuclease cofactor and structural functions of CtIP may depend on the same protein population, which may be crucial for CtIP functions in both homologous recombination and microhomology-mediated end-joining.


Subject(s)
DNA Breaks, Double-Stranded , DNA, Circular/metabolism , Endodeoxyribonucleases/metabolism , Animals , Endonucleases/metabolism , Humans , Nanotechnology , Phosphorylation , Saccharomyces cerevisiae Proteins/metabolism , Saccharomycetales , Sf9 Cells , Spodoptera
4.
PLoS Genet ; 16(6): e1008803, 2020 06.
Article in English | MEDLINE | ID: mdl-32511227

ABSTRACT

Identification of additional cancer-associated genes and secondary mutations driving the metastatic progression in pheochromocytoma and paraganglioma (PPGL) is important for subtyping, and may provide optimization of therapeutic regimens. We recently reported novel recurrent nonsynonymous mutations in the MYO5B gene in metastatic PPGL. Here, we explored the functional impact of these MYO5B mutations, and analyzed MYO5B expression in primary PPGL tumor cases in relation to mutation status. Immunohistochemistry and mRNA expression analysis in 30 PPGL tumors revealed an increased MYO5B expression in metastatic compared to non-metastatic cases. In addition, subcellular localization of MYO5B protein was altered from cytoplasmic to membranous in some metastatic tumors, and the strongest and most abnormal expression pattern was observed in a paraganglioma harboring a somatic MYO5B:p.G1611S mutation. In addition to five previously discovered MYO5B mutations, the present study of 30 PPGL (8 previous and 22 new samples) also revealed two, and hence recurrent, mutations in the gene paralog MYO5A. The three MYO5B missense mutations with the highest prediction scores (p.L587P, p.G1611S and p.R1641C) were selected and functionally validated using site-directed mutagenesis and stable transfection into human neuroblastoma cells (SK-N-AS) and embryonic kidney cells (HEK293). In vitro analysis showed a significantly increased proliferation rate in all three MYO5B mutated clones. The two somatically derived mutations, p.L587P and p.G1611S, were also found to increase the migration rate. Expression analysis of MYO5B mutants compared to wild type clones demonstrated a significant enrichment of genes involved in migration, proliferation, cell adhesion, glucose metabolism, and cellular homeostasis. Our study validates the functional role of novel MYO5B mutations in proliferation and migration, and suggests the MYO5-pathway to be involved in the malignant progression in some PPGL tumors.


Subject(s)
Adrenal Gland Neoplasms/genetics , Biomarkers, Tumor/genetics , Mutation, Missense , Myosin Heavy Chains/genetics , Myosin Type V/genetics , Pheochromocytoma/genetics , Adrenal Gland Neoplasms/pathology , Cell Line, Tumor , Cell Movement , Cell Proliferation , Female , HEK293 Cells , Humans , Male , Neoplasm Metastasis , Pheochromocytoma/pathology
5.
Mol Cell Proteomics ; 19(3): 518-528, 2020 03.
Article in English | MEDLINE | ID: mdl-31941798

ABSTRACT

Mass spectrometry (MS) and proteomics offer comprehensive characterization and identification of microorganisms and discovery of protein biomarkers that are applicable for diagnostics of infectious diseases. The use of biomarkers for diagnostics is widely applied in the clinic and the use of peptide biomarkers is increasingly being investigated for applications in the clinical laboratory. Respiratory-tract infections are a predominant cause for medical treatment, although clinical assessments and standard clinical laboratory protocols are time-consuming and often inadequate for reliable diagnoses. Novel methods, preferably applied directly to clinical samples, excluding cultivation steps, are needed to improve diagnostics of infectious diseases, provide adequate treatment and reduce the use of antibiotics and associated development of antibiotic resistance. This study applied nano-liquid chromatography (LC) coupled with tandem MS, with a bioinformatics pipeline and an in-house database of curated high-quality reference genome sequences to identify species-unique peptides as potential biomarkers for four bacterial pathogens commonly found in respiratory tract infections (RTIs): Staphylococcus aureus; Moraxella catarrhalis; Haemophilus influenzae and Streptococcus pneumoniae. The species-unique peptides were initially identified in pure cultures of bacterial reference strains, reflecting the genomic variation in the four species and, furthermore, in clinical respiratory tract samples, without prior cultivation, elucidating proteins expressed in clinical conditions of infection. For each of the four bacterial pathogens, the peptide biomarker candidates most predominantly found in clinical samples are presented. Data are available via ProteomeXchange with identifier PXD014522. As proof-of-principle, the most promising species-unique peptides were applied in targeted tandem MS-analyses of clinical samples and their relevance for identifications of the pathogens, i.e. proteotyping, was validated, thus demonstrating their potential as peptide biomarker candidates for diagnostics of infectious diseases.


Subject(s)
Bacterial Proteins/metabolism , Haemophilus influenzae/metabolism , Moraxella catarrhalis/metabolism , Peptides/metabolism , Staphylococcus aureus/metabolism , Streptococcus pneumoniae/metabolism , Biomarkers/metabolism , Haemophilus influenzae/isolation & purification , Humans , Moraxella catarrhalis/isolation & purification , Respiratory System/microbiology , Respiratory Tract Infections/microbiology , Species Specificity , Staphylococcus aureus/isolation & purification , Streptococcus pneumoniae/isolation & purification , Tandem Mass Spectrometry
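
The notion of a "species-unique" peptide in the abstract above can be illustrated with plain set operations. The following is a minimal sketch, not the study's pipeline: the function name `species_unique_peptides` and the toy peptide strings are hypothetical, and it assumes peptides are compared by exact sequence match.

```python
def species_unique_peptides(peptides_by_species):
    """Return, per species, the peptides seen in that species and in no other.

    peptides_by_species: dict mapping a species name to an iterable of
    peptide sequences (hypothetical input layout, exact string matching).
    """
    unique = {}
    for species, peptides in peptides_by_species.items():
        # Pool the peptides of every other species.
        others = set().union(
            *(p for s, p in peptides_by_species.items() if s != species)
        )
        unique[species] = set(peptides) - others
    return unique


# Toy example with made-up peptide strings.
toy = {
    "species_A": {"PEPTIDEK", "SHAREDK"},
    "species_B": {"SHAREDK", "ANOTHERR"},
}
result = species_unique_peptides(toy)
```

In this toy input, `SHAREDK` occurs in both species and is therefore discarded as a candidate for either.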
6.
J Antimicrob Chemother ; 76(1): 117-123, 2021 01 01.
Article in English | MEDLINE | ID: mdl-33005957

ABSTRACT

BACKGROUND: Metallo-β-lactamases (MBLs) are enzymes that use zinc-dependent hydrolysis to confer resistance to almost all available β-lactam antibiotics. They are hypothesized to originate from commensal and environmental bacteria, from where some have mobilized and transferred horizontally to pathogens. The current phylogeny of MBLs, however, is biased as it is founded largely on genes encountered in pathogenic bacteria. This incompleteness is emphasized by recent findings of environmental MBLs with new forms of zinc binding sites and atypical functional profiles. OBJECTIVES: To expand the phylogeny of MBLs to provide a more accurate view of their evolutionary history. METHODS: We searched more than 16 terabases of genomic and metagenomic data for MBLs of the three subclasses B1, B2 and B3 using the validated fARGene method. Predicted genes, together with the previously known ones, were used to infer phylogenetic trees. RESULTS: We identified 2290 unique MBL genes forming 817 gene families, of which 741 were previously uncharacterized. MBLs from subclasses B1 and B3 separated into distinct monophyletic groups, in agreement with their taxonomic and functional properties. We present evidence that clinically associated MBLs were mobilized from Proteobacteria. Additionally, we identified three new variants of the zinc binding sites, indicating that the functional repertoire is broader than previously reported. CONCLUSIONS: Based on our results, we recommend that the nomenclature of MBLs is refined into the phylogenetic groups B1.1-B1.5 and B3.1-B3.4 that more accurately describe their molecular and functional characteristics. Our results will also facilitate the annotation of novel MBLs, reflecting their taxonomic organization and evolutionary origin.


Subject(s)
Metagenomics , beta-Lactamases , Anti-Bacterial Agents , Bacteria/genetics , Bacteria/metabolism , Binding Sites , Humans , Phylogeny , beta-Lactamases/genetics , beta-Lactamases/metabolism
7.
BMC Cancer ; 21(1): 101, 2021 Jan 28.
Article in English | MEDLINE | ID: mdl-33509126

ABSTRACT

BACKGROUND: Patients with small intestinal neuroendocrine tumors (SINETs) frequently present with lymph node and liver metastases at the time of diagnosis, but the molecular changes that lead to the progression of these tumors are largely unknown. Sequencing studies have only identified recurrent point mutations at low frequencies with CDKN1B being the most common harboring heterozygous mutations in less than 10% of all tumors. Although SINETs are genetically stable tumors with a low frequency of point mutations and indels, they often harbor recurrent hemizygous copy number alterations (CNAs) yet the functional implications of these CNAs are unclear. METHODS: Utilizing comparative genomic hybridization (CGH) arrays we analyzed the CNA profile of 131 SINETs from 117 patients. Two tumor suppressor genes and corresponding proteins, i.e. SMAD4 and CDKN1B, were further characterized using a tissue microarray (TMA) with 846 SINETs. Immunohistochemistry (IHC) was used to quantify protein expression in TMA samples and this was correlated with chromosome number evaluated with fluorescent in-situ hybridization (FISH). Intestinal tissue from a Smad4+/- mouse model was used to detect entero-endocrine cell hyperplasia with IHC. RESULTS: Analyzing the CGH arrays we found loss of chromosome 18q and SMAD4 in 71% of SINETs and that focal loss of chromosome 12 affecting CDKN1B was present in 9.4% of SINETs. No homozygous loss of chromosome 18 was detected. Hemizygous loss of SMAD4, but not CDKN1B, significantly correlated with reduced protein levels but hemizygous loss of SMAD4 did not induce entero-endocrine cell hyperplasia in the Smad4+/- mouse model. In addition, patients with low SMAD4 protein expression in primary tumors more often presented with metastatic disease. CONCLUSIONS: Hemizygous loss of chromosome 18q and the SMAD4 gene is the most common genetic event in SINETs and our results suggest that this could influence SMAD4 protein expression and spread of metastases. Although SMAD4 haploinsufficiency alone did not induce tumor initiation, loss of chromosome 18 could represent an evolutionary advantage in SINETs explaining the high prevalence of this aberration. Functional consequences of reduced SMAD4 protein levels could hypothetically be a potential mechanism as to why loss of chromosome 18 appears to be clonally selected in SINETs.


Subject(s)
Biomarkers, Tumor/genetics , Cyclin-Dependent Kinase Inhibitor p27/genetics , Gene Expression Regulation, Neoplastic , Intestinal Neoplasms/genetics , Mutation , Neuroendocrine Tumors/genetics , Smad4 Protein/genetics , Follow-Up Studies , Haploinsufficiency , Humans , Intestinal Neoplasms/pathology , Neuroendocrine Tumors/pathology , Prognosis
8.
BMC Genomics ; 21(1): 495, 2020 Jul 20.
Article in English | MEDLINE | ID: mdl-32689930

ABSTRACT

BACKGROUND: Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. Previous studies have started to characterize the wide range of biological functions present in integrons. However, the efforts have so far mainly been limited to genomes from cultivable bacteria and amplicons generated by PCR, thus targeting only a small part of the total integron diversity. Metagenomic data, generated by direct sequencing of environmental and clinical samples, provides a more holistic and unbiased analysis of integron-associated genes. However, the fragmented nature of metagenomic data has previously made such analysis highly challenging. RESULTS: Here, we present a systematic survey of integron-associated genes in metagenomic data. The analysis was based on a newly developed computational method where integron-associated genes were identified by detecting their associated recombination sites. By processing contiguous sequences assembled from more than 10 terabases of metagenomic data, we were able to identify 13,397 unique integron-associated genes. Metagenomes from marine microbial communities had the highest occurrence of integron-associated genes with levels more than 100-fold higher than in the human microbiome. The identified genes had a large functional diversity spanning over several functional classes. Genes associated with defense mechanisms and mobility facilitators were most overrepresented and more than five times as common in integrons compared to other bacterial genes. As many as two thirds of the genes were found to encode proteins of unknown function. Less than 1% of the genes were associated with antibiotic resistance, of which several were novel, previously undescribed, resistance gene variants. CONCLUSIONS: Our results highlight the large functional diversity maintained by integrons present in unculturable bacteria and significantly expand the number of described integron-associated genes.


Subject(s)
Integrons , Metagenome , Bacteria/genetics , Gene Transfer, Horizontal , Genes, Bacterial , Humans , Integrons/genetics
9.
J Antimicrob Chemother ; 75(9): 2554-2563, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32464640

ABSTRACT

BACKGROUND: MBLs form a large and heterogeneous group of bacterial enzymes conferring resistance to β-lactam antibiotics, including carbapenems. A large environmental reservoir of MBLs has been identified, which can act as a source for transfer into human pathogens. Therefore, structural investigation of environmental and clinically rare MBLs can give new insights into structure-activity relationships to explore the role of catalytic and second shell residues, which are under selective pressure. OBJECTIVES: To investigate the structure and activity of the environmental subclass B1 MBLs MYO-1, SHD-1 and ECV-1. METHODS: The respective genes of these MBLs were cloned into vectors and expressed in Escherichia coli. Purified enzymes were characterized with respect to their catalytic efficiency (kcat/Km). The enzymatic activities and MICs were determined for a panel of different β-lactams, including penicillins, cephalosporins and carbapenems. Thermostability was measured and structures were solved using X-ray crystallography (MYO-1 and ECV-1) or generated by homology modelling (SHD-1). RESULTS: Expression of the environmental MBLs in E. coli resulted in the characteristic MBL profile, not affecting aztreonam susceptibility and decreasing susceptibility to carbapenems, cephalosporins and penicillins. The purified enzymes showed variable catalytic activity in the order of <5% to ∼70% compared with the clinically widespread NDM-1. The thermostability of ECV-1 and SHD-1 was up to 8°C higher than that of MYO-1 and NDM-1. Using solved structures and molecular modelling, we identified differences in their second shell composition, possibly responsible for their relatively low hydrolytic activity. CONCLUSIONS: These results show the importance of environmental species acting as reservoirs for MBL-encoding genes.


Subject(s)
Escherichia coli , beta-Lactamases , Anti-Bacterial Agents/pharmacology , Carbapenems , Escherichia coli/genetics , Humans , Microbial Sensitivity Tests , beta-Lactamases/genetics
10.
Nucleic Acids Res ; 46(D1): D930-D936, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140522

ABSTRACT

Pharmaceuticals are designed to interact with specific molecular targets in humans and these targets generally have orthologs in other species. This provides opportunities for the drug discovery community to use alternative model species for drug development. It also means, however, there is potential for mode of action related effects in non-target wildlife species as many pharmaceuticals reach the environment through patient use and manufacturing wastes. Acquiring insight in drug target ortholog predictions across species and taxonomic groups has proven difficult because of the lack of an optimal strategy and because necessary information is spread across multiple and diverse sources and platforms. We introduce a new research platform tool, ECOdrug, that reliably connects drugs to their protein targets across divergent species. It harmonizes ortholog predictions from multiple sources via a simple user interface underpinning critical applications for a wide range of studies in pharmacology, ecotoxicology and comparative evolutionary biology. ECOdrug can be used to identify species with drug targets and identify drugs that interact with those targets. As such, it can be applied to support intelligent targeted drug safety testing by ensuring appropriate and relevant species are selected in ecological risk assessments. ECOdrug is freely accessible and available at: http://www.ecodrug.org.


Subject(s)
Antineoplastic Agents/pharmacology , Databases, Pharmaceutical , Drug Discovery , Molecular Targeted Therapy , Neoplasm Proteins/antagonists & inhibitors , Neoplasms/genetics , RNA, Neoplasm/genetics , Amino Acid Sequence , Animals , Antineoplastic Agents/adverse effects , Antineoplastic Agents/therapeutic use , Conservation of Natural Resources , Conserved Sequence , Data Collection , Data Display , Drug Delivery Systems , Drug Evaluation, Preclinical , Fishes/genetics , Forecasting , Humans , Invertebrates/genetics , Mammals/genetics , Neoplasm Proteins/chemistry , Neoplasms/drug therapy , Risk Assessment , Species Specificity , User-Computer Interface
11.
J Antimicrob Chemother ; 74(5): 1202-1206, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30753583

ABSTRACT

OBJECTIVES: To investigate the origin of CMY-1/MOX-family β-lactamases. METHODS: Publicly available genome assemblies were screened for CMY-1/MOX genes. The loci of CMY-1/MOX genes were compared with respect to synteny and nucleotide identity, and subjected to phylogenetic analysis. RESULTS: The chromosomal ampC genes of several Aeromonas species were highly similar to known mobile CMY-1/MOX variants. Annotation and sequence comparison revealed nucleotide identities >98% and conserved syntenies between MOX-1-, MOX-2- and MOX-9-associated mobile sequences and the chromosomal Aeromonas sanarellii, Aeromonas caviae and Aeromonas media ampC loci. Furthermore, the phylogenetic analysis showed that MOX-1, MOX-2 and MOX-9 formed three distinct monophyletic groups with the chromosomal ampC genes of A. sanarellii, A. caviae and A. media, respectively. CONCLUSIONS: Our findings show that three CMY-1/MOX-family β-lactamases were mobilized independently from three Aeromonas species and hence shine new light on the evolution and emergence of mobile antibiotic resistance genes.


Subject(s)
Aeromonas/classification , Aeromonas/genetics , Bacterial Proteins/genetics , Multigene Family , beta-Lactamases/genetics , Aeromonas/enzymology , Bacterial Proteins/metabolism , Gene Order , Genetic Loci , Humans , Open Reading Frames , Phylogeny , beta-Lactamases/metabolism
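
The ">98% nucleotide identity" figure in the abstract above is a pairwise measure over aligned sequences. The following is a rough sketch of one common way to compute it, assuming the two sequences have already been aligned to equal length with `-` as the gap character and that gapped columns are excluded from the denominator. Identity definitions differ between tools, and the function name `percent_identity` is hypothetical, not taken from the study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap alignment columns where both sequences agree.

    Assumes pre-aligned, equal-length input with '-' marking gaps;
    columns containing a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns under this definition
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences: 7 of 8 columns match.
identity = percent_identity("ACGTACGT", "ACGTACGA")
```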
12.
Environ Sci Technol ; 53(23): 13898-13905, 2019 Dec 03.
Article in English | MEDLINE | ID: mdl-31713420

ABSTRACT

Airplane sanitary facilities are shared by an international audience. We hypothesized the corresponding sewage to be an extraordinary source of antibiotic-resistant bacteria (ARB) and resistance genes (ARG) in terms of diversity and quantity. Accordingly, we analyzed ARG and ARB in airplane-borne sewage using complementary approaches: metagenomics, quantitative polymerase chain reaction (qPCR), and cultivation. For the purpose of comparison, we also quantified ARG and ARB in the inlets of municipal treatment plants with and without connection to airports. As expected, airplane sewage contained an extraordinarily rich set of mobile ARG, and the relative abundances of genes were mostly increased compared to typical raw sewage of municipal origin. Moreover, combined resistance against third-generation cephalosporins, fluoroquinolones, and aminoglycosides was unusually common (28.9%) among Escherichia coli isolated from airplane sewage. This percentage exceeds the one reported for German clinical isolates by a factor of 8. Our findings suggest that airplane-borne sewage can effectively contribute to the fast and global spread of antibiotic resistance.


Subject(s)
Anti-Bacterial Agents , Sewage , Aircraft , Drug Resistance, Microbial , Genes, Bacterial
13.
Mol Cell Proteomics ; 16(6): 1052-1063, 2017 06.
Article in English | MEDLINE | ID: mdl-28420677

ABSTRACT

Methods for rapid and reliable microbial identification are essential in modern healthcare. The ability to detect and correctly identify pathogenic species and their resistance phenotype is necessary for accurate diagnosis and efficient treatment of infectious diseases. Bottom-up tandem mass spectrometry (MS) proteomics enables rapid characterization of large parts of the expressed genes of microorganisms. However, the generated data are highly fragmented, making downstream analyses complex. Here we present TCUP, a new computational method for typing and characterizing bacteria using proteomics data from bottom-up tandem MS. TCUP compares the generated protein sequence data to reference databases and automatically finds peptides suitable for characterization of taxonomic composition and identification of expressed antimicrobial resistance genes. TCUP was evaluated using several clinically relevant bacterial species (Escherichia coli, Pseudomonas aeruginosa, Staphylococcus aureus, Streptococcus pneumoniae, Moraxella catarrhalis, and Haemophilus influenzae), using both simulated data generated by in silico peptide digestion and experimental proteomics data generated by liquid chromatography-tandem mass spectrometry (MS/MS). The results showed that TCUP performs correct peptide classifications at rates between 90.3 and 98.5% at the species level. The method was also able to estimate the relative abundances of individual species in mixed cultures. Furthermore, TCUP could identify expressed β-lactamases in an extended spectrum β-lactamase-producing (ESBL) E. coli strain, even when the strain was cultivated in the absence of antibiotics. Finally, TCUP is computationally efficient, easy to integrate in existing bioinformatics workflows, and freely available under an open source license for both Windows and Linux environments.


Subject(s)
Bacteria/classification , Bacteria/metabolism , Bacterial Proteins/metabolism , Peptides/metabolism , Proteomics/methods , Tandem Mass Spectrometry/methods , Anti-Bacterial Agents/pharmacology , Bacteria/drug effects , Bacteria/genetics , Cefotaxime/pharmacology , Drug Resistance, Bacterial , Genome, Bacterial
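
The abstract above mentions simulated data generated by in silico peptide digestion. A minimal sketch of what such a step can look like is given here, assuming the commonly used trypsin rule (cleave after K or R unless the next residue is P). The abstract does not state which enzyme or rule was used, so this is an illustration rather than TCUP's implementation; `tryptic_digest` and `min_length` are hypothetical names.

```python
import re


def tryptic_digest(protein, min_length=1):
    """Split a protein sequence into peptides using a simple trypsin rule.

    Assumed rule: cleave C-terminal to K or R, except when followed by P.
    Missed cleavages are not modelled. Requires Python 3.7+ because it
    relies on re.split handling zero-width matches.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    # Drop the empty trailing piece (sequence ending in K/R) and short peptides.
    return [p for p in pieces if len(p) >= max(min_length, 1)]


# Toy sequence: the R at position 8 is followed by P, so it is not cleaved.
peptides = tryptic_digest("MAGKTLERPSVKDE")
```

Real digestion simulators usually also allow a configurable number of missed cleavages and filter on peptide mass, both of which are omitted here for brevity.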
14.
BMC Genomics ; 19(1): 274, 2018 Apr 20.
Article in English | MEDLINE | ID: mdl-29678163

ABSTRACT

BACKGROUND: In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data is affected by high levels of systematic variability, which can greatly reduce the statistical power and introduce false positives. Normalization, which is the process where systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed but their performance on the analysis of shotgun metagenomic data has not been evaluated. RESULTS: Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of the methods' ability to identify differentially abundant genes (DAGs), correctly calculate unbiased p-values and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, CSS also showed satisfactory performance. CONCLUSIONS: This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead to incorrect or obfuscated biological interpretation.


Subject(s)
Data Analysis , Metagenomics
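
To make the idea of RLE normalization in the abstract above more concrete, here is a minimal sketch of the median-of-ratios calculation that RLE is usually described by. It is written in plain Python under simplifying assumptions (genes with a zero count in any sample are left out of the reference) and is not the evaluation code of the study; `rle_size_factors` and `normalize_counts` are hypothetical names.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """Estimate one scaling factor per sample by the median-of-ratios idea.

    counts: list of genes, each a list of raw counts per sample.
    Only genes with nonzero counts in every sample contribute, so that
    the per-gene geometric mean is defined (a simplifying assumption).
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has nonzero counts in all samples")
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize_counts(counts, factors):
    """Divide every count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
toy_factors = rle_size_factors(toy_counts)
toy_normalized = normalize_counts(toy_counts, toy_factors)
```

In the toy data the second sample's factor comes out twice as large as the first's, so the first three genes have equal normalized abundance in both samples and only the last gene still differs.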
15.
Mod Pathol ; 31(8): 1302-1317, 2018 08.
Article in English | MEDLINE | ID: mdl-29487354

ABSTRACT

The aim of this study was to define the miRNA profile of small intestinal neuroendocrine tumors and to search for novel molecular subgroups and prognostic biomarkers. miRNA profiling was conducted on 42 tumors from 37 patients who underwent surgery for small intestinal neuroendocrine tumors. Unsupervised hierarchical clustering analysis of miRNA profiles identified two groups of tumor metastases, denoted cluster M1 and M2. The smaller cluster M1 was associated with shorter overall survival and contained tumors with higher grade (WHO grade G2/3) and multiple chromosomal gains including gain of chromosome 14. Tumors of cluster M1 had elevated expression of miR-1246 and miR-663a, and reduced levels of miR-488-3p. Pathway analysis predicted Wnt signaling to be the most significantly altered signaling pathway between clusters M1 and M2. Analysis of miRNA expression in relation to tumor proliferation rate showed significant alterations including downregulation of miR-137 and miR-204-5p in tumors with Ki67 index above 3%. Similarly, tumor progression was associated with significant alterations in miRNA expression, e.g. higher expression of miR-95 and miR-210, and lower expression of miR-378a-3p in metastases. Pathway analysis predicted Wnt signaling to be altered during tumor progression, which was supported by decreased nuclear translocation of β-catenin in metastases. Survival analysis revealed that downregulation of miR-375 was associated with shorter overall survival. We performed in situ hybridization on biopsies from an independent cohort of small intestinal neuroendocrine tumors using tissue microarrays. Expression of miR-375 was found in 578/635 (91%) biopsies and survival analysis confirmed that there was a correlation between downregulation of miR-375 in tumor metastases and shorter patient survival. We conclude that miRNA profiling defines novel molecular subgroups of metastatic small intestinal neuroendocrine tumors and identifies miRNAs associated with tumor proliferation rate and progression. miR-375 is highly expressed in small intestinal neuroendocrine tumors and may be used as a prognostic biomarker.


Subject(s)
Biomarkers, Tumor/genetics , Intestinal Neoplasms/genetics , MicroRNAs/biosynthesis , Neuroendocrine Tumors/genetics , Adult , Aged , Aged, 80 and over , Female , Gene Expression Profiling , Humans , Intestinal Neoplasms/mortality , Intestine, Small/pathology , Kaplan-Meier Estimate , Male , Middle Aged , Neuroendocrine Tumors/mortality
16.
BMC Genomics ; 18(1): 316, 2017 04 21.
Article in English | MEDLINE | ID: mdl-28431529

ABSTRACT

BACKGROUND: Gene-centric analysis of metagenomics data provides information about the biochemical functions present in a microbiome under a certain condition. The ability to identify significant differences in functions between metagenomes is dependent on accurate classification and quantification of the sequence reads (binning). However, biological effects acting on specific functions may be overlooked if the classes are too general. METHODS: Here we introduce High-Resolution Binning (HirBin), a new method for gene-centric analysis of metagenomes. HirBin combines supervised annotation with unsupervised clustering to bin sequence reads at a higher resolution. The supervised annotation is performed by matching sequence fragments to genes using well-established protein domains, such as TIGRFAM, PFAM or COGs, followed by unsupervised clustering where each functional domain is further divided into sub-bins based on sequence similarity. Finally, differential abundance of the sub-bins is statistically assessed. RESULTS: We show that HirBin is able to identify biological effects that are only present at more specific functional levels. Furthermore, we show that changes affecting more specific functional levels are often diluted at the more general level and therefore overlooked when analyzed using standard binning approaches. CONCLUSIONS: HirBin improves the resolution of the gene-centric analysis of metagenomes and facilitates the biological interpretation of the results. HirBin is implemented as a Python package and is freely available for download at http://bioinformatics.math.chalmers.se/hirbin.


Subject(s)
Metagenomics/methods , Algorithms , Cluster Analysis , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , High-Throughput Nucleotide Sequencing , Humans , Internet , Intestines/microbiology , Male , Microbiota , User-Computer Interface
17.
BMC Genomics ; 18(1): 682, 2017 Sep 02.
Article in English | MEDLINE | ID: mdl-28865446

ABSTRACT

BACKGROUND: Fluoroquinolones are broad-spectrum antibiotics used to prevent and treat a wide range of bacterial infections. Plasmid-mediated qnr genes provide resistance to fluoroquinolones in many bacterial species and are increasingly encountered in clinical settings. Over the last decade, several families of qnr genes have been discovered and characterized, but their true prevalence and diversity still remain unclear. In particular, environmental and host-associated bacterial communities have been hypothesized to maintain a large and unknown collection of qnr genes that could be mobilized into pathogens. RESULTS: In this study we used computational methods to screen genomes and metagenomes for novel qnr genes. In contrast to previous studies, we analyzed an almost 20-fold larger dataset comprising almost 13 terabases of sequence data. In total, 362,843 potential qnr gene fragments were identified, from which 611 putative qnr genes were reconstructed. These gene sequences included all previously described plasmid-mediated qnr gene families. Fifty-two of the 611 identified qnr genes were reconstructed from metagenomes, and 20 of these were previously undescribed. All of the novel qnr genes were assembled from metagenomes associated with aquatic environments. Nine of the novel genes were selected for validation, and six of the tested genes conferred consistently decreased susceptibility to ciprofloxacin when expressed in Escherichia coli. CONCLUSIONS: The results presented in this study provide additional evidence for the ubiquitous presence of qnr genes in environmental microbial communities, expand the number of known qnr gene variants and further elucidate the diversity of this class of resistance genes. This study also strengthens the hypothesis that environmental bacterial communities act as sources of previously uncharacterized qnr genes.


Subject(s)
Databases, Genetic , Drug Resistance, Bacterial/genetics , Fluoroquinolones/pharmacology , Metagenomics , Humans
18.
J Antimicrob Chemother ; 72(10): 2690-2703, 2017 Oct 01.
Article in English | MEDLINE | ID: mdl-28673041

ABSTRACT

Antibiotic resistance is a global health concern declared by the WHO as one of the largest threats to modern healthcare. In recent years, metagenomic DNA sequencing has started to be applied as a tool to study antibiotic resistance in different environments, including the human microbiota. However, a multitude of methods exist for metagenomic data analysis, and not all methods are suitable for the investigation of resistance genes, particularly if the desired outcome is an assessment of risks to human health. In this review, we outline the current state of methods for sequence handling, mapping to databases of resistance genes, statistical analysis and metagenomic assembly. In addition, we provide an overview of important considerations related to the analysis of resistance genes, and recommend some of the currently used tools and methods that are best equipped to inform research and clinical practice related to antibiotic resistance.


Subject(s)
Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Metagenomics/methods , Microbiota/genetics , Chromosome Mapping/methods , Data Interpretation, Statistical , Databases, Genetic , Gastrointestinal Microbiome/genetics , Genes, Bacterial/drug effects , Geologic Sediments/microbiology , High-Throughput Nucleotide Sequencing/methods , Humans , Lakes/microbiology
19.
Eur J Haematol ; 98(1): 26-37, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27197529

ABSTRACT

Next-generation sequencing techniques have revealed that leukemic cells in acute myeloid leukemia often are characterized by a limited number of somatic mutations. These mutations can be the basis for the detection of leukemic cells in follow-up samples. The aim of this study was to identify leukemia-specific mutations in cells from patients with acute myeloid leukemia and to use these mutations as markers for minimal residual disease. Leukemic cells and normal lymphocytes were simultaneously isolated at diagnosis from 17 patients with acute myeloid leukemia using fluorescence-activated cell sorting. Exome sequencing of these cells identified 240 leukemia-specific single nucleotide variations and 22 small insertions and deletions. Based on estimated allele frequencies and their accuracies, 191 of these mutations qualified as candidates for minimal residual disease analysis. Targeted deep sequencing with a significance threshold of 0.027% for single nucleotide variations and 0.006% for NPM1 type A mutation was developed for quantification of minimal residual disease. When tested on follow-up samples from a patient with acute myeloid leukemia, targeted deep sequencing of single nucleotide variations as well as NPM1 was more sensitive than minimal residual disease quantification with multiparameter flow cytometry. In conclusion, we here describe how exome sequencing can be used for identification of leukemia-specific mutations in samples already at diagnosis of acute myeloid leukemia. We also show that targeted deep sequencing of such mutations, including single nucleotide variations, can be used for high-sensitivity quantification of minimal residual disease in a patient-tailored manner.
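
As a rough illustration of the thresholds quoted in this abstract (0.027% for single nucleotide variations, 0.006% for the NPM1 type A mutation), the sketch below compares an observed variant allele frequency with the corresponding threshold. It is a simplification under stated assumptions: the abstract does not describe how the thresholds were derived or applied, and the function names, the dictionary keys and the plain ratio of variant reads to total depth are hypothetical.

```python
# Thresholds from the abstract, converted from percent to fractions.
THRESHOLDS = {
    "SNV": 0.027 / 100,
    "NPM1_type_A": 0.006 / 100,
}


def variant_allele_frequency(variant_reads: int, depth: int) -> float:
    """Fraction of reads at a position that carry the variant."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return variant_reads / depth


def exceeds_threshold(variant_reads: int, depth: int, kind: str) -> bool:
    """True if the observed frequency lies above the threshold for `kind`."""
    return variant_allele_frequency(variant_reads, depth) > THRESHOLDS[kind]


# Hypothetical read counts, not data from the study.
snv_detected = exceeds_threshold(30, 100_000, "SNV")
snv_not_detected = exceeds_threshold(20, 100_000, "SNV")
npm1_detected = exceeds_threshold(1, 10_000, "NPM1_type_A")
```

With these made-up counts, 30 variant reads in 100 000 (0.030%) lie above the SNV threshold while 20 in 100 000 (0.020%) do not.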


Subject(s)
High-Throughput Nucleotide Sequencing , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/genetics , Neoplasm, Residual/diagnosis , Adolescent , Adult , Aged , Biomarkers, Tumor , Child , Child, Preschool , Chromosome Aberrations , Exome , Female , Genetic Testing , High-Throughput Nucleotide Sequencing/methods , Humans , Immunophenotyping , Male , Middle Aged , Mutation , Nucleophosmin , Polymorphism, Single Nucleotide , Reproducibility of Results , Young Adult
20.
BMC Genomics ; 17: 78, 2016 Jan 25.
Article in English | MEDLINE | ID: mdl-26810311

ABSTRACT

BACKGROUND: Metagenomics is the study of microbial communities by sequencing of genetic material directly from environmental or clinical samples. The genes present in the metagenomes are quantified by annotating and counting the generated DNA fragments. Identification of differentially abundant genes between metagenomes can provide important information about differences in community structure, diversity and biological function. Metagenomic data are, however, high-dimensional, contain high levels of biological and technical noise and typically have few biological replicates. The statistical analysis is therefore challenging and many approaches have been suggested to date. RESULTS: In this article we perform a comprehensive evaluation of 14 methods for identification of differentially abundant genes between metagenomes. The methods are compared based on the power to detect differentially abundant genes and their ability to correctly estimate the type I error rate and the false discovery rate. We show that sample size, effect size, and gene abundance greatly affect the performance of all methods. Several of the methods also show non-optimal model assumptions and biased false discovery rate estimates, which can result in excessive numbers of false positives. We also demonstrate that the performance of several of the methods differs substantially between metagenomic data sequenced by different technologies. CONCLUSIONS: Two methods primarily designed for the analysis of RNA sequencing data (edgeR and DESeq2), together with a generalized linear model based on an overdispersed Poisson distribution, were found to have the best overall performance. The results presented in this study may serve as a guide for selecting suitable statistical methods for identification of differentially abundant genes in metagenomes.
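
The false discovery rate discussed in this abstract is often controlled with the Benjamini-Hochberg procedure. The abstract does not state which procedure the evaluated methods use, so the choice here is an assumption, and the function name and p-values are illustrative only. The sketch computes adjusted p-values from a list of per-gene p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values, not results from the study.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_pvalues)
```

Genes whose adjusted p-value falls below a chosen level (for example 0.05) would be reported as differentially abundant. If a method's raw p-values are miscalibrated, this adjustment inherits the bias, which is the kind of problem the abstract describes.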


Subject(s)
Metagenomics/methods , Metagenome/genetics , Sequence Analysis, RNA , Software