ABSTRACT
Halophilic archaea of the class Halobacteria are the most salt-requiring prokaryotes within the domain Archaea. In 1997, minimal standards for the description of new taxa in the order Halobacteriales were proposed. Since then, the taxonomy of the class Halobacteria has provided an excellent example of how changing concepts in prokaryote taxonomy and the development of new methods were implemented. The last decades have witnessed a rapid expansion in the number of described taxa within the class Halobacteria, coinciding with the development of genome sequencing. The current members of the International Committee on Systematics of Prokaryotes Subcommittee on the Taxonomy of Halobacteria propose these revisions to the recommended minimal standards and encourage the use of advanced technologies in the taxonomic description of members of the Halobacteria. Most previously required and some recommended minimal standards for the description of new taxa in the class Halobacteria were retained in the present revision, but changes have been proposed in line with the new methodologies. In addition to the 16S rRNA gene, the rpoB' gene is an important molecular marker for the identification of members of the Halobacteria. Phylogenomic analysis based on concatenated conserved, single-copy marker genes is required to infer the taxonomic status of new taxa. The overall genome relatedness indices have proven to be determinative in the classification of the taxa within the class Halobacteria. Average nucleotide identity, digital DNA-DNA hybridization, and average amino acid identity values should be calculated for rigorous comparison among close relatives.
Subjects
Fatty Acids , Halobacteriales , Phylogeny , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA , Fatty Acids/chemistry , Bacterial Typing Techniques/methods , DNA, Bacterial/genetics , Base Composition
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
Recent technological advancements have enabled the profiling of a large number of genome-wide features in individual cells. However, single-cell data present unique challenges that require the development of specialized methods and software infrastructure to successfully derive biological insights. The Bioconductor project has rapidly grown to meet these demands, hosting community-developed open-source software distributed as R packages. Featuring state-of-the-art computational methods, standardized data infrastructure and interactive data visualization tools, we present an overview and online book (https://osca.bioconductor.org) of single-cell methods for prospective users.
Subjects
Single-Cell Analysis/methods , Gene Expression Profiling , Genome , High-Throughput Nucleotide Sequencing , Software
ABSTRACT
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
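The idea of representation-agnostic access described in this abstract can be illustrated with a small sketch. This is not the beachmat API (which is a C++ interface) and every class and function name below is hypothetical; it only shows, assuming a toy dense and a toy sparse container, how one routine can be written once against a common column accessor.

```python
from typing import Dict, List, Protocol, Tuple


class ColumnSource(Protocol):
    """Hypothetical interface: report a shape and return one dense column."""

    shape: Tuple[int, int]

    def column(self, j: int) -> List[float]: ...


class DenseMatrix:
    """Toy dense matrix stored as a list of rows."""

    def __init__(self, rows: List[List[float]]) -> None:
        self.rows = rows
        self.shape = (len(rows), len(rows[0]) if rows else 0)

    def column(self, j: int) -> List[float]:
        return [row[j] for row in self.rows]


class SparseColumnMatrix:
    """Toy sparse matrix: only non-zero entries, keyed by column then row."""

    def __init__(self, nrow: int, ncol: int,
                 entries: Dict[int, Dict[int, float]]) -> None:
        self.shape = (nrow, ncol)
        self.entries = entries

    def column(self, j: int) -> List[float]:
        # Expand the stored non-zeros of column j into a dense column.
        col = [0.0] * self.shape[0]
        for i, value in self.entries.get(j, {}).items():
            col[i] = value
        return col


def column_sums(m: ColumnSource) -> List[float]:
    """Written once; works for any object exposing shape and column()."""
    return [sum(m.column(j)) for j in range(m.shape[1])]


# The same 2 x 3 toy matrix in both representations.
dense = DenseMatrix([[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]])
sparse = SparseColumnMatrix(2, 3, {0: {1: 1.0}, 1: {0: 2.0}})
```

Calling `column_sums(dense)` and `column_sums(sparse)` gives the same result. The sparse container stores fewer values but does extra work per column access, which is a small-scale picture of the memory/speed trade-off the abstract mentions.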
Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software , Algorithms , Databases, Genetic , Humans
ABSTRACT
Rift Valley fever virus (RVFV) continues to pose a threat to much of the world. Unlike many arboviruses, numerous mosquito species have been associated with RVFV in nature, and many species have been demonstrated as competent vectors in the laboratory. In this study, we evaluated two field-collected Psorophora species, Psorophora columbiae (Dyar and Knab) and Psorophora ciliata (F.), for their potential to transmit RVFV in North America. Both species were susceptible to infection after feeding on a hamster with a viremia of 10^7 plaque-forming units/ml, with infection rates of 65 and 83% for Ps. columbiae and Ps. ciliata, respectively (with nearly all specimens becoming infected when feeding on a hamster with a higher viremia). However, both species had a significant salivary gland barrier, as only 2/35 Ps. columbiae and 0/3 Ps. ciliata with a disseminated infection transmitted virus by bite. Despite the presence of the salivary gland barrier, due to the very high population that can occur and its propensity to feed on large mammals, Ps. columbiae might play a role in amplifying RVFV should that virus be introduced into an area where this species is common.
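The transmission counts quoted above can be turned into rates with a few lines of arithmetic. The sketch below uses only the counts given in the abstract; the helper name `percent` is an illustrative choice.

```python
def percent(numerator: int, denominator: int) -> float:
    """Express a count ratio as a percentage."""
    return 100.0 * numerator / denominator


# Counts from the abstract: (transmitted by bite, had a disseminated infection).
transmission = {"Ps. columbiae": (2, 35), "Ps. ciliata": (0, 3)}

for species, (k, n) in transmission.items():
    print(f"{species}: {k}/{n} = {percent(k, n):.1f}%")
```

For Ps. columbiae this works out to roughly 5.7%. The Ps. ciliata figure rests on only three mosquitoes, so it supports a much weaker conclusion.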
Subjects
Culicidae/virology , Insect Vectors/virology , Rift Valley Fever/transmission , Animals , California , Female , Florida , Mesocricetus/virology , Rift Valley Fever/virology , Rift Valley fever virus/physiology , Viremia/virology
ABSTRACT
Members of the haloarchaeal genera Halosarcina and Halogeometricum (family Halobacteriaceae) are closely related to each other and show 96.6-98 % 16S rRNA gene sequence similarity. This is higher than the accepted threshold value (95 %) to separate two genera, and a taxonomic study using a polyphasic approach of all four members of the two genera was conducted to clarify their relationships. Polar lipid profiles indicated that Halogeometricum rufum RO1-4(T), Halosarcina pallida BZ256(T) and Halosarcina limi RO1-6(T) are related more to each other than to Halogeometricum borinquense CGMCC 1.6168(T). Phylogenetic analyses using the sequences of three different genes (16S rRNA gene, rpoB' and EF-2) strongly supported the monophyly of these four species, showing that they formed a distinct clade, separate from the related genera Halopelagius, Halobellus, Haloquadratum, Haloferax and Halogranum. The results indicate that the four species should be assigned to the same genus, and it is proposed that Halosarcina pallida and Halosarcina limi be transferred to the genus Halogeometricum as Halogeometricum pallidum comb. nov. (type strain, BZ256(T) = KCTC 4017(T) = JCM 14848(T)) and Halogeometricum limi comb. nov. (type strain, RO1-6(T) = CGMCC 1.8711(T) = JCM 16054(T)).
Subjects
Halobacteriaceae/classification , Phylogeny , DNA, Archaeal/genetics , Genes, Archaeal , Halobacteriaceae/genetics , Lipids/analysis , Molecular Sequence Data , Peptide Elongation Factor 2/genetics , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA
ABSTRACT
MOTIVATION: Identification of genomic regions of interest in ChIP-seq data, commonly referred to as peak-calling, aims to find the locations of transcription factor binding sites, modified histones or nucleosomes. The BayesPeak algorithm was developed to model the data structure using Bayesian statistical techniques and was shown to be a reliable method, but did not have a full-genome implementation. RESULTS: In this note we present BayesPeak, an R package for genome-wide peak-calling that provides a flexible implementation of the BayesPeak algorithm and is compatible with downstream Bioconductor packages. The BayesPeak package introduces a new method for summarizing posterior probability output, along with methods for handling overfitting and support for parallel processing. We briefly compare the package with other common peak-callers. AVAILABILITY: Available as part of Bioconductor version 2.6. URL: http://bioconductor.org/packages/release/bioc/html/BayesPeak.html.
Subjects
Algorithms , Bayes Theorem , Chromatin Immunoprecipitation/methods , Software , Genome , Markov Chains , Recoverin
ABSTRACT
Illumina whole-genome expression BeadArrays are a popular choice in gene profiling studies. Aside from the vendor-provided software tools for analyzing BeadArray expression data (GenomeStudio/BeadStudio), there exists a comprehensive set of open-source analysis tools in the Bioconductor project, many of which have been tailored to exploit the unique properties of this platform. In this article, we explore a number of these software packages and demonstrate how to perform a complete analysis of BeadArray data in various formats. The key steps of importing data, performing quality assessments, preprocessing, and annotation in the common setting of assessing differential expression in designed experiments will be covered.
Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Software , Research Design
ABSTRACT
Two halophilic archaea, strains TBN53(T) and CSW2.24.4(T), were characterized to elucidate their taxonomic status. Strain TBN53(T) was isolated from the Taibei marine solar saltern near Lianyungang city, Jiangsu province, China, whereas strain CSW2.24.4(T) was isolated from a saltern crystallizer in Victoria, Australia. Cells of the two strains were pleomorphic, stained Gram-negative and produced red-pigmented colonies. Strain TBN53(T) was able to grow at 25-55 °C (optimum 45 °C), with 1.4-5.1 M NaCl (optimum 2.6-3.9 M NaCl), with 0-1.0 M MgCl(2) (optimum 0-0.1 M MgCl(2)) and at pH 5.5-9.5 (optimum pH 7.0), whereas strain CSW2.24.4(T) was able to grow at 25-45 °C (optimum 37 °C), with 2.6-5.1 M NaCl (optimum 3.4 M NaCl), with 0.01-0.7 M MgCl(2) (optimum 0.05 M MgCl(2)) and at pH 5.5-9.5 (optimum pH 7.0-7.5). Cells of the two isolates lysed in distilled water. The minimum NaCl concentrations that prevented cell lysis were 8 % (w/v) for strain TBN53(T) and 12 % (w/v) for strain CSW2.24.4(T). The major polar lipids of the two strains were phosphatidylglycerol, phosphatidylglycerol phosphate methyl ester and phosphatidylglycerol sulfate, with two glycolipids chromatographically identical to sulfated mannosyl glucosyl diether and mannosyl glucosyl diether, respectively. Trace amounts of other unidentified lipids were also detected. On the basis of 16S rRNA gene sequence analysis, strains TBN53(T) and CSW2.24.4(T) showed 94.1 % similarity to each other and were closely related to Halobellus clavatus TNN18(T) (95.0 and 94.7 % similarity, respectively). Levels of rpoB' gene sequence similarity between strains TBN53(T) and CSW2.24.4(T), and between these strains and Halobellus clavatus TNN18(T) were 88.5, 88.5 and 88.1 %, respectively. The DNA G+C contents of strains TBN53(T) and CSW2.24.4(T) were 69.2 and 67.0 mol%, respectively. 
The level of DNA-DNA relatedness between strain TBN53(T) and strain CSW2.24.4(T) was 25 %, and these two strains showed low levels of DNA-DNA relatedness with Halobellus clavatus TNN18(T) (30 and 29 % relatedness, respectively). Based on these phenotypic, chemotaxonomic and phylogenetic properties, two novel species of the genus Halobellus are proposed to accommodate these two strains, Halobellus limi sp. nov. (type strain TBN53(T) = CGMCC 1.10331(T) = JCM 16811(T)) and Halobellus salinus sp. nov. (type strain CSW2.24.4(T) = DSM 18730(T) = CGMCC 1.10710(T) = JCM 14359(T)).
Subjects
Geologic Sediments/microbiology , Halobacteriaceae/classification , Halobacteriaceae/isolation & purification , DNA, Archaeal/genetics , Halobacteriaceae/genetics , Halobacteriaceae/metabolism , Hydrogen-Ion Concentration , Molecular Sequence Data , Phylogeny , RNA, Ribosomal, 16S/genetics , Sodium Chloride/metabolism , Victoria
ABSTRACT
Two halophilic archaeal strains, R30(T) and tADL(T), were isolated from an aquaculture farm in Dailing, China, and from Deep Lake, Antarctica, respectively. Both have rod-shaped cells that lyse in distilled water, stain Gram-negative and form red-pigmented colonies. They are neutrophilic, require >120 g/l NaCl and 48-67 g/l MgCl(2) for growth but differ in their optimum growth temperatures (30 °C, tADL(T) vs. 40 °C, R30(T)). The major polar lipids were typical for members of the Archaea but also included a major glycolipid chromatographically identical to sulfated mannosyl glucosyl diether (S-DGD-1). The 16S rRNA gene sequences of the two strains are 97.4 % identical, show most similarity to genes of the family Halobacteriaceae, and cluster together as a distinct clade in phylogenetic tree reconstructions. The rpoB' gene similarity between strains R30(T) and tADL(T) is 92.9 % and less to other halobacteria. Their DNA G + C contents are 62.4-62.9 mol % but DNA-DNA hybridization gives a relatedness of only 44 %. Based on phenotypic, chemotaxonomic and phylogenetic properties, we describe two new species of a novel genus, represented by strain R30(T) (= CGMCC 1.10593(T) = JCM 17270(T)) and strain tADL(T) (= JCM 15066(T) = DSMZ 22187(T)) for which we propose the names Halohasta litorea gen. nov., sp. nov. and Halohasta litchfieldiae sp. nov., respectively.
Subjects
Halobacteriaceae/classification , Halobacteriaceae/isolation & purification , Antarctic Regions , Archaeal Proteins/genetics , China , DNA, Archaeal/chemistry , Halobacteriaceae/cytology , Phylogeny , RNA, Ribosomal, 16S/genetics
ABSTRACT
A Pioneer Eco-Backpack electric cold ultra-low volume (ULV) sprayer and a gas-powered Twister XL 3950 series 2 motorized knapsack ULV sprayer with Aqualuer (20.6% permethrin AI) were evaluated against caged adult Aedes albopictus and Culex quinquefasciatus in St. Augustine, FL. The Pioneer Eco-Backpack sprayer provided 100% knockdown of both species of mosquitoes at 15 min; the Twister XL backpack sprayer resulted in 17-23% knockdown at 15 min. Both backpack sprayers with Aqualuer resulted in 100% mortality of both species at 24 h. The new Pioneer Eco-Backpack sprayer powered by electricity could be a potential tool for mosquito control.
Subjects
Aedes/drug effects , Culex/drug effects , Insecticides/pharmacology , Permethrin/pharmacology , Piperonyl Butoxide/pharmacology , Animals , Female , Insecticides/chemistry , Permethrin/chemistry , Piperonyl Butoxide/chemistry , Species Specificity
ABSTRACT
Three plant-based repellents -- REPEL LEMON Eucalyptus Insect Repellent Lotion (active ingredient [AI] 30% oil of eucalyptus), Bite Blocker Xtreme Sportsman Organic Insect Repellent ([AI] 3% soybean oil, 6% geranium oil, and 8% castor oil), and Bite Blocker BioUD Insect Repellent ([AI] 7.75% 2-undecanone) -- were evaluated against OFF! ([AI] 15% N,N-diethyl-m-toluamide or N,N-diethyl-3-methyl-benzamide, also called DEET) at a field site in Elkton, FL, to determine the mean protection time provided against Psorophora columbiae (Dyar & Knab). These products provided different protection times against biting Ps. columbiae. REPEL provided the longest protection time (330 min) followed by Bite Blocker Xtreme Sportsman (163 min), Bite Blocker BioUD (140 min), and OFF! (130 min). This study provides the first information about plant-based insect repellent protection times against Ps. columbiae.
Subjects
Culicidae , Insect Repellents , Plant Extracts , Animals , DEET , Florida
ABSTRACT
Precise control of gene expression requires the coordinated action of multiple factors at cis-regulatory elements. We recently developed single-molecule footprinting to simultaneously resolve the occupancy of multiple proteins including transcription factors, RNA polymerase II and nucleosomes on single DNA molecules genome-wide. The technique combines the use of cytosine methyltransferases to footprint the genome with bisulfite sequencing to resolve transcription factor binding patterns at cis-regulatory elements. DNA footprinting is performed by incubating permeabilized nuclei with recombinant methyltransferases. Upon DNA extraction, whole-genome or targeted bisulfite libraries are prepared and loaded on Illumina sequencers. The protocol can be completed in 4-5 d in any laboratory with access to high-throughput sequencing. Analysis can be performed in 2 d using a dedicated R package and requires access to a high-performance computing system. Our method can be used to analyze how transcription factors cooperate and antagonize to regulate transcription.
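The read-out principle behind this kind of footprinting can be sketched in a few lines. This is not the authors' R package; it is a minimal illustration that relies only on standard bisulfite chemistry (a methylated cytosine is still read as C, an unmethylated one as T). It assumes, hypothetically, that each read is already aligned without indels and that the positions of the assayed cytosines are known. All function names, the toy read and the `min_sites` threshold are illustrative.

```python
from typing import List, Tuple


def call_sites(read: str, sites: List[int]) -> List[Tuple[int, bool]]:
    """Return (position, is_methylated) for each assayed cytosine.

    After bisulfite conversion a methylated C is still read as C,
    while an unmethylated C is read as T.
    """
    calls = []
    for pos in sites:
        base = read[pos]
        if base == "C":
            calls.append((pos, True))    # methylated: DNA was accessible
        elif base == "T":
            calls.append((pos, False))   # unmethylated: DNA was protected
        # Any other base is treated as uninformative and skipped.
    return calls


def protected_runs(calls: List[Tuple[int, bool]],
                   min_sites: int = 2) -> List[Tuple[int, int]]:
    """Group consecutive unmethylated sites into candidate footprints."""
    runs, current = [], []
    for pos, methylated in calls:
        if not methylated:
            current.append(pos)
            continue
        if len(current) >= min_sites:
            runs.append((current[0], current[-1]))
        current = []
    if len(current) >= min_sites:
        runs.append((current[0], current[-1]))
    return runs


# Toy single-molecule read; assayed cytosines sit at these reference positions.
toy_sites = [1, 4, 6, 9, 11]
toy_read = "ACGATGTAGCAC"
toy_calls = call_sites(toy_read, toy_sites)
toy_footprints = protected_runs(toy_calls)
```

Because calls are made per read, each molecule keeps its own pattern of protected and accessible sites, which is what allows co-occupancy of several factors on one molecule to be examined.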
Subjects
DNA Footprinting/methods , DNA Modification Methylases/metabolism , DNA/metabolism , Genome , Single Molecule Imaging/methods , Transcription Factors/metabolism , Animals , Cell Nucleus/metabolism , DNA/genetics , DNA Modification Methylases/genetics , Gene Expression Regulation , Gene Library , High-Throughput Nucleotide Sequencing , Humans , Mice , Mouse Embryonic Stem Cells/cytology , Mouse Embryonic Stem Cells/metabolism , Nucleosomes/chemistry , Nucleosomes/metabolism , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Sequence Analysis, DNA/statistics & numerical data , Software , Transcription Factors/genetics
ABSTRACT
As a group, the halophilic archaea (class Halobacteria) are the most salt-requiring and salt-resistant microorganisms within the domain Archaea. Halophilic archaea flourish in thalassohaline and athalassohaline environments and require over 100-150 g/L NaCl for growth and structural stability. Natural hypersaline environments vary in salt concentration, chemical composition and pH, and occur in climates ranging from tropical to polar and even under-sea. Accordingly, their resident haloarchaeal species vary enormously, as do their individual population compositions and community structures. These diverse halophilic archaeal strains are precious resources for theoretical and applied research but assessing their taxonomic and metabolic novelty and diversity in natural environments has been technically difficult up until recently. Environmental DNA-based high-throughput sequencing technology has now matured sufficiently to allow inexpensive recovery of massive amounts of sequence data, revealing the distribution and community composition of halophilic archaea in different hypersaline environments. While cultivation of haloarchaea is slow and tedious, and only recovers a fraction of the natural diversity, it is the conventional means of describing new species, and provides strains for detailed study. As of the end of May 2020, the class Halobacteria contains 71 genera and 275 species, 49.8% of which were first isolated from the marine salt environment and 50.2% from the inland salt environment, indicating that both thalassohaline and athalassohaline environments contain diverse halophilic archaea. However, there remain taxa that have not yet been isolated in pure culture, such as the nanohaloarchaea, which are widespread in the salt environment and may be one of the hot spots in the field of halophilic archaea research in the future. 
In this review, we focus on the cultivation strategies that have been used to isolate extremely halophilic archaea and point out some of the pitfalls and challenges. Supplementary Information: The online version contains supplementary material available at 10.1007/s42995-020-00087-3.
ABSTRACT
BACKGROUND: A key stage for all microarray analyses is the extraction of feature-intensities from an image. If this step goes wrong, then subsequent preprocessing and processing stages will stand little chance of rectifying the matter. Illumina employ random construction of their BeadArrays, making feature-intensity extraction even more important for the Illumina platform than for other technologies. In this paper we show that using raw Illumina data it is possible to identify, control, and perhaps correct for a range of spatial-related phenomena that affect feature-intensity extraction. RESULTS: We note that feature intensities can be unnaturally high when in the proximity of a number of phenomena relating either to the images themselves or to the layout of the beads on an array. Additionally we note that beads neighbour beads of the same type more often than one might expect, which may cause concern in some models of hybridization. We highlight issues in the identification of a bead's location, and in particular how this both affects and is affected by its intensity. Finally we show that beads can be wrongly identified in the image on either a local or array-wide scale, with obvious implications for data quality. CONCLUSIONS: The image processing issues identified will often pass unnoticed by an analysis of the standard data returned from an experiment. We detail some simple diagnostics that can be implemented to identify problems of this nature, and outline approaches to correcting for such problems. These approaches require access to the raw data from the arrays, not just the summarized data usually returned, making the acquisition of such raw data highly desirable.
Subjects
Computational Biology/methods , Oligonucleotide Array Sequence Analysis/methods , Databases, Genetic , Gene Expression Profiling/methods
ABSTRACT
TP53 deficiency is the most common alteration in cancer; however, this alone is typically insufficient to drive tumorigenesis. To identify genes promoting tumorigenesis in combination with TP53 deficiency, we perform genome-wide CRISPR-Cas9 knockout screens coupled with proliferation and transformation assays in isogenic cell lines. Loss of several known tumor suppressors enhances cellular proliferation and transformation. Loss of neddylation pathway genes promotes uncontrolled proliferation exclusively in TP53-deficient cells. Combined loss of CUL3 and TP53 activates an oncogenic transcriptional program governed by the nuclear factor κB (NF-κB), AP-1, and transforming growth factor β (TGF-β) pathways. This program maintains persistent cellular proliferation, induces partial epithelial to mesenchymal transition, and increases DNA damage, genomic instability, and chromosomal rearrangements. Our findings reveal CUL3 loss as a key event stimulating persistent proliferation in TP53-deficient cells. These findings may be clinically relevant, since TP53-CUL3-deficient cells are highly sensitive to ataxia telangiectasia mutated (ATM) inhibition, exposing a vulnerability that could be exploited for cancer treatment.
Subjects
Cullin Proteins/genetics , Tumor Suppressor Protein p53/genetics , Ataxia Telangiectasia Mutated Proteins/antagonists & inhibitors , Ataxia Telangiectasia Mutated Proteins/genetics , Ataxia Telangiectasia Mutated Proteins/metabolism , Carcinogenesis/genetics , Cell Line , Cell Line, Tumor , Cell Proliferation/physiology , Cullin Proteins/metabolism , Epithelial-Mesenchymal Transition , Genome-Wide Association Study , Genomic Instability , Humans , NF-kappa B/metabolism , Retinal Pigment Epithelium/cytology , Transforming Growth Factor beta/metabolism , Tumor Suppressor Protein p53/deficiency , Tumor Suppressor Protein p53/metabolism
ABSTRACT
High-coverage long-read sequencing of the Halobacterium salinarum type strain (91-R6) revealed a 2.17-Mb chromosome and two large plasmids (148 and 102 kb). Population heterogeneity and long repeats were observed. Strain 91-R6 and laboratory strain R1 showed 99.63% sequence identity in common chromosomal regions and only 38 strain-specific segments. This information resolves the previously uncertain relationship between type and laboratory strains.
ABSTRACT
The R/Bioconductor package beadarray allows raw data from Illumina experiments to be read and stored in convenient R classes. Users are free to choose between various methods of image processing, background correction and normalization in their analysis rather than using the defaults in Illumina's proprietary software. The package also allows quality assessment to be carried out on the raw data. The data can then be summarized and stored in a format which can be used by other R/Bioconductor packages to perform downstream analyses. Summarized data processed by Illumina's BeadStudio software can also be read and analysed in the same manner. AVAILABILITY: The beadarray package is available from the Bioconductor web page at www.bioconductor.org. A user's guide and example data sets are provided with the package.
Subjects
Image Interpretation, Computer-Assisted/methods , In Situ Hybridization, Fluorescence/methods , Oligonucleotide Array Sequence Analysis/methods , Programming Languages , Sequence Analysis, DNA/methods , Software , Algorithms , Databases, Genetic , Microspheres
ABSTRACT
A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has been fixed in the paper.
ABSTRACT
Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.
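As a rough illustration of the starting point for sequencing-based telomere analysis, the sketch below flags reads that are rich in the vertebrate telomeric hexamer TTAGGG or its reverse complement CCCTAA. It is not Telomerecat's algorithm: it does not estimate length, and it ignores the interstitial reads and sequencing errors that the abstract says Telomerecat handles. The function names, the `min_repeats` threshold and the toy reads are assumptions made for illustration.

```python
from typing import Iterable

# Vertebrate telomeric repeat and its reverse complement.
TELOMERE_MOTIFS = ("TTAGGG", "CCCTAA")


def count_motif(read: str, motif: str) -> int:
    """Count non-overlapping occurrences of a motif in a read."""
    return read.upper().count(motif)


def is_telomeric(read: str, min_repeats: int = 4) -> bool:
    """Flag a read carrying at least min_repeats copies of either motif.

    The threshold is an arbitrary illustrative value, not a published one.
    """
    return any(count_motif(read, m) >= min_repeats for m in TELOMERE_MOTIFS)


def telomeric_fraction(reads: Iterable[str], min_repeats: int = 4) -> float:
    """Fraction of reads flagged as telomeric (0.0 for an empty input)."""
    reads = list(reads)
    if not reads:
        return 0.0
    flagged = sum(is_telomeric(r, min_repeats) for r in reads)
    return flagged / len(reads)


# Toy reads, built by concatenation so the repeat counts are explicit.
toy_reads = [
    "TTAGGG" * 5,                          # 5 repeats: telomeric
    "ACGTACGT" + "TTAGGG" + "ACGT",        # 1 repeat: not telomeric
    "CCCTAA" * 4 + "GATTACA",              # 4 repeats on the other strand
    "TTAGGG" * 2 + "ACG" + "TTAGGG" + "AC",  # 3 repeats: below threshold
]
```

On these toy reads `telomeric_fraction(toy_reads)` is 0.5. Turning such counts into a length estimate needs further modelling, which is the part the abstract describes as novel.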