Results 1 - 20 of 28
1.
NAR Genom Bioinform ; 6(1): lqae028, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38482061

ABSTRACT

Recent COVID-19 vaccines unleashed the potential of mRNA-based therapeutics. A common bottleneck across mRNA-based therapeutic approaches is the rapid design of mRNA sequences that are translationally efficient, long-lived and non-immunogenic. Currently, an accessible software tool to aid in the design of such high-quality mRNA is lacking. Here, we present mRNAid, an open-source platform for therapeutic mRNA optimization, design and visualization that offers a variety of optimization strategies for sequence and structural features, allowing one to customize desired properties into their mRNA sequence. We experimentally demonstrate that transcripts optimized by mRNAid have characteristics comparable with commercially available sequences. To encompass additional aspects of mRNA design, we experimentally show that incorporation of certain uridine analogs and untranslated regions can further enhance stability, boost protein output and mitigate undesired immunogenicity effects. Finally, this study provides a roadmap for rational design of therapeutic mRNA transcripts.

2.
Drug Discov Today ; 29(3): 103884, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38219969

ABSTRACT

The volume of nucleic acid sequence data has exploded recently, amplifying the challenge of transforming data into meaningful information. Processing data can require an increasingly complex ecosystem of customized tools, which makes it harder to communicate analyses in a way that is understandable yet sufficiently detailed to enable informed decisions or repeat analyses. This can be of particular interest to institutions and companies communicating computations in a regulatory environment. BioCompute Objects (BCOs; an instance of pipeline documentation that conforms to the IEEE 2791-2020 standard) were developed as a standardized mechanism for analysis reporting. A suite of BCOs is presented, representing interconnected elements of a computation modeled after those that might be found in a regulatory submission but shared publicly, in this case a pipeline designed to identify viral contaminants in biological manufacturing, such as for vaccines.


Subject(s)
Computational Biology , Vaccines , High-Throughput Nucleotide Sequencing , Workflow
3.
MAbs ; 15(1): 2248671, 2023.
Article in English | MEDLINE | ID: mdl-37610144

ABSTRACT

Identification of favorable biophysical properties for protein therapeutics as part of developability assessment is a crucial part of the preclinical development process. Successful prediction of such properties and bioassay results from calculated in silico features has potential to reduce the time and cost of delivering clinical-grade material to patients, but nevertheless has remained an ongoing challenge to the field. Here, we demonstrate an automated and flexible machine learning workflow designed to compare and identify the most powerful features from computationally derived physiochemical feature sets, generated from popular commercial software packages. We implement this workflow with medium-sized datasets of human and humanized IgG molecules to generate predictive regression models for two key developability endpoints, hydrophobicity and poly-specificity. The most important features discovered through the automated workflow corroborate several previous literature reports, and newly discovered features suggest directions for further research and potential model improvement.


Subject(s)
Antibodies, Monoclonal , Immunoglobulin G , Humans , Antibodies, Monoclonal/chemistry , Machine Learning
4.
Bioinform Adv ; 3(1): vbad083, 2023.
Article in English | MEDLINE | ID: mdl-37456510

ABSTRACT

Motivation: Despite the advent of next-generation sequencing technology and its widespread applications, Sanger sequencing remains instrumental for molecular biology subcloning work in biological and medical research and indispensable for drug discovery campaigns. Although Sanger sequencing technology has been long established, existing software for processing and visualization of trace file chromatograms is limited in terms of functionality, scalability and availability for commercial use. Results: To fill this gap, we developed TraceTrack, an open-source web application tool for batch alignment, analysis and visualization of Sanger trace files. TraceTrack offers high-throughput matching of trace files to reference sequences, rapid identification of mutations and an intuitive chromatogram analysis. Comparative analysis between TraceTrack and existing software tools highlights the advantages of TraceTrack with regard to batch processing, visualization and export functionalities. Availability and implementation: TraceTrack is available at https://github.com/MSDLLCpapers/TraceTrack and as a web application at https://tracetrack.dichlab.org. TraceTrack is a web application for batch processing and visualization of Sanger trace file chromatograms that meets the increasing demand of industrial sequence validation workflows in pharmaceutical settings. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

5.
J Chem Inf Model ; 63(7): 1852-1857, 2023 04 10.
Article in English | MEDLINE | ID: mdl-36977316

ABSTRACT

To solve recurring problems in drug discovery, matched molecular pair (MMP) analysis is used to understand relationships between chemical structure and function. For the MMP analysis of large data sets (>10,000 compounds), available tools lack flexible search and visualization functionality and require computational expertise. Here, we present Matcher, an open-source application for MMP analysis, with novel search algorithms and fully automated querying-to-visualization that requires no programming expertise. Matcher enables unprecedented control over the search and clustering of MMP transformations based on both variable fragment and constant environment structure, which is critical for disentangling relevant and irrelevant data to a given problem. Users can exert such control through a built-in chemical sketcher and with a few mouse clicks can navigate between resulting MMP transformations, statistics, property distribution graphs, and structures with raw experimental data, for confident and accelerated decision making. Matcher can be used with any collection of structure/property data; here, we demonstrate usage with a public ChEMBL data set of about 20,000 small molecules with CYP3A4 and/or hERG inhibition data. Users can reproduce all examples demonstrated herein via unique links within Matcher's interface, a functionality that anyone can use to preserve and share their own analyses. Matcher and all its dependencies are open-source, can be used for free, and are available with containerized deployment from code at https://github.com/Merck/Matcher. Matcher makes large structure/property data sets more transparent than ever before and accelerates the data-driven solution of common problems in drug discovery.


Subject(s)
Algorithms , Software , Drug Design , Drug Discovery/methods , Cluster Analysis
6.
MAbs ; 14(1): 2020203, 2022.
Article in English | MEDLINE | ID: mdl-35133949

ABSTRACT

Despite recent advances in transgenic animal models and display technologies, humanization of mouse sequences remains one of the main routes for therapeutic antibody development. Traditionally, humanization is manual, laborious, and requires expert knowledge. Although automation efforts are advancing, existing methods are either demonstrated on a small scale or are entirely proprietary. To predict the immunogenicity risk, the human-likeness of sequences can be evaluated using existing humanness scores, but these lack diversity, granularity or interpretability. Meanwhile, immune repertoire sequencing has generated rich antibody libraries such as the Observed Antibody Space (OAS) that offer augmented diversity not yet exploited for antibody engineering. Here we present BioPhi, an open-source platform featuring novel methods for humanization (Sapiens) and humanness evaluation (OASis). Sapiens is a deep learning humanization method trained on the OAS using language modeling. Based on an in silico humanization benchmark of 177 antibodies, Sapiens produced sequences at scale while achieving results comparable to those of human experts. OASis is a granular, interpretable and diverse humanness score based on 9-mer peptide search in the OAS. OASis separated human and non-human sequences with high accuracy, and correlated with clinical immunogenicity. BioPhi thus offers an antibody design interface with automated methods that capture the richness of natural antibody repertoires to produce therapeutics with desired properties and accelerate antibody discovery campaigns. The BioPhi platform is accessible at https://biophi.dichlab.org and https://github.com/Merck/BioPhi.


Subject(s)
Deep Learning , Animals , Antibodies , Mice
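The idea of a humanness score built on 9-mer peptide search can be illustrated with a minimal Python sketch. It assumes a tiny in-memory reference set instead of the OAS, and the function names (`kmers`, `humanness_fraction`) and the toy sequences are hypothetical. The abstract does not give the scoring details, so the plain hit fraction used here is an assumption, not the published OASis definition.

```python
def kmers(seq: str, k: int = 9) -> list[str]:
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def humanness_fraction(seq: str, human_peptides: set[str], k: int = 9) -> float:
    """Fraction of k-mers in seq that occur in the reference peptide set."""
    peptides = kmers(seq, k)
    if not peptides:
        return 0.0  # sequence shorter than k: nothing to score
    hits = sum(1 for p in peptides if p in human_peptides)
    return hits / len(peptides)


# Toy reference: 9-mers from one made-up sequence (illustrative only, not OAS data).
reference = set(kmers("EVQLVESGGGLVQPGGSLRLSCAAS"))

# Query differs from the reference sequence by a single substitution (Q -> K).
query = "EVQLVESGGGLVKPGGSLRLSCAAS"
score = humanness_fraction(query, reference)
print(f"{score:.3f}")
```

A single substitution changes every 9-mer that overlaps it, which is why a peptide-level score can point to the specific residues that lower human-likeness.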
7.
J Chem Inf Model ; 62(5): 1259-1267, 2022 03 14.
Article in English | MEDLINE | ID: mdl-35192366

ABSTRACT

Therapeutic peptides offer potential advantages over small molecules in terms of selectivity, affinity, and their ability to target "undruggable" proteins that are associated with a wide range of pathologies. Despite their importance, current molecular design capabilities that inform medicinal chemistry decisions on peptide programs are limited. More specifically, there are unmet needs for structure-activity relationship (SAR) analysis and visualization of linear, cyclic, and cross-linked peptides containing non-natural motifs, which are widely used in drug discovery. To bridge this gap, we developed PepSeA (Peptide Sequence Alignment and Visualization), an open-source, freely available package of sequence-based tools (https://github.com/Merck/PepSeA). PepSeA enables multiple sequence alignment of non-natural amino acids and enhanced visualization with the hierarchical editing language for macromolecules (HELM). Via stepwise SAR analysis of a ChEMBL peptide data set, we demonstrate the utility of PepSeA to accelerate decision making in lead optimization campaigns in a pharmaceutical setting. PepSeA represents an initial attempt to expand cheminformatics capabilities for therapeutic peptides and to enable rapid and more efficient design-make-test cycles.


Subject(s)
Peptides , Proteins , Amino Acid Sequence , Cheminformatics , Peptides/chemistry , Sequence Alignment
8.
ACS Synth Biol ; 10(2): 357-370, 2021 02 19.
Article in English | MEDLINE | ID: mdl-33433999

ABSTRACT

Protein engineering is the discipline of developing useful proteins for applications in research, therapeutic, and industrial processes by modification of naturally occurring proteins or by invention of de novo proteins. Modern protein engineering relies on the ability to rapidly generate and screen diverse libraries of mutant proteins. However, design of mutant libraries is typically hampered by scale and complexity, necessitating development of advanced automation and optimization tools that can improve efficiency and accuracy. At present, automated library design tools are functionally limited or not freely available. To address these issues, we developed Mutation Maker, an open source mutagenic oligo design software for large-scale protein engineering experiments. Mutation Maker is not only specifically tailored to multisite random and directed mutagenesis protocols, but also pioneers bespoke mutagenic oligo design for de novo gene synthesis workflows. Enabled by a novel bundle of orchestrated heuristics, optimization, constraint-satisfaction and backtracking algorithms, Mutation Maker offers a versatile toolbox for gene diversification design at industrial scale. Supported by in silico simulations and compelling experimental validation data, Mutation Maker oligos produce diverse gene libraries at high success rates irrespective of genes or vectors used. Finally, Mutation Maker was created as an extensible platform on the notion that directed evolution techniques will continue to evolve and revolutionize current and future-oriented applications.


Subject(s)
Mutagenesis, Site-Directed/methods , Mutagenesis , Mutation , Oligonucleotides/genetics , Proteins/genetics , Software , Algorithms , Codon/genetics , Computer Simulation , Directed Molecular Evolution/methods , Escherichia coli/genetics , Gene Library , Mutant Proteins
9.
Nat Prod Rep ; 38(6): 1100-1108, 2021 06 23.
Article in English | MEDLINE | ID: mdl-33245088

ABSTRACT

Covering: up to the end of 2020. The machine learning field can be defined as the study and application of algorithms that perform classification and prediction tasks through pattern recognition instead of explicitly defined rules. Among other areas, machine learning has excelled in natural language processing. As such methods have excelled at understanding written languages (e.g. English), they are also being applied to biological problems to better understand the "genomic language". In this review we focus on recent advances in applying machine learning to natural products and genomics, and how those advances are improving our understanding of natural product biology, chemistry, and drug discovery. We discuss machine learning applications in genome mining (identifying biosynthetic signatures in genomic data), predictions of what structures will be created from those genomic signatures, and the types of activity we might expect from those molecules. We further explore the application of these approaches to data derived from complex microbiomes, with a focus on the human microbiome. We also review challenges in leveraging machine learning approaches in the field, and how the availability of other "omics" data layers provides value. Finally, we provide insights into the challenges associated with interpreting machine learning models and the underlying biology and promises of applying machine learning to natural product drug discovery. We believe that the application of machine learning methods to natural product research is poised to accelerate the identification of new molecular entities that may be used to treat a variety of disease indications.


Subject(s)
Biological Products , Genomics , Machine Learning , Biological Products/chemistry , Biological Products/pharmacology , Biosynthetic Pathways/genetics , Drug Discovery , Humans , Microbiota
10.
Nucleic Acids Res ; 48(13): 7154-7168, 2020 07 27.
Article in English | MEDLINE | ID: mdl-32496538

ABSTRACT

Mono-ubiquitylation of histone H2B (H2Bub1) and phosphorylation of elongation factor Spt5 by cyclin-dependent kinase 9 (Cdk9) occur during transcription by RNA polymerase II (RNAPII), and are mutually dependent in fission yeast. It remained unclear whether Cdk9 and H2Bub1 cooperate to regulate the expression of individual genes. Here, we show that Cdk9 inhibition or H2Bub1 loss induces intragenic antisense transcription of ∼10% of fission yeast genes, with each perturbation affecting largely distinct subsets; ablation of both pathways de-represses antisense transcription of over half the genome. H2Bub1 and phospho-Spt5 have similar genome-wide distributions; both modifications are enriched, and directly proportional to each other, in coding regions, and decrease abruptly around the cleavage and polyadenylation signal (CPS). Cdk9-dependence of antisense suppression at specific genes correlates with high H2Bub1 occupancy, and with promoter-proximal RNAPII pausing. Genetic interactions link Cdk9, H2Bub1 and the histone deacetylase Clr6-CII, while combined Cdk9 inhibition and H2Bub1 loss impair Clr6-CII recruitment to chromatin and lead to decreased occupancy and increased acetylation of histones within gene coding regions. These results uncover novel interactions between co-transcriptional histone modification pathways, which link regulation of RNAPII transcription elongation to suppression of aberrant initiation.


Subject(s)
Cell Cycle Proteins/metabolism , Cyclin-Dependent Kinase 9/metabolism , Histones/metabolism , RNA Polymerase II/metabolism , Schizosaccharomyces pombe Proteins/metabolism , Schizosaccharomyces/genetics , Transcription Elongation, Genetic , Phosphorylation , Transcriptional Elongation Factors/metabolism , Ubiquitination
11.
Nucleic Acids Res ; 47(18): e110, 2019 10 10.
Article in English | MEDLINE | ID: mdl-31400112

ABSTRACT

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers reduced false positive rates in BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing machine-learning tools. We supplemented this with random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable putative BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represents a major addition to in-silico BGC identification.


Subject(s)
Biosynthetic Pathways/genetics , Computational Biology/methods , Data Mining/methods , Multigene Family/genetics , Deep Learning , Genome , Genome, Bacterial/genetics
12.
Mol Biol Evol ; 36(8): 1612-1623, 2019 08 01.
Article in English | MEDLINE | ID: mdl-31077324

ABSTRACT

The relationship between DNA sequence, biochemical function, and molecular evolution is relatively well-described for protein-coding regions of genomes, but far less clear in noncoding regions, particularly in eukaryote genomes. In part, this is because we lack a complete description of the essential noncoding elements in a eukaryote genome. To contribute to this challenge, we used saturating transposon mutagenesis to interrogate the Schizosaccharomyces pombe genome. We generated 31 million transposon insertions, a theoretical coverage of 2.4 insertions per genomic site. We applied a five-state hidden Markov model (HMM) to distinguish insertion-depleted regions from insertion biases. Both raw insertion-density and HMM-defined fitness estimates showed significant quantitative relationships to gene knockout fitness, genetic diversity, divergence, and expected functional regions based on transcription and gene annotations. Through several analyses, we conclude that transposon insertions produced fitness effects in 66-90% of the genome, including substantial portions of the noncoding regions. Based on the HMM, we estimate that 10% of the insertion-depleted sites in the genome showed no signal of conservation between species and were weakly transcribed, demonstrating limitations of comparative genomics and transcriptomics to detect functional units. In this species, 3'- and 5'-untranslated regions were the most prominent insertion-depleted regions that were not represented in measures of constraint from comparative genomics. We conclude that the combination of transposon mutagenesis, evolutionary, and biochemical data can provide new insights into the relationship between genome function and molecular evolution.


Subject(s)
Genetic Fitness , Genome, Fungal , Schizosaccharomyces/genetics , Models, Genetic , Mutagenesis, Insertional
13.
RNA ; 24(9): 1195-1213, 2018 09.
Article in English | MEDLINE | ID: mdl-29914874

ABSTRACT

Long noncoding RNAs (lncRNAs), which are longer than 200 nucleotides but often unstable, contribute a substantial and diverse portion to pervasive noncoding transcriptomes. Most lncRNAs are poorly annotated and understood, although several play important roles in gene regulation and diseases. Here we systematically uncover and analyze lncRNAs in Schizosaccharomyces pombe. Based on RNA-seq data from twelve RNA-processing mutants and nine physiological conditions, we identify 5775 novel lncRNAs, nearly 4× the previously annotated lncRNAs. The expression of most lncRNAs becomes strongly induced under the genetic and physiological perturbations, most notably during late meiosis. Most lncRNAs are cryptic and suppressed by three RNA-processing pathways: the nuclear exosome, cytoplasmic exonuclease, and RNAi. Double-mutant analyses reveal substantial coordination and redundancy among these pathways. We classify lncRNAs by their dominant pathway into cryptic unstable transcripts (CUTs), Xrn1-sensitive unstable transcripts (XUTs), and Dicer-sensitive unstable transcripts (DUTs). XUTs and DUTs are enriched for antisense lncRNAs, while CUTs are often bidirectional and actively translated. The cytoplasmic exonuclease, along with RNAi, dampens the expression of thousands of lncRNAs and mRNAs that become induced during meiosis. Antisense lncRNA expression mostly negatively correlates with sense mRNA expression in the physiological, but not the genetic conditions. Intergenic and bidirectional lncRNAs emerge from nucleosome-depleted regions, upstream of positioned nucleosomes. Our results highlight both similarities and differences to lncRNA regulation in budding yeast. This broad survey of the lncRNA repertoire and characteristics in S. pombe, and the interwoven regulatory pathways that target lncRNAs, provides a rich framework for their further functional analyses.


Subject(s)
Exonucleases/metabolism , Exosomes/metabolism , RNA, Long Noncoding/genetics , Schizosaccharomyces/genetics , Sequence Analysis, RNA/methods , Cell Nucleus/metabolism , Cytoplasm/enzymology , Fungal Proteins/metabolism , Gene Expression Profiling/methods , Gene Expression Regulation, Fungal , Meiosis , Molecular Sequence Annotation , Mutation , RNA Interference , RNA Stability , RNA, Fungal/genetics , RNA, Long Noncoding/chemistry , Schizosaccharomyces/chemistry , Schizosaccharomyces/enzymology
14.
Genome Biol ; 17(1): 240, 2016 11 25.
Article in English | MEDLINE | ID: mdl-27887640

ABSTRACT

BACKGROUND: The control of energy metabolism is fundamental for cell growth and function and anomalies in it are implicated in complex diseases and ageing. Metabolism in yeast cells can be manipulated by supplying different carbon sources: yeast grown on glucose rapidly proliferates by fermentation, analogous to tumour cells growing by aerobic glycolysis, whereas on non-fermentable carbon sources metabolism shifts towards respiration. RESULTS: We screened deletion libraries of fission yeast to identify over 200 genes required for respiratory growth. Growth media and auxotrophic mutants strongly influenced respiratory metabolism. Most genes uncovered in the mutant screens have not been implicated in respiration in budding yeast. We applied gene-expression profiling approaches to compare steady-state fermentative and respiratory growth and to analyse the dynamic adaptation to respiratory growth. The transcript levels of most genes functioning in energy metabolism pathways are coherently tuned, reflecting anticipated differences in metabolic flows between fermenting and respiring cells. We show that acetyl-CoA synthase, rather than citrate lyase, is essential for acetyl-CoA synthesis in fission yeast. We also investigated the transcriptional response to mitochondrial damage by genetic or chemical perturbations, defining a retrograde response that involves the concerted regulation of distinct groups of nuclear genes that may avert harm from mitochondrial malfunction. CONCLUSIONS: This study provides a rich framework of the genetic and regulatory basis of energy metabolism in fission yeast and beyond, and it pinpoints weaknesses of commonly used auxotroph mutants for investigating metabolism. As a model for cellular energy regulation, fission yeast provides an attractive and complementary system to budding yeast.


Subject(s)
Energy Metabolism/genetics , Gene Expression Profiling , Gene Expression Regulation, Fungal , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism , Transcriptome , Acetyl Coenzyme A/metabolism , Adaptation, Biological , Cell Nucleus/genetics , Cell Nucleus/metabolism , Fermentation , Glucose/metabolism , High-Throughput Nucleotide Sequencing , Mitochondria/genetics , Mitochondria/metabolism , Mutation , Signal Transduction
15.
Front Genet ; 6: 330, 2015.
Article in English | MEDLINE | ID: mdl-26635866

ABSTRACT

Genome-wide assays and screens typically result in large lists of genes or proteins. Enrichments of functional or other biological properties within such lists can provide valuable insights and testable hypotheses. To systematically detect these enrichments can be challenging and time-consuming, because relevant data to compare against query gene lists are spread over many different sources. We have developed AnGeLi (Analysis of Gene Lists), an intuitive, integrated web-tool for comprehensive and customized interrogation of gene lists from the fission yeast, Schizosaccharomyces pombe. AnGeLi searches for significant enrichments among multiple qualitative and quantitative information sources, including gene and phenotype ontologies, genetic and protein interactions, numerous features of genes, transcripts, translation, and proteins such as copy numbers, chromosomal positions, genetic diversity, RNA polymerase II and ribosome occupancy, localization, conservation, half-lives, domains, and molecular weight among others, as well as diverse sets of genes that are co-regulated or lead to the same phenotypes when mutated. AnGeLi uses robust statistics which can be tailored to specific needs. It also provides the option to upload user-defined gene sets to compare against the query list. Through an integrated data submission form, AnGeLi encourages the community to contribute additional curated gene lists to further increase the usefulness of this resource and to get the most from the ever increasing large-scale experiments. AnGeLi offers a rigorous yet flexible statistical analysis platform for rich insights into functional enrichments and biological context for query gene lists, thus providing a powerful exploratory tool through which S. pombe researchers can uncover fresh perspectives and unexpected connections from genomic data. AnGeLi is freely available at: www.bahlerlab.info/AnGeLi.
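The abstract does not say which statistics AnGeLi applies. As an illustration of what an enrichment test on a query gene list commonly looks like, the Python sketch below uses a one-sided hypergeometric test; this is a standard technique swapped in for illustration, not a confirmed description of the tool. The function name `hypergeom_enrichment_p` and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, query: int, overlap: int) -> float:
    """One-sided enrichment p-value P(X >= overlap).

    X is the number of annotated genes seen when `query` genes are drawn
    without replacement from `population` genes, `annotated` of which
    carry the property of interest.
    """
    upper = min(annotated, query)
    total = comb(population, query)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, query - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Small hand-checkable case: (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120.
p_small = hypergeom_enrichment_p(population=10, annotated=4, query=3, overlap=2)

# Illustrative genome-scale call with made-up counts.
p_large = hypergeom_enrichment_p(population=5000, annotated=100, query=50, overlap=5)
print(p_small, p_large)
```

When many annotation categories are tested against the same list, the resulting p-values would normally need a multiple-testing correction before interpretation.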

16.
Genome Res ; 25(6): 884-96, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25883323

ABSTRACT

Exon skipping is considered a principal mechanism by which eukaryotic cells expand their transcriptome and proteome repertoires, creating different splice variants with distinct cellular functions. Here we analyze RNA-seq data from 116 transcriptomes in fission yeast (Schizosaccharomyces pombe), covering multiple physiological conditions as well as transcriptional and RNA processing mutants. We applied brute-force algorithms to detect all possible exon-skipping events, which were widespread but rare compared to normal splicing events. Exon-skipping events increased in cells deficient for the nuclear exosome or the 5'-3' exonuclease Dhp1, and also at late stages of meiotic differentiation when nuclear-exosome transcripts decreased. The pervasive exon-skipping transcripts were stochastic, did not increase in specific physiological conditions, and were mostly present at less than one copy per cell, even in the absence of nuclear RNA surveillance and during late meiosis. These exon-skipping transcripts are therefore unlikely to be functional and may reflect splicing errors that are actively removed by nuclear RNA surveillance. The average splicing rate by exon skipping was ∼0.24% in wild type and ∼1.75% in nuclear exonuclease mutants. We also detected approximately 250 circular RNAs derived from single or multiple exons. These circular RNAs were rare and stochastic, although a few became stabilized during quiescence and in splicing mutants. Using an exhaustive search algorithm, we also uncovered thousands of previously unknown splice sites, indicating pervasive splicing; yet most of these splicing variants were cryptic and increased in nuclear degradation mutants. This study highlights widespread but low frequency alternative or aberrant splicing events that are targeted by nuclear RNA surveillance.


Subject(s)
Exons , Genome, Fungal , RNA, Nuclear/genetics , Schizosaccharomyces/genetics , Alternative Splicing , Exoribonucleases/genetics , Exoribonucleases/metabolism , Meiosis , RNA/genetics , RNA/metabolism , RNA, Circular , RNA, Nuclear/metabolism , Schizosaccharomyces/metabolism , Schizosaccharomyces pombe Proteins/genetics , Schizosaccharomyces pombe Proteins/metabolism , Sequence Alignment , Sequence Analysis, RNA , Transcriptome
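A brute-force search for exon skipping can be pictured as enumerating every junction that joins two non-adjacent exons of a gene and then looking for reads that span it. The Python sketch below covers only the enumeration step; the coordinate convention, the function name `skipping_junctions`, and the example gene are assumptions made for illustration, and the abstract does not describe the authors' actual implementation.

```python
from itertools import combinations


def skipping_junctions(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Candidate junctions (donor_end, acceptor_start) that skip at least one exon.

    `exons` holds (start, end) coordinates sorted in transcript order.
    """
    return [
        (exons[i][1], exons[j][0])
        for i, j in combinations(range(len(exons)), 2)
        if j - i >= 2  # adjacent exons (j == i + 1) form the normal junction
    ]


# Hypothetical four-exon gene.
exons = [(100, 200), (300, 400), (500, 600), (700, 800)]
candidates = skipping_junctions(exons)
print(candidates)
```

For a gene with n exons this yields (n - 1)(n - 2) / 2 candidates, so exhaustive enumeration stays cheap in a genome where most genes have few introns.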
17.
Nat Genet ; 47(3): 235-41, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25665008

ABSTRACT

Natural variation within species reveals aspects of genome evolution and function. The fission yeast Schizosaccharomyces pombe is an important model for eukaryotic biology, but researchers typically use one standard laboratory strain. To extend the usefulness of this model, we surveyed the genomic and phenotypic variation in 161 natural isolates. We sequenced the genomes of all strains, finding moderate genetic diversity (π = 3 × 10⁻³ substitutions/site) and weak global population structure. We estimate that dispersal of S. pombe began during human antiquity (∼340 BCE), and ancestors of these strains reached the Americas at ∼1623 CE. We quantified 74 traits, finding substantial heritable phenotypic diversity. We conducted 223 genome-wide association studies, with 89 traits showing at least one association. The most significant variant for each trait explained 22% of the phenotypic variance on average, with indels having larger effects than SNPs. This analysis represents a rich resource to examine genotype-phenotype relationships in a tractable model.


Subject(s)
Genome, Fungal , Schizosaccharomyces/genetics , Genetic Variation , Genome-Wide Association Study/methods , Genomics/methods , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide
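The diversity statistic π quoted in this abstract is the average number of pairwise differences per site. A minimal Python sketch of that definition follows; it assumes gap-free, equal-length aligned sequences, and the function name `nucleotide_diversity` and the toy alignment are hypothetical. It is not the pipeline used in the study.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site over equal-length aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching positions for every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment: three sequences of 10 sites, one of which carries a single variant.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
pi = nucleotide_diversity(toy)
print(f"{pi:.4f}")
```

On this scale, a value of 3 × 10⁻³ means that two randomly chosen strains differ at about three sites per thousand.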
18.
G3 (Bethesda) ; 5(1): 145-55, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25452419

ABSTRACT

Genetic factors underlying aging are remarkably conserved from yeast to human. The fission yeast Schizosaccharomyces pombe is an emerging genetic model to analyze cellular aging. Chronological lifespan (CLS) has been studied in stationary-phase yeast cells depleted for glucose, which only survive for a few days. Here, we analyzed CLS in quiescent S. pombe cells deprived of nitrogen, which arrest in a differentiated, G0-like state and survive for more than 2 months. We applied parallel mutant phenotyping by barcode sequencing (Bar-seq) to assay pooled haploid deletion mutants as they aged together during long-term quiescence. As expected, mutants with defects in autophagy or quiescence were under-represented or not detected. Lifespan scores could be calculated for 1199 mutants. We focus the discussion on the 48 most long-lived mutants, including both known aging genes in other model systems and genes not previously implicated in aging. Genes encoding membrane proteins were particularly prominent as pro-aging factors. We independently verified the extended CLS in individual assays for 30 selected mutants, showing the efficacy of the screen. We also applied Bar-seq to profile all pooled deletion mutants for proliferation under a standard growth condition. Unlike for stationary-phase cells, no inverse correlation between growth and CLS of quiescent cells was evident. These screens provide a rich resource for further studies, and they suggest that the quiescence model can provide unique, complementary insights into cellular aging.


Subject(s)
Mutation , Schizosaccharomyces/genetics , DNA Barcoding, Taxonomic , DNA, Fungal/genetics , Schizosaccharomyces/growth & development
19.
Mol Cell Biol ; 34(18): 3500-14, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-25002536

ABSTRACT

The acetylation state of histones, controlled by histone acetyltransferases (HATs) and deacetylases (HDACs), profoundly affects DNA transcription and repair by modulating chromatin accessibility to the cellular machinery. The Schizosaccharomyces pombe HDAC Clr6 (human HDAC1) binds to different sets of proteins that define functionally distinct complexes: I, I', and II. Here, we determine the composition, architecture, and functions of a new Clr6 HDAC complex, I'', delineated by the novel proteins Nts1, Mug165, and Png3. Deletion of nts1 causes increased sensitivity to genotoxins and deregulated expression of Tf2 elements, long noncoding RNA, and subtelomeric and stress-related genes. Similar, but more pervasive, phenotypes are observed upon Clr6 inactivation, supporting the designation of complex I'' as a mediator of a key subset of Clr6 functions. We also reveal that with the exception of Tf2 elements, the genome-wide loading sites and loci regulated by Clr6 I'' do not correlate. Instead, Nts1 loads at genes that are expressed in midmeiosis, following oxidative stress, or are periodically expressed. Collective data suggest that Clr6 I'' has (i) indirect effects on gene expression, conceivably by mediating higher-order chromatin organization of subtelomeres and Tf2 elements, and (ii) direct effects on the transcription of specific genes in response to certain cellular or environmental stimuli.


Subject(s)
Cell Cycle Proteins/metabolism , Histone Deacetylases/metabolism , Schizosaccharomyces pombe Proteins/genetics , Schizosaccharomyces pombe Proteins/metabolism , Schizosaccharomyces/enzymology , Cell Cycle Proteins/genetics , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Fungal , Epigenesis, Genetic , Gene Expression Regulation, Fungal , Genome, Fungal , Genomic Instability , Meiosis , Phenotype , RNA, Fungal/genetics , RNA, Long Noncoding/genetics , Schizosaccharomyces/genetics , Schizosaccharomyces/physiology , Stress, Physiological
20.
Genome Res ; 24(7): 1169-79, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24709818

ABSTRACT

Both canonical and alternative splicing of RNAs are governed by intronic sequence elements and produce transient lariat structures fastened by branch points within introns. To map precisely the location of branch points on a genomic scale, we developed LaSSO (Lariat Sequence Site Origin), a data-driven algorithm which utilizes RNA-seq data. Using fission yeast cells lacking the debranching enzyme Dbr1, LaSSO not only accurately identified canonical splicing events, but also pinpointed novel, but rare, exon-skipping events, which may reflect aberrantly spliced transcripts. Compromised intron turnover perturbed gene regulation at multiple levels, including splicing and protein translation. Notably, Dbr1 function was also critical for the expression of mitochondrial genes and for the processing of self-spliced mitochondrial introns. LaSSO showed better sensitivity and accuracy than algorithms used for computational branch-point prediction or for empirical branch-point determination. Even when applied to a human data set acquired in the presence of debranching activity, LaSSO identified both canonical and exon-skipping branch points. LaSSO thus provides an effective approach for defining high-resolution maps of branch-site sequences and intronic elements on a genomic scale. LaSSO should be useful to validate introns and uncover branch-point sequences in any eukaryote, and it could be integrated into RNA-seq pipelines.


Subject(s)
Algorithms , Chromosome Mapping , Introns , Nucleotide Motifs , RNA Splicing , Regulatory Sequences, Nucleic Acid , Base Sequence , Computational Biology/methods , Databases, Nucleic Acid , Exons , Gene Deletion , Gene Expression Profiling , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Position-Specific Scoring Matrices , RNA Precursors/genetics , RNA, Fungal/genetics , Schizosaccharomyces/genetics , Transcription, Genetic , Transcriptome