1.
Bioinformatics ; 40(4), 2024 Mar 29.
Article En | MEDLINE | ID: mdl-38569896

MOTIVATION: Long-read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g. sequencing mutagenized libraries where multiple distinct clones differ by one or few variants, require the use of barcodes or unique molecular identifiers. Unfortunately, sequencing errors can interfere with correct barcode identification, and a given barcode sequence may be linked to multiple independent clones within a given library. RESULTS: Here we focus on the target application of sequencing mutagenized libraries in the context of multiplexed assays of variant effects (MAVEs). MAVEs are increasingly used to create comprehensive genotype-phenotype maps that can aid clinical variant interpretation. Many MAVE methods use long-read sequencing of barcoded mutant libraries for accurate association of barcode with genotype. Existing long-read sequencing pipelines do not account for inaccurate sequencing or nonunique barcodes. Here, we describe Pacybara, which handles these issues by clustering long reads based on the similarities of (error-prone) barcodes while also detecting barcodes that have been associated with multiple genotypes. Pacybara also detects recombinant (chimeric) clones and reduces false positive indel calls. In three example applications, we show that Pacybara identifies and correctly resolves these issues. AVAILABILITY AND IMPLEMENTATION: Pacybara, freely available at https://github.com/rothlab/pacybara, is implemented using R, Python, and bash for Linux. It runs on GNU/Linux HPC clusters via Slurm, PBS, or GridEngine schedulers. A single-machine simplex version is also available.


High-Throughput Nucleotide Sequencing , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Gene Library , Genotype , Cluster Analysis
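The abstract above describes clustering reads by the similarity of error-prone barcodes and flagging barcodes linked to more than one genotype. The following is a minimal sketch of that general idea, not Pacybara's actual algorithm (which the abstract does not specify). It assumes equal-length barcodes, a Hamming-distance threshold, and greedy abundance-ordered seeding; the function names, the `max_dist` and `min_reads` parameters, and the variant labels in the usage note are all hypothetical.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_barcodes(reads, max_dist=1):
    """Greedily cluster (barcode, genotype) reads by barcode similarity.

    Barcodes are visited from most to least abundant; each one joins the
    first existing seed within max_dist mismatches, or becomes a new seed.
    Returns {seed_barcode: [genotype, ...]}.
    """
    counts = defaultdict(int)
    for barcode, _ in reads:
        counts[barcode] += 1

    seeds = []
    assignment = {}
    # Ties in abundance are broken alphabetically to keep results deterministic.
    for barcode in sorted(counts, key=lambda b: (-counts[b], b)):
        for seed in seeds:
            if hamming(barcode, seed) <= max_dist:
                assignment[barcode] = seed
                break
        else:
            seeds.append(barcode)
            assignment[barcode] = barcode

    clusters = defaultdict(list)
    for barcode, genotype in reads:
        clusters[assignment[barcode]].append(genotype)
    return dict(clusters)


def flag_nonunique(clusters, min_reads=2):
    """Return seeds whose cluster holds two or more well-supported genotypes.

    A genotype counts as supported when at least min_reads reads carry it
    (an illustrative threshold, assumed here).
    """
    flagged = {}
    for seed, genotypes in clusters.items():
        support = defaultdict(int)
        for genotype in genotypes:
            support[genotype] += 1
        strong = sorted(g for g, n in support.items() if n >= min_reads)
        if len(strong) > 1:
            flagged[seed] = strong
    return flagged
```

For example, passing reads such as `("ACGT", "A10V")`, `("ACGA", "A10V")` and several `("TTTT", ...)` reads with two different genotypes would merge `ACGA` into the `ACGT` cluster and flag `TTTT` as non-unique. Because long-read errors include insertions and deletions, a real pipeline would more plausibly use an edit distance than the Hamming distance assumed here.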
2.
Am J Hum Genet ; 110(10): 1769-1786, 2023 10 05.
Article En | MEDLINE | ID: mdl-37729906

Defects in hydroxymethylbilane synthase (HMBS) can cause acute intermittent porphyria (AIP), an acute neurological disease. Although sequencing-based diagnosis can be definitive, ∼⅓ of clinical HMBS variants are missense variants, and most clinically reported HMBS missense variants are designated as "variants of uncertain significance" (VUSs). Using saturation mutagenesis, en masse selection, and sequencing, we applied a multiplexed validated assay to both the erythroid-specific and ubiquitous isoforms of HMBS, obtaining confident functional impact scores for >84% of all possible amino acid substitutions. The resulting variant effect maps generally agreed with biochemical expectations and provide further evidence that HMBS can function as a monomer. Additionally, the maps implicated specific residues as having roles in active site dynamics, which was further supported by molecular dynamics simulations. Most importantly, these maps can help discriminate pathogenic from benign HMBS variants, proactively providing evidence even for yet-to-be-observed clinical missense variants.


Hydroxymethylbilane Synthase , Porphyria, Acute Intermittent , Humans , Hydroxymethylbilane Synthase/chemistry , Hydroxymethylbilane Synthase/genetics , Hydroxymethylbilane Synthase/metabolism , Mutation, Missense/genetics , Porphyria, Acute Intermittent/diagnosis , Porphyria, Acute Intermittent/genetics , Amino Acid Substitution , Molecular Dynamics Simulation
3.
bioRxiv ; 2023 Dec 07.
Article En | MEDLINE | ID: mdl-36865234

Long read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g. sequencing mutagenized libraries where multiple distinct clones differ by one or few variants, require the use of barcodes or unique molecular identifiers. Unfortunately, sequencing errors can interfere with correct barcode identification, and a given barcode sequence may be linked to multiple independent clones within a given library. Here we focus on the target application of sequencing mutagenized libraries in the context of multiplexed assays of variant effects (MAVEs). MAVEs are increasingly used to create comprehensive genotype-phenotype maps that can aid clinical variant interpretation. Many MAVE methods use long-read sequencing of barcoded mutant libraries for accurate association of barcode with genotype. Existing long-read sequencing pipelines do not account for inaccurate sequencing or non-unique barcodes. Here, we describe Pacybara, which handles these issues by clustering long reads based on the similarities of (error-prone) barcodes while also detecting barcodes that have been associated with multiple genotypes. Pacybara also detects recombinant (chimeric) clones and reduces false positive indel calls. In three example applications, we show that Pacybara identifies and correctly resolves these issues.

4.
bioRxiv ; 2023 Feb 06.
Article En | MEDLINE | ID: mdl-36798224

Defects in hydroxymethylbilane synthase (HMBS) can cause Acute Intermittent Porphyria (AIP), an acute neurological disease. Although sequencing-based diagnosis can be definitive, ~⅓ of clinical HMBS variants are missense variants, and most clinically-reported HMBS missense variants are designated as "variants of uncertain significance" (VUS). Using saturation mutagenesis, en masse selection, and sequencing, we applied a multiplexed validated assay to both the erythroid-specific and ubiquitous isoforms of HMBS, obtaining confident functional impact scores for >84% of all possible amino-acid substitutions. The resulting variant effect maps generally agreed with biochemical expectation. However, the maps showed variants at the dimerization interface to be unexpectedly well tolerated, and suggested residue roles in active site dynamics that were supported by molecular dynamics simulations. Most importantly, these HMBS variant effect maps can help discriminate pathogenic from benign variants, proactively providing evidence even for yet-to-be-observed clinical missense variants.

5.
Nat Biotechnol ; 41(1): 140-149, 2023 01.
Article En | MEDLINE | ID: mdl-36217029

Understanding the mechanisms of coronavirus disease 2019 (COVID-19) disease severity to efficiently design therapies for emerging virus variants remains an urgent challenge of the ongoing pandemic. Infection and immune reactions are mediated by direct contacts between viral molecules and the host proteome, and the vast majority of these virus-host contacts (the 'contactome') have not been identified. Here, we present a systematic contactome map of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with the human host encompassing more than 200 binary virus-host and intraviral protein-protein interactions. We find that host proteins genetically associated with comorbidities of severe illness and long COVID are enriched in SARS-CoV-2 targeted network communities. Evaluating contactome-derived hypotheses, we demonstrate that viral NSP14 activates nuclear factor κB (NF-κB)-dependent transcription, even in the presence of cytokine signaling. Moreover, for several tested host proteins, genetic knock-down substantially reduces viral replication. Additionally, we show for USP25 that this effect is phenocopied by the small-molecule inhibitor AZ1. Our results connect viral proteins to human genetic architecture for COVID-19 severity and offer potential therapeutic targets.


COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Proteome/genetics , Post-Acute COVID-19 Syndrome , Virus Replication/genetics , Ubiquitin Thiolesterase/pharmacology
6.
Am J Hum Genet ; 108(7): 1283-1300, 2021 07 01.
Article En | MEDLINE | ID: mdl-34214447

Most rare clinical missense variants cannot currently be classified as pathogenic or benign. Deficiency in human 5,10-methylenetetrahydrofolate reductase (MTHFR), the most common inherited disorder of folate metabolism, is caused primarily by rare missense variants. Further complicating variant interpretation, variant impacts often depend on environment. An important example of this phenomenon is the MTHFR variant p.Ala222Val (c.665C>T), which is carried by half of all humans and has a phenotypic impact that depends on dietary folate. Here we describe the results of 98,336 variant functional-impact assays, covering nearly all possible MTHFR amino acid substitutions in four folinate environments, each in the presence and absence of p.Ala222Val. The resulting atlas of MTHFR variant effects reveals many complex dependencies on both folinate and p.Ala222Val. MTHFR atlas scores can distinguish pathogenic from benign variants and, among individuals with severe MTHFR deficiency, correlate with age of disease onset. Providing a powerful tool for understanding structure-function relationships, the atlas suggests a role for a disordered loop in retaining cofactor at the active site and identifies variants that enable escape of inhibition by S-adenosylmethionine. Thus, a model based on eight MTHFR variant effect maps illustrates how shifting landscapes of environment- and genetic-background-dependent missense variation can inform our clinical, structural, and functional understanding of MTHFR deficiency.


Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Mutation, Missense , Amino Acid Substitution , DNA Mutational Analysis , Diploidy , Gene Library , Genotype , Humans , Methylenetetrahydrofolate Reductase (NADPH2)/deficiency , Methylenetetrahydrofolate Reductase (NADPH2)/physiology , Saccharomyces cerevisiae/genetics
7.
Bioinformatics ; 37(19): 3382-3383, 2021 Oct 11.
Article En | MEDLINE | ID: mdl-33774657

SUMMARY: Multiplexed assays of variant effect (MAVEs) are capable of experimentally testing all possible single nucleotide or amino acid variants in selected genomic regions, generating 'variant effect maps', which provide biochemical insight and functional evidence to enable more rapid and accurate clinical interpretation of human variation. Because the international community applying MAVE approaches is growing rapidly, we developed the online MaveRegistry platform to catalyze collaboration, reduce redundant efforts, allow stakeholders to nominate targets and enable tracking and sharing of progress on ongoing MAVE projects. AVAILABILITY AND IMPLEMENTATION: MaveRegistry service: https://registry.varianteffect.org. MaveRegistry source code: https://github.com/kvnkuang/maveregistry-front-end.

8.
Microorganisms ; 8(8), 2020 Aug 04.
Article En | MEDLINE | ID: mdl-32759834

The Neurospora crassa AOD1 protein is a mitochondrial alternative oxidase that passes electrons directly from ubiquinol to oxygen. The enzyme is encoded by the nuclear aod-1 gene and is produced when the standard electron transport chain is inhibited. We previously identified eleven strains in the N. crassa single gene deletion library that were severely deficient in their ability to produce AOD1 when grown in the presence of chloramphenicol, an inhibitor of mitochondrial translation that is known to induce the enzyme. Three mutants affected previously characterized genes. In this report we examined the remaining mutants and found that the deficiency of AOD1 was due to secondary mutations in all but two of the strains. One of the authentic mutants contained a deletion of the yvh1 gene and was found to have a deficiency of aod-1 transcripts. The YVH1 protein localized to the nucleus and a post-mitochondrial pellet from the cytoplasm. A zinc binding domain in the protein was required for rescue of the AOD1 deficiency. In other organisms, YVH1 is required for ribosome assembly, and mutants have multiple phenotypes. Lack of YVH1 in N. crassa likely also affects ribosome assembly, leading to phenotypes that include altered regulation of AOD1 production.

9.
Nature ; 580(7803): 402-408, 2020 04.
Article En | MEDLINE | ID: mdl-32296183

Global insights into cellular organization and genome function require comprehensive understanding of the interactome networks that mediate genotype-phenotype relationships [1,2]. Here we present a human 'all-by-all' reference interactome map of human binary protein interactions, or 'HuRI'. With approximately 53,000 protein-protein interactions, HuRI has approximately four times as many such interactions as there are high-quality curated interactions from small-scale studies. The integration of HuRI with genome [3], transcriptome [4] and proteome [5] data enables cellular function to be studied within most physiological or pathological cellular contexts. We demonstrate the utility of HuRI in identifying the specific subcellular roles of protein-protein interactions. Inferred tissue-specific networks reveal general principles for the formation of cellular context-specific functions and elucidate potential molecular mechanisms that might underlie tissue-specific phenotypes of Mendelian diseases. HuRI is a systematic proteome-wide reference that links genomic variation to phenotypic outcomes.


Proteome/metabolism , Extracellular Space/metabolism , Humans , Organ Specificity , Protein Interaction Mapping
10.
Cell Syst ; 10(1): 25-38.e10, 2020 01 22.
Article En | MEDLINE | ID: mdl-31668799

Many traits are complex, depending non-additively on variant combinations. Even in model systems, such as the yeast S. cerevisiae, carrying out the high-order variant-combination testing needed to dissect complex traits remains a daunting challenge. Here, we describe "X-gene" genetic analysis (XGA), a strategy for engineering and profiling highly combinatorial gene perturbations. We demonstrate XGA on yeast ABC transporters by engineering 5,353 strains, each deleted for a random subset of 16 transporters, and profiling each strain's resistance to 16 compounds. XGA yielded 85,648 genotype-to-resistance observations, revealing high-order genetic interactions for 13 of the 16 transporters studied. Neural networks yielded intuitive functional models and guided exploration of fluconazole resistance, which was influenced non-additively by five genes. Together, our results showed that highly combinatorial genetic perturbation can functionally dissect complex traits, supporting pursuit of analogous strategies in human cells and other model systems.


Biological Transport/genetics , Membrane Transport Proteins/genetics , Humans
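The figures in the abstract above can be checked with a short calculation: 5,353 strains profiled against 16 compounds is consistent with the reported 85,648 observations, and 16 transporters allow 2^16 possible knockout combinations. The sketch below reproduces that arithmetic and draws one illustrative random genotype. The function name and the 50% per-gene deletion probability are assumptions made here for illustration; the abstract says only that each strain is deleted for a random subset.

```python
import random

N_TRANSPORTERS = 16  # ABC transporters studied (from the abstract)
N_STRAINS = 5353     # engineered strains (from the abstract)
N_COMPOUNDS = 16     # compounds profiled per strain (from the abstract)

# Each transporter is either present or deleted, so the genotype space
# has 2**16 members; 5,353 strains sample only a fraction of it.
possible_genotypes = 2 ** N_TRANSPORTERS

# One resistance measurement per strain-compound pair.
observations = N_STRAINS * N_COMPOUNDS


def random_knockout_genotype(rng, n_genes=N_TRANSPORTERS, p_deleted=0.5):
    """Return a tuple of booleans, True where the gene is deleted.

    p_deleted is a hypothetical parameter, not a value from the study.
    """
    return tuple(rng.random() < p_deleted for _ in range(n_genes))


if __name__ == "__main__":
    print(possible_genotypes, observations)
    print(random_knockout_genotype(random.Random(0)))
```

Under these numbers the engineered strains cover at most about 8% of the 65,536 possible genotypes, which is one reason a model such as the neural networks mentioned in the abstract is useful for generalizing across untested combinations.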
11.
G3 (Bethesda) ; 9(10): 3453-3465, 2019 10 07.
Article En | MEDLINE | ID: mdl-31444295

The Neurospora crassa nuclear aod-1 gene encodes an alternative oxidase that functions in mitochondria. The enzyme provides a branch from the standard electron transport chain by transferring electrons directly from ubiquinol to oxygen. In standard laboratory strains, aod-1 is transcribed at very low levels under normal growth conditions. However, if the standard electron transport chain is disrupted, aod-1 mRNA expression is induced and the AOD1 protein is produced. We previously identified a strain of N. crassa that produces high levels of aod-1 transcript under non-inducing conditions. Here we have crossed this strain to a standard lab strain and determined the genomic sequences of the parents and several progeny. Analysis of the sequence data and the levels of aod-1 mRNA in uninduced cultures revealed that a frameshift mutation in the flbA gene results in the high uninduced expression of aod-1. The flbA gene encodes a regulator of G protein signaling that decreases the activity of the Gα subunit of heterotrimeric G proteins. Our data suggest that strains with a functional flbA gene prevent uninduced expression of aod-1 by inactivating a G protein signaling pathway, and that this pathway is activated in cells grown under conditions that induce aod-1. Induced cells with a deletion of the gene encoding the Gα protein still have a partial increase in aod-1 mRNA levels, suggesting a second pathway for inducing transcription of the gene in N. crassa. We also present evidence that a translational control mechanism prevents production of AOD1 protein in uninduced cultures.


GTP-Binding Proteins/genetics , Gene Expression Regulation, Fungal , Mitochondrial Proteins/biosynthesis , Neurospora crassa/genetics , Neurospora crassa/metabolism , Oxidoreductases/biosynthesis , Plant Proteins/biosynthesis , Mutation , RNA, Messenger/genetics , RNA, Messenger/metabolism
12.
Nat Commun ; 10(1): 1240, 2019 03 18.
Article En | MEDLINE | ID: mdl-30886144

Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other's partners. This approach, which mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer insights into disease mechanisms and can complement future experimental efforts to complete the human interactome.


Models, Biological , Protein Interaction Mapping/methods , Protein Interaction Maps , Algorithms , Animals , Arabidopsis Proteins/metabolism , Caenorhabditis elegans Proteins/metabolism , Computational Biology/methods , Datasets as Topic , Drosophila Proteins/metabolism , Humans , Mice , Saccharomyces cerevisiae Proteins/metabolism , Schizosaccharomyces pombe Proteins/metabolism , Software
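The abstract above says the L3 approach relies on network paths of length three. A minimal sketch of such a score follows: for each pair of unlinked proteins X and Y, it sums over all paths X-U-V-Y. The optional division by the square root of the degrees of U and V is an assumption added here as one way to damp the influence of hubs; the abstract does not state the exact formula. The function name and the adjacency-matrix input format are likewise hypothetical.

```python
from math import sqrt


def l3_scores(adj, degree_normalized=True):
    """Score unlinked node pairs by their paths of length three.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with no
    self-loops. Returns {(x, y): score} for x < y, listing only unlinked
    pairs that are joined by at least one path x-u-v-y.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    scores = {}
    for x in range(n):
        for y in range(x + 1, n):
            if adj[x][y]:
                continue  # already linked, not a prediction candidate
            total = 0.0
            for u in range(n):
                if not adj[x][u]:
                    continue
                for v in range(n):
                    if adj[u][v] and adj[v][y]:
                        if degree_normalized:
                            # Assumed hub correction: down-weight paths
                            # that pass through high-degree intermediates.
                            total += 1.0 / sqrt(degree[u] * degree[v])
                        else:
                            total += 1.0
            if total > 0:
                scores[(x, y)] = total
    return scores
```

On a four-node chain 0-1-2-3, the only candidate pair with a length-three path is (0, 3). Because X and Y are unlinked and the graph has no self-loops, every walk X-U-V-Y counted here is a genuine path with four distinct nodes, so no extra filtering is needed.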
...