Abstract
We describe a multiplex genome engineering technology in Saccharomyces cerevisiae based on annealing synthetic oligonucleotides at the lagging strand of DNA replication. The mechanism is independent of Rad51-directed homologous recombination and avoids the creation of double-strand DNA breaks, enabling precise chromosome modifications at single base-pair resolution with an efficiency of >40%, without unintended mutagenic changes at the targeted genetic loci. We observed the simultaneous incorporation of up to 12 oligonucleotides with as many as 60 targeted mutations in one transformation. Iterative transformations of a complex pool of oligonucleotides rapidly produced large combinatorial genomic diversity >10⁵. This method was used to diversify a heterologous β-carotene biosynthetic pathway that produced genetic variants with precise mutations in promoters, genes, and terminators, leading to altered carotenoid levels. Our approach of engineering the conserved processes of DNA replication, repair, and recombination could be automated and establishes a general strategy for multiplex combinatorial genome engineering in eukaryotes.
Subjects
Genetic Engineering/methods , Saccharomyces cerevisiae/genetics , DNA Replication , Escherichia coli/genetics , Gene Editing , Oligonucleotides/chemistry
Abstract
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Subjects
COVID-19/prevention & control , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Epidemics , Humans , Internet , Mice , Pseudogenes/genetics , RNA, Long Noncoding/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Transcription, Genetic/genetics
Abstract
Next-generation DNA sequencing has revealed the complete genome sequences of numerous organisms, establishing a fundamental and growing understanding of genetic variation and phenotypic diversity. Engineering at the gene, network and whole-genome scale aims to introduce targeted genetic changes both to explore emergent phenotypes and to introduce new functionalities. Expansion of these approaches into massively parallel platforms establishes the ability to generate targeted genome modifications, elucidating causal links between genotype and phenotype, as well as the ability to design and reprogramme organisms. In this Review, we explore techniques and applications in genome engineering, outlining key advances and defining challenges.
Subjects
Genetic Engineering/methods , Genome , High-Throughput Nucleotide Sequencing/methods , Animals , Gene Targeting/methods , Genetic Variation , Genotype , Humans
Abstract
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.
Subjects
Databases, Genetic , Genome, Human/genetics , Genomics , Pseudogenes/genetics , Animals , Computational Biology , Humans , Internet , Mice , Molecular Sequence Annotation , Software
Abstract
Coral reefs are increasingly threatened by thermal bleaching and tropical storm events associated with rising sea surface temperatures. Deeper habitats offer some protection from these impacts and may safeguard reef-coral biodiversity, but their faunas are largely undescribed for the Indo-Pacific. Here, we show high species richness of scleractinian corals in mesophotic habitats (30-125 m) for the northern Great Barrier Reef region that greatly exceeds previous records for mesophotic habitats globally. Overall, 45% of shallow-reef species (≤30 m), 78% of genera, and all families extended below 30 m depth, with 13% of species, 41% of genera, and 78% of families extending below 45 m. Maximum depth of occurrence showed a weak relationship to phylogeny, but a strong correlation with maximum latitudinal extent. Species recorded in the mesophotic had a significantly greater than expected probability of also occurring in shaded microhabitats and at higher latitudes, consistent with light as a common limiting factor. The findings suggest an important role for deeper habitats, particularly depths 30-45 m, in preserving evolutionary lineages of Indo-Pacific corals. Deeper reef areas are clearly more diverse than previously acknowledged and therefore deserve full consideration in our efforts to protect the world's coral reef biodiversity.
Subjects
Anthozoa , Biodiversity , Phylogeny , Animals , Anthozoa/classification , Coral Reefs , Queensland
Abstract
Mass bleaching associated with unusually high sea temperatures represents one of the greatest threats to corals and coral reef ecosystems. Deeper reef areas are hypothesized as potential refugia, but the susceptibility of scleractinian species over depth has not been quantified. During the most severe bleaching event on record, we found up to 83% of coral cover severely affected on Maldivian reefs at a depth of 3-5 m, but significantly reduced effects at 24-30 m. Analysis of 153 species' responses showed depth, shading and species identity had strong, significant effects on susceptibility. Overall, 73.3% of the shallow-reef assemblage had individuals at a depth of 24-30 m with reduced effects, potentially mitigating local extinction and providing a source of recruits for population recovery. Although susceptibility was phylogenetically constrained, species-level effects caused most lineages to contain some partially resistant species. Many genera showed wide variation between species, including Acropora, previously considered highly susceptible. Extinction risk estimates showed species and lineages of concern and those likely to dominate following repeated events. Our results show that deeper reef areas provide refuge for a large proportion of scleractinian species during severe bleaching events and that the deepest occurring individuals of each population have the greatest potential to survive and drive reef recovery.
Subjects
Anthozoa/physiology , Coral Reefs , Environmental Monitoring , Hot Temperature/adverse effects , Animals , Indian Ocean Islands , Species Specificity
Abstract
ATP synthase's intrinsic molecular electrostatic potential (MESP) adds constructively to, and hence reinforces, the chemiosmotic voltage. This ATP synthase voltage represents a new free energy term that appears to have been overlooked. This term is at least roughly equal in order of magnitude and opposite in sign to the energy needed to be dissipated as a Maxwell's demon (Landauer principle).
Subjects
Adenosine Triphosphatases/metabolism , Adenosine Triphosphatases/analysis , Models, Molecular , Static Electricity , Thermodynamics
Abstract
Coral reefs are the epitome of species diversity, yet the number of described scleractinian coral species, the framework-builders of coral reefs, remains moderate by comparison. DNA sequencing studies are rapidly challenging this notion by exposing a wealth of undescribed diversity, but the evolutionary and ecological significance of this diversity remains largely unclear. Here, we present an annotated genome for one of the most ubiquitous corals in the Indo-Pacific (Pachyseris speciosa) and uncover, through a comprehensive genomic and phenotypic assessment, that it comprises morphologically indistinguishable but ecologically divergent lineages. Demographic modeling based on whole-genome resequencing indicated that morphological crypsis (across micro- and macromorphological traits) was due to ancient morphological stasis rather than recent divergence. Although the lineages occur sympatrically across shallow and mesophotic habitats, extensive genotyping using a rapid molecular assay revealed differentiation of their ecological distributions. Leveraging "common garden" conditions facilitated by the overlapping distributions, we assessed physiological and quantitative skeletal traits and demonstrated concurrent phenotypic differentiation. Lastly, spawning observations of genotyped colonies highlighted the potential role of temporal reproductive isolation in the limited admixture, with consistent genomic signatures in genes related to morphogenesis and reproduction. Overall, our findings demonstrate the presence of ecologically and phenotypically divergent coral species without substantial morphological differentiation and provide new leads into the potential mechanisms facilitating such divergence. More broadly, they indicate that our current taxonomic framework for reef-building corals may be scratching the surface of the ecologically relevant diversity on coral reefs, consequently limiting our ability to protect or restore this diversity effectively.
Subjects
Anthozoa/classification , Biodiversity , Coral Reefs , Tropical Climate , Animals , Anthozoa/genetics , Morphogenesis/genetics , Reproduction/genetics
Abstract
Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the mouse.pseudogene.org resource). We also annotate 165 unitary pseudogenes in mouse and 303 in human. The overall pseudogene repertoire in mouse is similar to that in human in terms of size, biotype distribution, and family composition (e.g. with GAPDH and ribosomal proteins being the largest families). Notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of all pseudogenes are unique, reflecting strain-specific evolution. Finally, we find that ~15% of the mouse pseudogenes are transcribed, and that highly transcribed parent genes tend to give rise to many processed pseudogenes.
Subjects
Pseudogenes/genetics , Transcription, Genetic , Animals , Conserved Sequence/genetics , Evolution, Molecular , Gene Ontology , Genome , Humans , Mice, Inbred C57BL , Molecular Sequence Annotation , Species Specificity
Abstract
Evidence suggests that the mitochondrial (mt)DNA of anthozoans is evolving at a slower tempo than their nuclear DNA; however, parallel surveys of nuclear and mitochondrial variations and calibrated rates of both synonymous and nonsynonymous substitutions across taxa are needed in order to support this scenario. We examined species of the scleractinian coral genus Acropora, including previously unstudied species, for molecular variations in protein-coding genes and noncoding regions of both nuclear and mt genomes. DNA sequences of a calmodulin (CaM)-encoding gene region containing three exons and two introns, and of a 411-bp mt intergenic spacer (IGS) spanning the cytochrome b (cytb) and NADH 2 genes, were obtained from 49 Acropora species. The molecular evolutionary rates of coding and noncoding regions in nuclear and mt genomes were compared in conjunction with published data, including mt cytochrome b, the control region, and nuclear Pax-C introns. Direct sequencing of the mtIGS revealed an average interspecific variation comparable to that seen in published data for mt cytb. The average interspecific variation of the nuclear genome was two to five times greater than that of the mt genome. Based on calibration against the closure of the Panama Isthmus (3.0 mya) and the closure of the Tethys Seaway (12 mya), synonymous substitution rates ranged from 0.367% to 1.467% Ma⁻¹ for nuclear CaM, about 4.8 times faster than those of mt cytb (0.076-0.303% Ma⁻¹). This is similar to the finding in plant genomes that the nuclear genome is evolving at least five times faster than its mitochondrial counterpart.
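The "about 4.8 times faster" comparison can be checked directly from the rate ranges quoted in the abstract. The following is only an arithmetic sketch: the variable names are illustrative, and reading the low and high ends of each range as corresponding to the two calibration dates is an assumption, not something the abstract states.

```python
# Synonymous substitution rates quoted in the abstract (% per Ma), as (low, high).
nuclear_cam = (0.367, 1.467)   # nuclear calmodulin (CaM)
mt_cytb = (0.076, 0.303)       # mitochondrial cytochrome b

# Nuclear-to-mitochondrial rate ratio at each end of the quoted ranges.
ratio_low = nuclear_cam[0] / mt_cytb[0]
ratio_high = nuclear_cam[1] / mt_cytb[1]

# Width of each range (high / low). Assumption: the two ends reflect the
# 12 mya and 3.0 mya calibrations, whose ratio is 12 / 3 = 4.
nuclear_span = nuclear_cam[1] / nuclear_cam[0]
mt_span = mt_cytb[1] / mt_cytb[0]

print(f"nuclear/mt ratio: {ratio_low:.2f} to {ratio_high:.2f}")
print(f"range width: nuclear {nuclear_span:.2f}, mt {mt_span:.2f}")
```

Both ends of the ranges give a nuclear-to-mitochondrial ratio that rounds to 4.8, and each range spans roughly a factor of four, which would be consistent with the fourfold difference between the two calibration dates.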
Subjects
Anthozoa/genetics , DNA/genetics , Evolution, Molecular , Genes, Mitochondrial/genetics , Genome , Plants/genetics , Animals , Genetic Variation , Phylogeny
Abstract
Mesophotic coral ecosystems (MCEs) and temperate mesophotic ecosystems (TMEs) occur at depths of roughly 30-150 m and are characterized by the presence of photosynthetic organisms despite reduced light availability. Exploration of these ecosystems dates back several decades, but our knowledge remained extremely limited until about a decade ago, when a renewed interest resulted in the establishment of a rapidly growing research community. Here, we present the 'mesophotic.org' database, a comprehensive and curated repository of scientific literature on mesophotic ecosystems. Through both manually curated and automatically extracted metadata, the repository facilitates rapid retrieval of available information about particular topics (e.g. taxa or geographic regions), exploration of spatial/temporal trends in research and identification of knowledge gaps. The repository can be queried to comprehensively obtain available data to address large-scale questions and guide future research directions. Overall, the 'mesophotic.org' repository provides an independent and open-source platform for the ever-growing research community working on MCEs and TMEs to collate and expedite our understanding of the occurrence, composition and functioning of these ecosystems. Database URL: http://mesophotic.org/.
Subjects
Databases, Factual , Ecosystem , Geography , Publications
Abstract
Post-translational phosphorylation is essential to human cellular processes, but the transient, heterogeneous nature of this modification complicates its study in native systems. We developed an approach to interrogate phosphorylation and its role in protein-protein interactions on a proteome-wide scale. We genetically encoded phosphoserine in recoded E. coli and generated a peptide-based heterologous representation of the human serine phosphoproteome. We designed a single-plasmid library encoding >100,000 human phosphopeptides and confirmed the site-specific incorporation of phosphoserine in >36,000 of these peptides. We then integrated our phosphopeptide library into an approach known as Hi-P to enable proteome-level screens for serine-phosphorylation-dependent human protein interactions. Using Hi-P, we found hundreds of known and potentially new phosphoserine-dependent interactors with 14-3-3 proteins and WW domains. These phosphosites retained important binding characteristics of the native human phosphoproteome, as determined by motif analysis and pull-downs using full-length phosphoproteins. This technology can be used to interrogate user-defined phosphoproteomes in any organism, tissue, or disease of interest.
Subjects
Peptides/genetics , Protein Interaction Maps/genetics , Proteome/genetics , Serine Proteases/genetics , 14-3-3 Proteins/chemistry , 14-3-3 Proteins/genetics , Amino Acid Motifs/genetics , Escherichia coli/genetics , Gene Library , Humans , Peptides/chemistry , Phosphorylation , Phosphoserine/chemistry , Plasmids/genetics , Serine Proteases/chemistry , WW Domains/genetics
Abstract
AIM: To explore the feasibility and reliability of Clinical Coding Surveillance (CCS) for the routine monitoring of Adverse Drug Events (ADE) and describe the characteristics of harm identified through this approach in a large district health board (DHB). METHOD: All hospital admissions at Waitemata DHB from 2015 to 2016 with an ADE-related ICD-10-AM code of Y40-Y59, X40-X49 or T36-T50 were extracted from clinical coded data. The data were analysed using descriptive statistics, statistical process control and Pareto charts. Two clinicians assessed a random sample of 140 ADEs for their accuracy against what was clinically documented in medical records. RESULTS: A total of 11,999 ADEs were identified in 244,992 admissions (4.9 ADEs per 100 admissions). ADEs were more prevalent in older adults and associated with longer average lengths of stay and medicines such as analgesics, antibiotics, anticoagulants and diuretics. Only 2,164 (18%) of ADEs were classified as originating within hospital. Of ADEs originating outside of the hospital, the main causes were poisoning by psychotropics, anti-epileptics and anti-parkinsonism agents and non-opioid analgesics. Clinicians agreed that 91% of ADE-positive admissions were accurately classified as per clinical documentation. CONCLUSION: CCS is a feasible and reliable approach for the routine monitoring of ADEs in hospitals.
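The code-range filter and the headline rates reported above can be illustrated with a short sketch. This is not the study's extraction pipeline: the function names, the assumed code layout (one letter, two digits, optional suffix such as "Y45.1") and the matching on the three-character category are all assumptions made for illustration. Only the three code ranges and the counts come from the abstract.

```python
import re

# ADE-related code ranges named in the abstract: Y40-Y59, X40-X49, T36-T50.
ADE_RANGES = [("Y", 40, 59), ("X", 40, 49), ("T", 36, 50)]


def is_ade_code(code: str) -> bool:
    """Return True if the code's three-character category is in an ADE range.

    Assumes codes look like one letter followed by two digits, with an
    optional suffix (hypothetical layout, e.g. "Y45.1").
    """
    match = re.match(r"^([A-Z])(\d{2})", code.strip().upper())
    if not match:
        return False
    letter, number = match.group(1), int(match.group(2))
    return any(letter == prefix and low <= number <= high
               for prefix, low, high in ADE_RANGES)


def rate_per_100(events: int, admissions: int) -> float:
    """Events per 100 admissions."""
    return 100.0 * events / admissions


# Counts reported in the abstract.
ade_rate = rate_per_100(11_999, 244_992)        # ADEs per 100 admissions
in_hospital_share = 100.0 * 2_164 / 11_999      # % of ADEs originating in hospital

print(f"{ade_rate:.1f} ADEs per 100 admissions")
print(f"{in_hospital_share:.0f}% originating within hospital")
```

The computed values round to the 4.9 per 100 admissions and 18% stated in the abstract.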
Subjects
Adverse Drug Reaction Reporting Systems/statistics & numerical data , Clinical Coding/statistics & numerical data , Adolescent , Adult , Aged , Aged, 80 and over , Child , Child, Preschool , Feasibility Studies , Female , Hospitalization , Hospitals , Humans , Infant , Infant, Newborn , Male , Middle Aged , New Zealand , Reproducibility of Results , Young Adult
Abstract
We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.
Subjects
Chromosome Mapping , Genetic Loci , Genome , Haplotypes , Mice, Inbred Strains/genetics , Animals , Animals, Laboratory , Chromosome Mapping/veterinary , Haplotypes/genetics , Mice , Mice, Inbred BALB C/genetics , Mice, Inbred C3H/genetics , Mice, Inbred C57BL/genetics , Mice, Inbred CBA/genetics , Mice, Inbred DBA/genetics , Mice, Inbred NOD/genetics , Mice, Inbred Strains/classification , Molecular Sequence Annotation , Phylogeny , Polymorphism, Single Nucleotide , Species Specificity
Abstract
Mesophotic coral ecosystems in the Indo-Pacific remain relatively unexplored, particularly at lower mesophotic depths (≥60 m), despite their potentially large spatial extent. Here, we used a remotely operated vehicle to conduct a qualitative assessment of the zooxanthellate coral community at lower mesophotic depths (60-125 m) at 10 different locations in the Great Barrier Reef Marine Park and the Coral Sea Commonwealth Marine Reserve. Lower mesophotic coral communities were present at all 10 locations, with zooxanthellate scleractinian corals extending down to ~100 m on walls and ~125 m on steep slopes. Lower mesophotic coral communities were most diverse in the 60-80 m zone, while at depths of ≥100 m the coral community consisted almost exclusively of the genus Leptoseris. Collections of coral specimens (n = 213) between 60 and 125 m depth confirmed the presence of at least 29 different species belonging to 18 genera, including several potential new species and geographic/depth range extensions. Overall, this study highlights that lower mesophotic coral ecosystems are likely to be ubiquitous features on the outer reefs of the Great Barrier Reef and atolls of the Coral Sea, and harbour a generic and species richness of corals that is much higher than thus far reported. Further research efforts are urgently required to better understand and manage these ecosystems as part of the Great Barrier Reef Marine Park and Coral Sea Commonwealth Marine Reserve.
Subjects
Anthozoa , Coral Reefs , Ecosystem , Animals , Anthozoa/classification , Australia , Biodiversity , Temperature
Abstract
Biological systems are complex. In particular, the interactions between molecular components often form dense networks that, more often than not, are criticized for being inscrutable 'hairballs'. We argue that one way of untangling these hairballs is through cross-disciplinary network comparison, leveraging advances in other disciplines to obtain new biological insights. In some cases, such comparisons enable the direct transfer of mathematical formalism between disciplines, precisely describing the abstract associations between entities and allowing us to apply a variety of sophisticated formalisms to biology. In cases where the detailed structure of the network does not permit the transfer of complete formalisms between disciplines, comparison of mechanistic interactions in systems for which we have significant day-to-day experience can provide analogies for interpreting relatively more abstruse biological networks. Here, we illustrate how these comparisons benefit the field with a few specific examples related to network growth, organizational hierarchies, and the evolution of adaptive systems.
Abstract
As the cost of sequencing continues to decrease and the amount of sequence data generated grows, new paradigms for data storage and analysis are increasingly important. The relative scaling behavior of these evolving technologies will impact genomics research moving forward.
Subjects
Computational Biology/trends , High-Throughput Nucleotide Sequencing/economics , Algorithms , Biomedical Research , Computational Biology/methods , Genomics , High-Throughput Nucleotide Sequencing/methods , Humans , Information Storage and Retrieval
Abstract
Analyses of pairwise relatedness represent a key component to addressing many topics in biology. However, such analyses have been limited for two reasons. First, most available programs estimate relatedness based on only a single estimator, making comparison across estimators difficult. Second, all programs to date have been platform specific, working only on a specific operating system. This has the undesirable outcome of making choice of relatedness estimator limited by operating system preference, rather than being based on scientific rationale. Here, we present a new R package, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals. Moreover, simulation functions are provided that allow for easy comparison of the performance of different estimators and for analyses of how much resolution to expect from a given data set. Because this package works in R, it is platform independent. Combined, this functionality should allow for more appropriate analyses and interpretation of pairwise relatedness and will also allow for the integration of relatedness data into larger R workflows.