ABSTRACT
Research over the past decade has suggested important roles for pseudogenes in physiology and disease. In vitro experiments demonstrated that pseudogenes contribute to cell transformation through several mechanisms. However, in vivo evidence for a causal role of pseudogenes in cancer development is lacking. Here, we report that mice engineered to overexpress either the full-length murine B-Raf pseudogene Braf-rs1 or its pseudo "CDS" or "3' UTR" develop an aggressive malignancy resembling human diffuse large B cell lymphoma. We show that Braf-rs1 and its human ortholog, BRAFP1, elicit their oncogenic activity, at least in part, as competitive endogenous RNAs (ceRNAs) that elevate BRAF expression and MAPK activation in vitro and in vivo. Notably, we find that transcriptional or genomic aberrations of BRAFP1 occur frequently in multiple human cancers, including B cell lymphomas. Our engineered mouse models demonstrate the oncogenic potential of pseudogenes and indicate that ceRNA-mediated microRNA sequestration may contribute to the development of cancer.
Subjects
Lymphoma, Large B-Cell, Diffuse/genetics , Proto-Oncogene Proteins B-raf/genetics , Pseudogenes , RNA/metabolism , Animals , Base Sequence , Humans , Lymphoma, Large B-Cell, Diffuse/metabolism , Mice , Molecular Sequence Data , Proto-Oncogene Proteins B-raf/metabolism
ABSTRACT
Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and insulin resistance, along with their respective control strains. Altogether, we identified more than 13 million single-nucleotide variants, indels, and structural variants across these rat strains. Analysis of strain-specific selective sweeps and gene clusters implicated genes and pathways involved in cation transport, angiotensin production, and regulators of oxidative stress in the development of cardiovascular disease phenotypes in rats. Many of the rat loci that we identified overlap with previously mapped loci for related traits in humans, indicating the presence of shared pathways underlying these phenotypes in rats and humans. These data represent a step change in resources available for evolutionary analysis of complex traits in disease models.
Subjects
Rats/classification , Rats/genetics , Animals , Disease Models, Animal , Genome , Phenotype , Phylogeny , Polymorphism, Single Nucleotide , Rats, Inbred Strains
ABSTRACT
Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.
Subjects
Biomedical Research , Genome, Human , Human Genome Project , Europe , Humans
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the status of services that EMBL-EBI data resources provide to scientific communities globally. The scale, openness, rich metadata and extensive curation of EMBL-EBI added-value databases make them particularly well-suited as training sets for deep learning, machine learning and artificial intelligence applications, a selection of which are described here. The data resources at EMBL-EBI can catalyse such developments because they offer sustainable, high-quality data, collected in some cases over decades and made openly available to any researcher, globally. Our aim is for EMBL-EBI data resources to keep providing the foundations for tools and research insights that transform fields across the life sciences.
Subjects
Artificial Intelligence , Computational Biology , Data Management , Databases, Factual , Genome , Internet
ABSTRACT
The European Variation Archive (EVA; https://www.ebi.ac.uk/eva/) is a resource for sharing all types of genetic variation data (SNPs, indels, and structural variants) for all species. The EVA was created in 2014 to provide FAIR access to genetic variation data and has since grown to be a primary resource for genomic variants, hosting more than 3 billion records. The EVA and dbSNP have established a compatible global system to assign unique identifiers to all submitted genetic variants. The EVA is active within the Global Alliance for Genomics and Health (GA4GH), maintaining, contributing to and implementing standards such as VCF, Refget and the Variant Representation Specification (VRS). In this article, we describe the submission and permanent accessioning services along with the different ways the data can be retrieved by the scientific community.
Subjects
Computational Biology , Databases, Genetic , Genetic Variation/genetics , Software , Animals , Genomic Structural Variation/genetics , Genomics , Humans , INDEL Mutation/genetics , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics
ABSTRACT
The European Genome-phenome Archive (EGA; https://ega-archive.org/) is a resource for long-term secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster hosted data reuse, enable reproducibility, and accelerate biomedical and translational research in line with the FAIR principles. Launched in 2008, the EGA has grown quickly, currently archiving over 4,500 studies from nearly one thousand institutions. The EGA operates a distributed data access model in which requests are made to the data controller, not to the EGA; the submitter therefore keeps control over who has access to the data and under which conditions. Given the size and value of data hosted, the EGA is constantly improving its value chain, that is, how the EGA can contribute to enhancing the value of human health data by facilitating its submission, discovery, access, and distribution, as well as leading the design and implementation of standards and methods necessary to deliver the value chain. The EGA has become a key GA4GH Driver Project, leading multiple development efforts and implementing new standards and tools, and has been appointed as an ELIXIR Core Data Resource.
Subjects
Confidentiality/legislation & jurisprudence , Genome, Human , Information Dissemination/methods , Phenomics/organization & administration , Translational Research, Biomedical/methods , Datasets as Topic , Genotype , History, 20th Century , History, 21st Century , Humans , Information Dissemination/ethics , Metadata/ethics , Metadata/statistics & numerical data , Phenomics/history , Phenotype
ABSTRACT
Whereas the emphasis of water splitting is typically on hydrogen generation, there is value in the oxygen produced, especially in the undersea environment and for medicinal applications in the developing world. The generation of pure and breathable oxygen from abundant and accessible sources of water, such as brine and seawater, is challenging owing to the prevalence of the competing halide oxidation reaction to produce halogen and hypohalous acids. We show here that pure O2 may be generated from briny water by using an oxygen evolution catalyst with an overlayer that fulfills the criteria of (i) possessing a point of zero charge that results in halide anion rejection and (ii) promoting the disproportionation of hypohalous acids.
ABSTRACT
Earth-abundant oxygen evolution catalysts (OECs) with extended stability in acid can be constructed by embedding active sites within an acid-stable metal-oxide framework. Here, we report stable NiPbOx films that are able to perform oxygen evolution reaction (OER) catalysis for extended periods of operation (>20 h) in acidic solutions of pH 2.5; conversely, native NiOx catalyst films dissolve immediately. In situ X-ray absorption spectroscopy and ex situ X-ray photoelectron spectroscopy reveal that PbO2 is unperturbed after addition of Ni and/or Fe into the lattice, which serves as an acid-stable, conductive framework for embedded OER active centers. The ability to perform OER in acid allows the mechanism of Fe doping on Ni catalysts to be further probed. Catalyst activity with Fe doping of oxidic Ni OEC under acid conditions, as compared to neutral or basic conditions, supports the contention that the role of Fe3+ in enhancing catalytic activity in Ni oxide catalysts arises from its Lewis acid properties.
ABSTRACT
BACKGROUND: Mice carrying targeted mutations are important for investigating gene function and the role of genes in disease, but off-target mutagenic effects associated with the processes of generating targeted alleles, for instance using CRISPR, and culturing embryonic stem cells, offer opportunities for spontaneous mutations to arise. Identifying spontaneous mutations relies on the detection of phenotypes segregating independently of targeted alleles, and having a broad estimate of the level of mutations generated by intensive breeding programmes is difficult given that many phenotypes are easy to miss if not specifically looked for. Here we present data from a large, targeted knockout programme in which mice were analysed through a phenotyping pipeline. Such spontaneous mutations segregating within mutant lines may confound phenotypic analyses, highlighting the importance of record-keeping and maintaining correct pedigrees. RESULTS: Twenty-five lines out of 1311 displayed different deafness phenotypes that did not segregate with the targeted allele. We observed a variety of phenotypes by Auditory Brainstem Response (ABR) and behavioural assessment and isolated eight lines showing early-onset severe progressive hearing loss, later-onset progressive hearing loss, low frequency hearing loss, or complete deafness, with vestibular dysfunction. The causative mutations identified include deletions, insertions, and point mutations, some of which involve new genes not previously associated with deafness while others are new alleles of genes known to underlie hearing loss. Two of the latter show a phenotype much reduced in severity compared to other mutant alleles of the same gene. We investigated the ES cells from which these lines were derived and determined that only one of the 8 mutations could have arisen in the ES cell, and in that case, only after targeting. Instead, most of the non-segregating mutations appear to have occurred during breeding of mutant mice.
In one case, the mutation arose within the wildtype colony used for expanding mutant lines. CONCLUSIONS: Our data show that spontaneous mutations with observable effects on phenotype are a common side effect of intensive breeding programmes, including those underlying targeted mutation programmes. Such spontaneous mutations segregating within mutant lines may confound phenotypic analyses, highlighting the importance of record-keeping and maintaining correct pedigrees.
Subjects
Deafness , Hearing Loss , Alleles , Animals , Deafness/genetics , Hearing Loss/genetics , Mice , Mutagenesis , Mutation
ABSTRACT
MOTIVATION: The majority of genome analysis tools and pipelines require data to be decrypted for access. This potentially leaves sensitive genetic data exposed, either because the unencrypted data is not removed after analysis, or because the data leaves traces on the permanent storage medium. RESULTS: We defined a file container specification enabling direct byte-level compatible random access to encrypted genetic data stored in community standards such as SAM/BAM/CRAM/VCF/BCF. By standardizing this format, we show how it can be added as a native file format to genomic libraries, enabling direct analysis of encrypted data without the need to create a decrypted copy. AVAILABILITY AND IMPLEMENTATION: The Crypt4GH specification can be found at: http://samtools.github.io/hts-specs/crypt4gh.pdf. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
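The random access described in this abstract can be illustrated with a small offset calculation. This is only a sketch under stated assumptions, not the reference implementation. The segment size, nonce length and tag length below reflect our reading of the Crypt4GH specification and should be checked against it. The function name and the `header_len` parameter are hypothetical, since the real header is variable-length and has to be parsed first. No decryption is performed; the sketch only shows which encrypted bytes a reader would need to fetch for a given plaintext range.

```python
# Assumed layout constants (verify against the Crypt4GH specification).
SEGMENT_PLAIN = 65536   # plaintext bytes per data segment (assumption)
NONCE_LEN = 12          # per-segment nonce length in bytes (assumption)
MAC_LEN = 16            # per-segment authentication tag length (assumption)
SEGMENT_CIPHER = SEGMENT_PLAIN + NONCE_LEN + MAC_LEN


def segments_for_range(start, length, header_len):
    """Locate the encrypted segments covering plaintext[start:start + length].

    Returns (file_offset, byte_count, skip):
      file_offset: position to seek to in the encrypted file
      byte_count:  upper bound on encrypted bytes to read (whole segments)
      skip:        plaintext bytes to discard from the first decrypted segment
    header_len is a hypothetical input: the size of the already-parsed header.
    """
    if start < 0 or length <= 0 or header_len < 0:
        raise ValueError("start and header_len must be >= 0 and length > 0")
    first = start // SEGMENT_PLAIN
    last = (start + length - 1) // SEGMENT_PLAIN
    file_offset = header_len + first * SEGMENT_CIPHER
    byte_count = (last - first + 1) * SEGMENT_CIPHER
    skip = start % SEGMENT_PLAIN
    return file_offset, byte_count, skip
```

Because every segment has a fixed encrypted size under these assumptions, a reader can jump to any plaintext offset with one division instead of decrypting the file from the start.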
ABSTRACT
MOTIVATION: Reference sequences are essential in creating a baseline of knowledge for many common bioinformatics methods, especially those using genomic sequencing. RESULTS: We have created refget, a Global Alliance for Genomics and Health API specification to access reference sequences and sub-sequences using an identifier derived from the sequence itself. We present four reference implementations across in-house and cloud infrastructure, a compliance suite and a web report used to ensure specification conformity across implementations. AVAILABILITY AND IMPLEMENTATION: The refget specification can be found at: https://w3id.org/ga4gh/refget. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
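The idea of "an identifier derived from the sequence itself" can be sketched with standard hash functions. The abstract does not say which digests are used. The MD5 and truncated SHA-512 schemes below, and the upper-casing of the input, are assumptions based on our understanding of the refget specification. The function names are hypothetical.

```python
import base64
import hashlib


def md5_identifier(sequence):
    """Hex MD5 digest of the upper-cased sequence (assumed normalisation)."""
    return hashlib.md5(sequence.upper().encode("ascii")).hexdigest()


def sha512t24u_identifier(sequence):
    """First 24 bytes of the SHA-512 digest, base64url-encoded.

    24 bytes encode to exactly 32 base64 characters, so no padding appears.
    """
    digest = hashlib.sha512(sequence.upper().encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


if __name__ == "__main__":
    seq = "ACGT"
    print(md5_identifier(seq))
    print(sha512t24u_identifier(seq))
```

Because the identifier depends only on the sequence content, two servers holding the same sequence would derive the same identifier without coordinating with each other.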
Subjects
Genomics , Software
ABSTRACT
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
Subjects
Computational Biology , Databases, Nucleic Acid , Genomics , Computational Biology/methods , Europe , Genomics/methods , Molecular Sequence Annotation , Software , User-Computer Interface , Web Browser
ABSTRACT
For over a century, mice have been used to model human disease, leading to many fundamental discoveries about mammalian biology and the development of new therapies. Mouse genetics research has been further catalysed by a plethora of genomic resources developed in the last 20 years, including the genome sequence of C57BL/6J and more recently the first draft reference genomes for 16 additional laboratory strains. Collectively, the comparison of these genomes highlights the extreme diversity that exists at loci associated with the immune system, pathogen response, and key sensory functions, which form the foundation for dissecting phenotypic traits in vivo. We review the current status of the mouse genome across the diversity of the mouse lineage and discuss the value of mice to understanding human disease.
Subjects
Animals, Inbred Strains/genetics , Genome/genetics , Genomics , Animals , Chromosome Mapping , Haplotypes , Humans , Inbreeding , Mice , Phenotype
ABSTRACT
BACKGROUND & AIMS: Extracorporeal shock wave lithotripsy (ESWL) for pancreaticolithiasis is most commonly performed by urologists. We investigated the effects of transitioning from urologist- to gastroenterologist-directed ESWL on case complexity, process measures, and duct clearance. METHODS: We performed a retrospective study of patients who underwent ESWL for pancreaticolithiasis from 2014 through 2019 at a single center. We collected demographic, clinical, radiographic, and procedural data in duplicate and compared case complexity and process measures between the periods the procedure was performed by urologists (January 2014 through February 2017; 18 patients, 0.47 patients/month) vs gastroenterologists (March 2017 through December 2019; 61 patients; 1.79 patients/month). We also compared data on pancreatic duct stone characteristics and technical success (duct clearance, determined by imaging analysis). RESULTS: There were no differences in patient demographics, comorbidities, pancreatic stone morphology, or time from referral to ESWL during the period the procedure was performed by urologists vs gastroenterologists. Patients received a higher mean number of ESWL shocks per session during the gastroenterology period (4341) than during the urology period (3117) (P < .001). A higher proportion of patients underwent same-session endoscopic retrograde cholangiopancreatography during the gastroenterology time period (66%) than the urology time period (6%) (P < .001). A higher proportion of patients had partial or complete duct clearance during the gastroenterology period (71%) than during the urology period (44%) (P = .04). During the urology period, a higher proportion of patients were hospitalized following ESWL, although there was no difference in captured adverse events between the periods. CONCLUSIONS: Transition from urologist- to gastroenterologist-directed ESWL did not affect case complexity or wait times for ESWL. 
However, the transition did result in increased procedure volume, more shocks per ESWL session, and improved duct clearance.
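The per-month procedure volumes quoted in the METHODS can be reproduced from the stated date ranges and patient counts. The short check below assumes that both the first and the last month of each period are counted in full. The helper name is hypothetical.

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of calendar months in a range, counting both endpoints."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


# Urology period: January 2014 through February 2017, 18 patients.
urology_months = months_inclusive(2014, 1, 2017, 2)
urology_rate = 18 / urology_months

# Gastroenterology period: March 2017 through December 2019, 61 patients.
gastro_months = months_inclusive(2017, 3, 2019, 12)
gastro_rate = 61 / gastro_months

if __name__ == "__main__":
    print(urology_months, round(urology_rate, 2))
    print(gastro_months, round(gastro_rate, 2))
```

Under that counting convention, the two periods span 38 and 34 months, which matches the reported 0.47 and 1.79 patients/month.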
Subjects
Calculi , Gastroenterologists , Lithotripsy , Calculi/therapy , Cholangiopancreatography, Endoscopic Retrograde , Humans , Lithotripsy/adverse effects , Retrospective Studies , Treatment Outcome , Urologists
ABSTRACT
The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants (even in genomes as well studied as rat and the great apes) and how these annotations improve cross-species RNA expression experiments.
Subjects
Genome, Human/genetics , Algorithms , Animals , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Sequence Annotation/methods , RNA/genetics , Rats
ABSTRACT
Despite the rapid development of sequencing technologies, the assembly of mammalian-scale genomes into complete chromosomes remains one of the most challenging problems in bioinformatics. To help address this difficulty, we developed Ragout 2, a reference-assisted assembly tool that works for large and complex genomes. By taking one or more target assemblies (generated from an NGS assembler) and one or multiple related reference genomes, Ragout 2 infers the evolutionary relationships between the genomes and builds the final assemblies using a genome rearrangement approach. By using Ragout 2, we transformed NGS assemblies of 16 laboratory mouse strains into sets of complete chromosomes, leaving <5% of sequence unlocalized per set. Various benchmarks, including PCR testing and realigning of long Pacific Biosciences (PacBio) reads, suggest only a small number of structural errors in the final assemblies, comparable with direct assembly approaches. We applied Ragout 2 to the Mus caroli and Mus pahari genomes, which exhibit karyotype-scale variations compared with other genomes from the Muridae family. Chromosome painting maps confirmed most large-scale rearrangements that Ragout 2 detected. We applied Ragout 2 to improve draft sequences of three ape genomes that have recently been published. Ragout 2 transformed three sets of contigs (generated using PacBio reads only) into chromosome-scale assemblies with accuracy comparable to chromosome assemblies generated in the original study using BioNano maps, Hi-C, BAC clones, and FISH.
Subjects
Contig Mapping/methods , Whole Genome Sequencing/methods , Animals , Contig Mapping/standards , Mice , Reference Standards , Whole Genome Sequencing/standards
ABSTRACT
Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.
Subjects
Evolution, Molecular , Genome/genetics , Muridae/genetics , Phylogeny , Animals , Binding Sites , CCCTC-Binding Factor/genetics , Chromosomes/genetics , Karyotyping/methods , Long Interspersed Nucleotide Elements/genetics , Mice , Retroelements/genetics , Species Specificity
ABSTRACT
PURPOSE: Nonadherence to dosing schedules for androgen deprivation therapy increases the risk of testosterone escape for patients with prostate cancer. Two approved formulations of leuprolide acetate, the most commonly prescribed androgen deprivation therapy in the United States, use different extended release delivery technologies: an in situ gel and microspheres. We evaluated the prevalence and impact of late dosing on testosterone suppression for gel and microsphere formulations of leuprolide acetate. MATERIALS AND METHODS: We retrospectively analyzed records of patients with prostate cancer treated with gel or microsphere delivery of leuprolide acetate. Analyses used 2 definitions of "month," "28-day" (late dosing after day 28, 84, 112 or 168) and "extended" (late dosing after day 32, 97, 128 and 194). Frequencies of late dosing and associated testosterone values were calculated. RESULTS: A total of 2,038 patients received gel and 8,360 received microsphere formulations of leuprolide acetate. More than 80% and 27% of injections were late for 28-day and extended month, respectively. For 28-day month late injections 10% (gel delivery) and 14% (microsphere delivery) of testosterone values were above 50 ng/dl, and 25% (gel) vs 33% (microsphere) were above 20 ng/dl. For extended month 18% (gel) vs 25% (microsphere) were above 50 ng/dl, and 34% (gel) vs 44% (microsphere) were above 20 ng/dl. Microsphere leuprolide acetate was 1.5 times more likely to have testosterone above 50/20 ng/dl vs gel. Least square mean testosterone was 34 ng/dl (gel) vs 46 ng/dl (microsphere) for 28-day month, and 48 ng/dl (gel) vs 76 ng/dl (microsphere) for extended month. CONCLUSIONS: Leuprolide acetate therapies were frequently administered late. Gel formulation demonstrated higher rates of testosterone 50 ng/dl or less and 20 ng/dl or less than microsphere formulation. 
Optimal testosterone suppression can impact prostate cancer progression and patient survival, and differences in extended release technology for androgen deprivation therapy appear relevant.
Subjects
Androgen Antagonists/administration & dosage , Leuprolide/administration & dosage , Prostatic Neoplasms/drug therapy , Testosterone/antagonists & inhibitors , Adult , Aged , Aged, 80 and over , Gels , Humans , Male , Microspheres , Middle Aged , Retrospective Studies , Time Factors , United States , Young Adult
ABSTRACT
Next-generation sequencing of human tumours has refined our understanding of the mutational processes operative in cancer initiation and progression, yet major questions remain regarding the factors that induce driver mutations and the processes that shape mutation selection during tumorigenesis. Here we performed whole-exome sequencing on adenomas from three mouse models of non-small-cell lung cancer, which were induced either by exposure to carcinogens (methyl-nitrosourea (MNU) and urethane) or by genetic activation of Kras (Kras(LA2)). Although the MNU-induced tumours carried exactly the same initiating mutation in Kras as seen in the Kras(LA2) model (G12D), MNU tumours had an average of 192 non-synonymous, somatic single-nucleotide variants, compared with only six in tumours from the Kras(LA2) model. By contrast, the Kras(LA2) tumours exhibited a significantly higher level of aneuploidy and copy number alterations compared with the carcinogen-induced tumours, suggesting that carcinogen-induced and genetically engineered models lead to tumour development through different routes. The wild-type allele of Kras has been shown to act as a tumour suppressor in mouse models of non-small-cell lung cancer. We demonstrate that urethane-induced tumours from wild-type mice carry mostly (94%) Kras Q61R mutations, whereas those from Kras heterozygous animals carry mostly (92%) Kras Q61L mutations, indicating a major role for germline Kras status in mutation selection during initiation. The exome-wide mutation spectra in carcinogen-induced tumours overwhelmingly display signatures of the initiating carcinogen, while adenocarcinomas acquire additional C > T mutations at CpG sites. 
These data provide a basis for understanding results from human tumour genome sequencing, which has identified two broad categories of tumours based on the relative frequency of single-nucleotide variations and copy number alterations, and underline the importance of carcinogen models for understanding the complex mutation spectra seen in human cancers.