Abstract
This corrects the article DOI: 10.1038/nature22403.
Abstract
Technology utilizing human induced pluripotent stem cells (iPS cells) has enormous potential to provide improved cellular models of human disease. However, variable genetic and phenotypic characterization of many existing iPS cell lines limits their potential use for research and therapy. Here we describe the systematic generation, genotyping and phenotyping of 711 iPS cell lines derived from 301 healthy individuals by the Human Induced Pluripotent Stem Cells Initiative. Our study outlines the major sources of genetic and phenotypic variation in iPS cells and establishes their suitability as models of complex human traits and cancer. Through genome-wide profiling we find that 5-46% of the variation in different iPS cell phenotypes, including differentiation capacity and cellular morphology, arises from differences between individuals. Additionally, we assess the phenotypic consequences of genomic copy-number alterations that are repeatedly observed in iPS cells. In addition, we present a comprehensive map of common regulatory variants affecting the transcriptome of human pluripotent cells.
Subjects
Genetic Variation/genetics , Induced Pluripotent Stem Cells/metabolism , Cells, Cultured , Cellular Reprogramming/genetics , DNA Copy Number Variations/genetics , Gene Expression Regulation/genetics , Genotype , Humans , Organ Specificity , Phenotype , Quality Control , Quantitative Trait Loci/genetics , Transcriptome/genetics
Abstract
The Open Targets Platform (https://www.targetvalidation.org/) provides users with a queryable knowledgebase and user interface to aid systematic target identification and prioritisation for drug discovery based upon underlying evidence. It is publicly available and the underlying code is open source. Since our last update two years ago, we have had 10 releases to maintain and continuously improve evidence for target-disease relationships from 20 different data sources. In addition, we have integrated new evidence from key datasets, including prioritised targets identified from genome-wide CRISPR knockout screens in 300 cancer models (Project Score), and GWAS/UK BioBank statistical genetic analysis evidence from the Open Targets Genetics Portal. We have evolved our evidence scoring framework to improve target identification. To aid the prioritisation of targets and inform on the potential impact of modulating a given target, we have added evaluation of post-marketing adverse drug reactions and new curated information on target tractability and safety. We have also developed the user interface and backend technologies to improve performance and usability. In this article, we describe the latest enhancements to the Platform, to address the fundamental challenge that developing effective and safe drugs is difficult and expensive.
Subjects
Antineoplastic Agents/therapeutic use , Drugs, Investigational/therapeutic use , Knowledge Bases , Molecular Targeted Therapy/methods , Neoplasms/drug therapy , Software , Antineoplastic Agents/chemistry , Databases, Factual , Datasets as Topic , Drug Discovery/methods , Drugs, Investigational/chemistry , Humans , Internet , Neoplasms/classification , Neoplasms/genetics , Neoplasms/pathology
Abstract
The BioSamples database at EMBL-EBI provides a central hub for sample metadata storage and linkage to other EMBL-EBI resources. BioSamples has recently undergone major changes, both in terms of data content and supporting infrastructure. The data content has more than doubled from around 2 million samples in 2014 to just over 5 million samples in 2018. Fast, reciprocal data exchange was fully established between sister Biosample databases and other INSDC partners, enabling a worldwide common representation and centralization of sample metadata. The BioSamples platform has been upgraded to accommodate anticipated increases in the number of submissions via GA4GH driver projects such as the Human Cell Atlas and the EGA, as well as from mirroring of NCBI dbGaP data. The BioSamples database is now the authoritative repository for all INSDC sample metadata, an ELIXIR Deposition Database for Biomolecular Data and the EMBL-EBI sample metadata hub. To support faster turnaround for sample submission, and to increase scalability and resilience, we have upgraded the BioSamples database backend storage, APIs and user interface. Finally, the website has been redesigned to allow search and retrieval of records based on specific filters, such as 'disease' or 'organism'. These changes are targeted at answering current use cases as well as providing functionalities for future emerging and anticipated developments. Availability: The BioSamples database is freely available at http://www.ebi.ac.uk/biosamples. Content is distributed under the EMBL-EBI Terms of Use available at https://www.ebi.ac.uk/about/terms-of-use.
Subjects
Biological Specimen Banks , Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Genomics/methods , Computational Biology/statistics & numerical data , Genomics/statistics & numerical data , Humans , Information Storage and Retrieval/methods , Internet , Metadata/statistics & numerical data , User-Computer Interface
Abstract
The Open Targets Platform integrates evidence from genetics, genomics, transcriptomics, drugs, animal models and scientific literature to score and rank target-disease associations for drug target identification. The associations are displayed in an intuitive user interface (https://www.targetvalidation.org), and are available through a REST-API (https://api.opentargets.io/v3/platform/docs/swagger-ui) and a bulk download (https://www.targetvalidation.org/downloads/data). In addition to target-disease associations, we also aggregate and display data at the target and disease levels to aid target prioritisation. Since our first publication two years ago, we have made eight releases, added new data sources for target-disease associations, started including causal genetic variants from non genome-wide targeted arrays, added new target and disease annotations, launched new visualisations and improved existing ones and released a new web tool for batch search of up to 200 targets. We have a new URL for the Open Targets Platform REST-API, new REST endpoints and also removed the need for authorisation for API fair use. Here, we present the latest developments of the Open Targets Platform, expanding the evidence and target-disease associations with new and improved data sources, refining data quality, enhancing website usability, and increasing our user base with our training workshops, user support, social media and bioinformatics forum engagement.
Subjects
Computational Biology/methods , Databases, Genetic , Genomics/methods , Information Storage and Retrieval/methods , Molecular Targeted Therapy/methods , Computational Biology/trends , Gene Expression Profiling/methods , Genomics/trends , Humans , Information Storage and Retrieval/trends , Internet , Reproducibility of Results , Software
Abstract
The Human Induced Pluripotent Stem Cell Initiative (HipSci) is establishing a large catalogue of human iPSC lines, arguably the most well characterized collection to date. The HipSci portal enables researchers to choose the right cell line for their experiment, and makes HipSci's rich catalogue of assay data easy to discover and reuse. Each cell line has genomic, transcriptomic, proteomic and cellular phenotyping data. Data are deposited in the appropriate EMBL-EBI archives, including the European Nucleotide Archive (ENA), European Genome-phenome Archive (EGA), ArrayExpress and PRoteomics IDEntifications (PRIDE) databases. The project will make 500 cell lines from healthy individuals, and from 150 patients with rare genetic diseases; these will be available through the European Collection of Authenticated Cell Cultures (ECACC). As of August 2016, 238 cell lines are available for purchase. Project data is presented through the HipSci data portal (http://www.hipsci.org/lines) and is downloadable from the associated FTP site (ftp://ftp.hipsci.ebi.ac.uk/vol1/ftp). The data portal presents a summary matrix of the HipSci cell lines, showing available data types. Each line has its own page containing descriptive metadata, quality information, and links to archived assay data. Analysis results are also available in a Track Hub, allowing visualization in the context of public genomic annotations (http://www.hipsci.org/data/trackhubs).
Subjects
Computational Biology/methods , Databases, Genetic , Genomics/methods , Induced Pluripotent Stem Cells , Cell Line , Genetic Association Studies , Humans , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Proteome , Software , Transcriptome
Abstract
The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI's databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and key word queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives and improved query response times.
Subjects
Databases, Genetic , Cell Line , Europe , Humans , Internet , Neoplasms/genetics , Systems Integration
Abstract
Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only approximately 200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader-associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage.
Subjects
Genome , Genomics , Leishmania/genetics , Leishmaniasis/parasitology , Amino Acid Sequence , Animals , Humans , Leishmania braziliensis/genetics , Leishmania infantum/genetics , Leishmania major/genetics , Leishmaniasis, Cutaneous/parasitology , Leishmaniasis, Visceral/parasitology , Molecular Sequence Data
Abstract
The BioSample Database (http://www.ebi.ac.uk/biosamples) is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE; (ii) minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases and (iii) supporting cross database queries by sample characteristics. Each sample in the database is assigned an accession number. The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples@ebi.ac.uk.
Subjects
Databases, Genetic , Cell Line , Gene Expression , Genomics , Proteomics , Sequence Analysis , Systems Integration , User-Computer Interface
Abstract
Unambiguous cell line authentication is essential to avoid loss of association between data and cells. The risk for loss of references increases with the rapidity that new human pluripotent stem cell (hPSC) lines are generated, exchanged, and implemented. Ideally, a single name should be used as a generally applied reference for each cell line to access and unify cell-related information across publications, cell banks, cell registries, and databases and to ensure scientific reproducibility. We discuss the needs and requirements for such a unique identifier and implement a standard nomenclature for hPSCs, which can be automatically generated and registered by the human pluripotent stem cell registry (hPSCreg). To avoid ambiguities in PSC-line referencing, we strongly urge publishers to demand registration and use of the standard name when publishing research based on hPSC lines.
Subjects
Biological Specimen Banks , Databases, Factual , Pluripotent Stem Cells , Registries , Terminology as Topic , Humans
Abstract
Habitat choice plays a critical role in the processes of host range evolution, specialization, and ecological speciation. Pea aphid, Acyrthosiphon pisum, populations from alfalfa and red clover in eastern North America are known to be genetically differentiated and show genetic preferences for the appropriate host plant. This species feeds on many more hosts, and here we report a study of the genetic variation in host plant preference within and between pea aphid populations collected from eight genera of host plants in southeastern England. Most host-associated populations show a strong, genetically based preference for the host plant from which they were collected. Only in one case (populations from Vicia and Trifolium) was there little difference in the plant preference spectrum between populations. All populations showed a significant secondary preference for the plant on which all the aphid lines were reared: broad bean, Vicia faba, previously suggested to be a "universal host" for pea aphids. Of the total genetic variance in host preference within our sample, 61% could be attributed to preference for the collection host plant and a further 9% to systematic differences in secondary preferences with the residual representing within-population genetic variation between clones. We discuss how a combination of host plant preference and mating on the host plant may promote local adaptation and possibly ecological speciation, and whether a widely accepted host could oppose speciation by mediating gene flow between different populations.