Search | BVS CLAP/SMR-OPS/OMS

1.

Disease ontologies for knowledge graphs.

Kurbatova, Natalja; Swiers, Rowan.

BMC Bioinformatics ; 22(1): 377, 2021 Jul 21.

Article in English | MEDLINE | ID: mdl-34289807

ABSTRACT

BACKGROUND: Data integration to build a biomedical knowledge graph is a challenging task. There are multiple disease ontologies used in data sources and publications, each having its hierarchy. A common task is to map between ontologies, find disease clusters and finally build a representation of the chosen disease area. There is a shortage of published resources and tools to facilitate interactive, efficient and flexible cross-referencing and analysis of multiple disease ontologies commonly found in data sources and research. RESULTS: Our results are represented as a knowledge graph solution that uses disease ontology cross-references and facilitates switching between ontology hierarchies for data integration and other tasks. CONCLUSIONS: Grakn core with pre-installed "Disease ontologies for knowledge graphs" facilitates the biomedical knowledge graph build and provides an elegant solution for the multiple disease ontologies problem.

Subject(s)

Biological Ontologies; Ethnicity; Humans; Information Storage and Retrieval; Knowledge; Pattern Recognition, Automated
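
Illustrative note on entry 1 (not taken from the paper): the abstract describes using ontology cross-references to switch between disease hierarchies. A minimal Python sketch of that idea follows, under loud assumptions: the ontology names, term identifiers, the `XREFS` and `PARENTS` tables, and both helper functions are hypothetical toy constructs, and nothing here reproduces the Grakn-based solution the authors describe.

```python
# Hypothetical toy data: every identifier below is invented for illustration.
# XREFS maps a (ontology, term) pair to equivalent terms in other ontologies.
XREFS = {
    ("ONTO_A", "A:0001"): {("ONTO_B", "B:100")},
}

# PARENTS holds each ontology's own is-a hierarchy (child -> list of parents).
PARENTS = {
    "ONTO_A": {"A:0001": ["A:0000"]},
    "ONTO_B": {"B:100": ["B:010"], "B:010": ["B:001"]},
}


def ancestors(ontology, term):
    """Return all ancestors of a term within a single ontology hierarchy."""
    seen = []
    stack = list(PARENTS[ontology].get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(PARENTS[ontology].get(current, []))
    return seen


def switch_hierarchy(ontology, term, target):
    """Follow cross-references to the target ontology, then list ancestors there."""
    mapped = sorted(t for (o, t) in XREFS.get((ontology, term), set()) if o == target)
    return {t: ancestors(target, t) for t in mapped}


result = switch_hierarchy("ONTO_A", "A:0001", "ONTO_B")
```

Here `result` pairs each cross-referenced term in the target ontology with its ancestors in that ontology's hierarchy, which is the "switching" step in miniature.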

2.

Transcriptome and genome sequencing uncovers functional variation in humans.

Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; 't Hoen, Peter A C; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk P J; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Angel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan.

Nature ; 501(7468): 506-11, 2013 Sep 26.

Article in English | MEDLINE | ID: mdl-24037378

ABSTRACT

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

Subject(s)

Genetic Variation/genetics; Genome, Human/genetics; High-Throughput Nucleotide Sequencing; Sequence Analysis, RNA; Transcriptome/genetics; Alleles; Cell Line, Transformed; Exons/genetics; Gene Expression Profiling; Humans; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; RNA, Messenger/analysis; RNA, Messenger/genetics

3.

Applying the ARRIVE Guidelines to an In Vivo Database.

Karp, Natasha A; Meehan, Terry F; Morgan, Hugh; Mason, Jeremy C; Blake, Andrew; Kurbatova, Natalja; Smedley, Damian; Jacobsen, Julius; Mott, Richard F; Iyer, Vivek; Matthews, Peter; Melvin, David G; Wells, Sara; Flenniken, Ann M; Masuya, Hiroshi; Wakana, Shigeharu; White, Jacqueline K; Lloyd, K C Kent; Reynolds, Corey L; Paylor, Richard; West, David B; Svenson, Karen L; Chesler, Elissa J; de Angelis, Martin Hrabe; Tocchini-Valentini, Glauco P; Sorg, Tania; Herault, Yann; Parkinson, Helen; Mallon, Ann-Marie; Brown, Steve D M.

PLoS Biol ; 13(5): e1002151, 2015 May.

Article in English | MEDLINE | ID: mdl-25992600

ABSTRACT

The Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines were developed to address the lack of reproducibility in biomedical animal studies and improve the communication of research findings. While intended to guide the preparation of peer-reviewed manuscripts, the principles of transparent reporting are also fundamental for in vivo databases. Here, we describe the benefits and challenges of applying the guidelines for the International Mouse Phenotyping Consortium (IMPC), whose goal is to produce and phenotype 20,000 knockout mouse strains in a reproducible manner across ten research centres. In addition to ensuring the transparency and reproducibility of the IMPC, the solutions to the challenges of applying the ARRIVE guidelines in the context of IMPC will provide a resource to help guide similar initiatives in the future.

Subject(s)

Animal Experimentation/standards; Databases as Topic; Guidelines as Topic; Phenotype; Animals; Mice

4.

ArrayExpress update--simplifying data submissions.

Kolesnikov, Nikolay; Hastings, Emma; Keays, Maria; Melnichuk, Olga; Tang, Y Amy; Williams, Eleanor; Dylag, Miroslaw; Kurbatova, Natalja; Brandizi, Marco; Burdett, Tony; Megy, Karyn; Pilicheva, Ekaterina; Rustici, Gabriella; Tikhonov, Andrew; Parkinson, Helen; Petryszak, Robert; Sarkans, Ugis; Brazma, Alvis.

Nucleic Acids Res ; 43(Database issue): D1113-6, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25361974

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.

Subject(s)

Databases, Genetic; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Genomics; High-Throughput Nucleotide Sequencing; Internet; Software

5.

The International Mouse Phenotyping Consortium Web Portal, a unified point of access for knockout mice and related phenotyping data.

Koscielny, Gautier; Yaikhom, Gagarine; Iyer, Vivek; Meehan, Terrence F; Morgan, Hugh; Atienza-Herrero, Julian; Blake, Andrew; Chen, Chao-Kung; Easty, Richard; Di Fenza, Armida; Fiegel, Tanja; Grifiths, Mark; Horne, Alan; Karp, Natasha A; Kurbatova, Natalja; Mason, Jeremy C; Matthews, Peter; Oakley, Darren J; Qazi, Asfand; Regnart, Jack; Retha, Ahmad; Santos, Luis A; Sneddon, Duncan J; Warren, Jonathan; Westerberg, Henrik; Wilson, Robert J; Melvin, David G; Smedley, Damian; Brown, Steve D M; Flicek, Paul; Skarnes, William C; Mallon, Ann-Marie; Parkinson, Helen.

Nucleic Acids Res ; 42(Database issue): D802-9, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24194600

ABSTRACT

The International Mouse Phenotyping Consortium (IMPC) web portal (http://www.mousephenotype.org) provides the biomedical community with a unified point of access to mutant mice and a rich collection of related emerging and existing mouse phenotype data. IMPC mouse clinics worldwide follow rigorous, highly structured and standardized protocols for the experimentation, collection and dissemination of data. Dedicated 'data wranglers' work with each phenotyping center to collate data and perform quality control of data. An automated statistical analysis pipeline has been developed to identify knockout strains with a significant change in the phenotype parameters. Annotation with biomedical ontologies allows biologists and clinicians to easily find mouse strains with phenotypic traits relevant to their research. Data integration with other resources will provide insights into mammalian gene function and human disease. As phenotype data become available for every gene in the mouse, the IMPC web portal will become an invaluable tool for researchers studying the genetic contributions of genes to human diseases.

Subject(s)

Databases, Genetic; Mice, Knockout; Phenotype; Animals; Biological Ontologies; Internet; Mice

6.

ArrayExpress update--trends in database growth and links to data analysis tools.

Rustici, Gabriella; Kolesnikov, Nikolay; Brandizi, Marco; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Ison, Jon; Keays, Maria; Kurbatova, Natalja; Malone, James; Mani, Roby; Mupo, Annalisa; Pedro Pereira, Rui; Pilicheva, Ekaterina; Rung, Johan; Sharma, Anjan; Tang, Y Amy; Ternent, Tobias; Tikhonov, Andrew; Welter, Danielle; Williams, Eleanor; Brazma, Alvis; Parkinson, Helen; Sarkans, Ugis.

Nucleic Acids Res ; 41(Database issue): D987-90, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23193272

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.

Subject(s)

Databases, Genetic; Genomics; Microarray Analysis; Databases, Genetic/statistics & numerical data; Databases, Genetic/trends; High-Throughput Nucleotide Sequencing; Internet; Software; User-Computer Interface

7.

Gene Expression Atlas update--a value-added database of microarray and sequencing-based functional genomics experiments.

Kapushesky, Misha; Adamusiak, Tomasz; Burdett, Tony; Culhane, Aedin; Farne, Anna; Filippov, Alexey; Holloway, Ele; Klebanov, Andrey; Kryvych, Nataliya; Kurbatova, Natalja; Kurnosov, Pavel; Malone, James; Melnichuk, Olga; Petryszak, Robert; Pultsin, Nikolay; Rustici, Gabriella; Tikhonov, Andrew; Travillian, Ravensara S; Williams, Eleanor; Zorin, Andrey; Parkinson, Helen; Brazma, Alvis.

Nucleic Acids Res ; 40(Database issue): D1077-81, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22064864

ABSTRACT

Gene Expression Atlas (http://www.ebi.ac.uk/gxa) is an added-value database providing information about gene expression in different cell types, organism parts, developmental stages, disease states, sample treatments and other biological/experimental conditions. The content of this database derives from curation, re-annotation and statistical analysis of selected data from the ArrayExpress Archive and the European Nucleotide Archive. A simple interface allows the user to query for differential gene expression either by gene names or attributes or by biological conditions, e.g. diseases, organism parts or cell types. Since our previous report we have made 20 monthly releases and, as of Release 11.08 (August 2011), the database supports 19 species and contains expression data measured for 19,014 biological conditions in 136,551 assays from 5598 independent studies.

Subject(s)

Databases, Genetic; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Atlases as Topic; Genomics; Humans; MicroRNAs/metabolism; Molecular Sequence Annotation; Sequence Analysis, RNA; User-Computer Interface

8.

Good machine learning practices: Learnings from the modern pharmaceutical discovery enterprise.

Makarov, Vladimir; Chabbert, Christophe; Koletou, Elina; Psomopoulos, Fotis; Kurbatova, Natalja; Ramirez, Samuel; Nelson, Chas; Natarajan, Prashant; Neupane, Bikalpa.

Comput Biol Med ; 177: 108632, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38788373

ABSTRACT

Machine Learning (ML) and Artificial Intelligence (AI) have become an integral part of the drug discovery and development value chain. Many teams in the pharmaceutical industry nevertheless report the challenges associated with the timely, cost effective and meaningful delivery of ML and AI powered solutions for their scientists. We sought to better understand what these challenges were and how to overcome them by performing an industry wide assessment of the practices in AI and Machine Learning. Here we report results of the systematic business analysis of the personas in the modern pharmaceutical discovery enterprise in relation to their work with the AI and ML technologies. We identify 23 common business problems that individuals in these roles face when they encounter AI and ML technologies at work, and describe best practices (Good Machine Learning Practices) that address these issues.

Subject(s)

Drug Discovery; Drug Industry; Machine Learning; Humans; Artificial Intelligence

9.

graph2tab, a library to convert experimental workflow graphs into tabular formats.

Brandizi, Marco; Kurbatova, Natalja; Sarkans, Ugis; Rocca-Serra, Philippe.

Bioinformatics ; 28(12): 1665-7, 2012 Jun 15.

Article in English | MEDLINE | ID: mdl-22556367

ABSTRACT

MOTIVATIONS: Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means of experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve. RESULTS: We describe graph2tab, a library that implements methods to realise such a conversion in a size-optimised way. Our solution is generic and can be adapted to specific cases of data exporters or data converters that need to be implemented. AVAILABILITY AND IMPLEMENTATION: The library source code and documentation are available at http://github.com/ISA-tools/graph2tab.

Subject(s)

Computer Graphics; Programming Languages; Workflow; Computational Biology/methods; Databases, Factual; Oligonucleotide Array Sequence Analysis
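
Illustrative note on entry 9 (not taken from the paper): the graph-to-table problem named in the abstract can be shown with a naive baseline in which every source-to-sink path of the workflow graph becomes one table row. The Python sketch below assumes a small acyclic workflow; the `WORKFLOW` data and the `paths_to_rows` function are hypothetical, and this baseline is deliberately not the size-optimised method that graph2tab implements, since it repeats shared nodes across rows.

```python
from typing import Dict, List


def paths_to_rows(edges: Dict[str, List[str]]) -> List[List[str]]:
    """Return one row per source-to-sink path of a directed acyclic graph."""
    targets = {t for outs in edges.values() for t in outs}
    nodes = set(edges) | targets
    # Sources are nodes that never appear as the target of an edge.
    sources = sorted(nodes - targets)
    rows: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        successors = edges.get(node, [])
        if not successors:
            # A sink closes the path, so the path becomes a table row.
            rows.append(path)
            return
        for nxt in successors:
            walk(nxt, path)

    for source in sources:
        walk(source, [])
    return rows


# Hypothetical workflow: two samples, two extracts, one shared assay.
WORKFLOW = {
    "sample1": ["extract1"],
    "sample2": ["extract2"],
    "extract1": ["assay1", "assay2"],
    "extract2": ["assay2"],
}

rows = paths_to_rows(WORKFLOW)
```

In this toy case `assay2` appears in two rows and `extract1` in two rows, which hints at why a size-optimised conversion is worth a dedicated library.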

10.

ArrayExpress update--an archive of microarray and high-throughput sequencing-based functional genomics experiments.

Parkinson, Helen; Sarkans, Ugis; Kolesnikov, Nikolay; Abeygunawardena, Niran; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Holloway, Ele; Kurbatova, Natalja; Lukk, Margus; Malone, James; Mani, Roby; Pilicheva, Ekaterina; Rustici, Gabriella; Sharma, Anjan; Williams, Eleanor; Adamusiak, Tomasz; Brandizi, Marco; Sklyar, Nataliya; Brazma, Alvis.

Nucleic Acids Res ; 39(Database issue): D1002-4, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21071405

ABSTRACT

The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archive is closely integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Advanced queries provided via ontology enabled interfaces include queries based on technology and sample attributes such as disease, cell types and anatomy.

Subject(s)

Databases, Genetic; Gene Expression Profiling; Genomics; High-Throughput Nucleotide Sequencing; Oligonucleotide Array Sequence Analysis; Gene Expression

11.

ontoCAT: an R package for ontology traversal and search.

Kurbatova, Natalja; Adamusiak, Tomasz; Kurnosov, Pavel; Swertz, Morris A; Kapushesky, Misha.

Bioinformatics ; 27(17): 2468-70, 2011 Sep 01.

Article in English | MEDLINE | ID: mdl-21697126

ABSTRACT

MOTIVATION: There exist few simple and easily accessible methods to integrate ontologies programmatically in the R environment. We present ontoCAT, an R package to access ontologies in widely used standard formats, stored locally in the filesystem or available online. The ontoCAT package supports a number of traversal and search functions on a single ontology, as well as searching for ontology terms across multiple ontologies and in major ontology repositories. AVAILABILITY: The package and sources are freely available in Bioconductor starting from version 2.8: http://bioconductor.org/help/bioc-views/release/bioc/html/ontoCAT.html or via the OntoCAT website http://www.ontocat.org/wiki/r. CONTACT: natalja@ebi.ac.uk.

Subject(s)

Software; Vocabulary, Controlled; Terminology as Topic

12.

OntoCAT--simple ontology search and integration in Java, R and REST/JavaScript.

Adamusiak, Tomasz; Burdett, Tony; Kurbatova, Natalja; Joeri van der Velde, K; Abeygunawardena, Niran; Antonakaki, Despoina; Kapushesky, Misha; Parkinson, Helen; Swertz, Morris A.

BMC Bioinformatics ; 12: 218, 2011 May 29.

Article in English | MEDLINE | ID: mdl-21619703

ABSTRACT

BACKGROUND: Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately and map their contents to research data, and much of this effort is being replicated across different research groups. RESULTS: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy-to-learn Java, Bioconductor/R and REST web service commands, enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. CONCLUSIONS: OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases. AVAILABILITY: http://www.ontocat.org.

Subject(s)

Computational Biology/methods; Software; Vocabulary; Databases, Factual; Humans; Programming Languages; User-Computer Interface; Vocabulary, Controlled

13.

A System for Information Management in BioMedical Studies--SIMBioMS.

Krestyaninova, Maria; Zarins, Andris; Viksna, Juris; Kurbatova, Natalja; Rucevskis, Peteris; Neogi, Sudeshna Guha; Gostev, Mike; Perheentupa, Teemu; Knuuttila, Juha; Barrett, Amy; Lappalainen, Ilkka; Rung, Johan; Podnieks, Karlis; Sarkans, Ugis; McCarthy, Mark I; Brazma, Alvis.

Bioinformatics ; 25(20): 2768-9, 2009 Oct 15.

Article in English | MEDLINE | ID: mdl-19633095

ABSTRACT

UNLABELLED: SIMBioMS is a web-based open source software system for managing data and information in biomedical studies. It provides a solution for the collection, storage, management and retrieval of information about research subjects and biomedical samples, as well as experimental data obtained using a range of high-throughput technologies, including gene expression, genotyping, proteomics and metabonomics. The system can easily be customized and has proven to be successful in several large-scale multi-site collaborative projects. It is compatible with emerging functional genomics data standards and provides data import and export in accepted standard formats. Protocols for transferring data to durable archives at the European Bioinformatics Institute have been implemented. AVAILABILITY: The source code, documentation and initialization scripts are available at http://simbioms.org.

Subject(s)

Computational Biology/methods; Database Management Systems; Information Management/methods; Information Storage and Retrieval/methods; Software; Databases, Factual

14.

Urinary metabolic phenotyping for Alzheimer's disease.

Kurbatova, Natalja; Garg, Manik; Whiley, Luke; Chekmeneva, Elena; Jiménez, Beatriz; Gómez-Romero, María; Pearce, Jake; Kimhofer, Torben; D'Hondt, Ellie; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Aarsland, Dag; Nevado-Holgado, Alejo; Liu, Benjamine; Snowden, Stuart; Proitsi, Petroula; Ashton, Nicholas J; Hye, Abdul; Legido-Quigley, Cristina; Lewis, Matthew R; Nicholson, Jeremy K; Holmes, Elaine; Brazma, Alvis; Lovestone, Simon.

Sci Rep ; 10(1): 21745, 2020 Dec 10.

Article in English | MEDLINE | ID: mdl-33303834

ABSTRACT

Finding early disease markers using non-invasive and widely available methods is essential to develop a successful therapy for Alzheimer's Disease. Few studies to date have examined urine, the most readily available biofluid. Here we report the largest study to date using comprehensive metabolic phenotyping platforms (NMR spectroscopy and UHPLC-MS) to probe the urinary metabolome in depth in people with Alzheimer's Disease and Mild Cognitive Impairment. Feature reduction was performed using metabolomic Quantitative Trait Loci (mQTLs), resulting in a list of metabolites associated with the genetic variants. This approach improves accuracy in the identification of disease states and provides a route to a plausible mechanistic link to pathological processes. Using these mQTLs we built a Random Forests model, which not only correctly discriminates between people with Alzheimer's Disease and age-matched controls, but also between individuals with Mild Cognitive Impairment who were later diagnosed with Alzheimer's Disease and those who were not. Further annotation of top-ranking metabolic features nominated by the trained model revealed the involvement of cholesterol-derived metabolites and small molecules that were linked to Alzheimer's pathology in previous studies.

Subject(s)

Alzheimer Disease/genetics; Alzheimer Disease/metabolism; Phenotype; Aged; Aged, 80 and over; Alzheimer Disease/urine; Biomarkers/urine; Cognitive Dysfunction/genetics; Cognitive Dysfunction/metabolism; Cognitive Dysfunction/urine; Female; Humans; Male; Metabolomics/methods; Quantitative Trait Loci

15.

Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites.

Najmanovich, Rafael; Kurbatova, Natalja; Thornton, Janet.

Bioinformatics ; 24(16): i105-11, 2008 Aug 15.

Article in English | MEDLINE | ID: mdl-18689810

ABSTRACT

MOTIVATION: Current computational methods for the prediction of function from structure are restricted to the detection of similarities and subsequent transfer of functional annotation. In a significant minority of cases, global sequence or structural (fold) similarities do not provide clues about protein function. In these cases, one alternative is to detect local binding site similarities. These may still reflect more distant evolutionary relationships as well as unique physico-chemical constraints necessary for binding similar ligands, thus helping pinpoint the function. In the present work, we ask the following question: is it possible to discriminate within a dataset of non-homologous proteins those that bind similar ligands based on their binding site similarities? METHODS: We implement a graph-matching-based method for the detection of 3D atomic similarities introducing some simplifications that allow us to extend its applicability to the analysis of large all-atom binding site models. This method, called IsoCleft, does not require atoms to be connected either in sequence or space. We apply the method to a cognate-ligand bound dataset of non-homologous proteins. We define a family of binding site models with decreasing knowledge about the identity of the ligand-interacting atoms to uncouple the questions of predicting the location of the binding site and detecting binding site similarities. Furthermore, we calculate the individual contributions of binding site size, chemical composition and geometry to prediction performance. RESULTS: We find that it is possible to discriminate between different ligand-binding sites. In other words, there is a certain uniqueness in the set of atoms that are in contact with specific ligand scaffolds. This uniqueness is restricted to the atoms in close proximity of the ligand, in which case size and chemical composition alone are sufficient to discriminate binding sites. Discrimination ability decreases with decreasing knowledge about the identity of the ligand-interacting binding site atoms. The decrease is quite abrupt when considering size and chemical composition alone, but much slower when including geometry. We also observe that certain ligands are easier to discriminate. Interestingly, the subset of binding site atoms belonging to highly conserved residues is not sufficient to discriminate binding sites, implying that convergently evolved binding sites arrived at dissimilar solutions. AVAILABILITY: IsoCleft can be obtained from the authors.

Subject(s)

Algorithms; Models, Chemical; Models, Molecular; Proteins/chemistry; Proteins/ultrastructure; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Binding Sites; Computer Simulation; Discriminant Analysis; Molecular Sequence Data; Protein Binding; Protein Conformation; Sequence Homology

16.

An integrated genomic analysis of anaplastic meningioma identifies prognostic molecular signatures.

Collord, Grace; Tarpey, Patrick; Kurbatova, Natalja; Martincorena, Inigo; Moran, Sebastian; Castro, Manuel; Nagy, Tibor; Bignell, Graham; Maura, Francesco; Young, Matthew D; Berna, Jorge; Tubio, Jose M C; McMurran, Chris E; Young, Adam M H; Sanders, Mathijs; Noorani, Imran; Price, Stephen J; Watts, Colin; Leipnitz, Elke; Kirsch, Matthias; Schackert, Gabriele; Pearson, Danita; Devadass, Abel; Ram, Zvi; Collins, V Peter; Allinson, Kieren; Jenkinson, Michael D; Zakaria, Rasheed; Syed, Khaja; Hanemann, C Oliver; Dunn, Jemma; McDermott, Michael W; Kirollos, Ramez W; Vassiliou, George S; Esteller, Manel; Behjati, Sam; Brazma, Alvis; Santarius, Thomas; McDermott, Ultan.

Sci Rep ; 8(1): 13537, 2018 Sep 10.

Article in English | MEDLINE | ID: mdl-30202034

ABSTRACT

Anaplastic meningioma is a rare and aggressive brain tumor characterised by intractable recurrences and dismal outcomes. Here, we present an integrated analysis of the whole genome, transcriptome and methylation profiles of primary and recurrent anaplastic meningioma. A key finding was the delineation of distinct molecular subgroups that were associated with diametrically opposed survival outcomes. Relative to lower grade meningiomas, anaplastic tumors harbored frequent driver mutations in SWI/SNF complex genes, which were confined to the poor prognosis subgroup. Aggressive disease was further characterised by transcriptional evidence of increased PRC2 activity, stemness and epithelial-to-mesenchymal transition. Our analyses discern biologically distinct variants of anaplastic meningioma with prognostic and therapeutic significance.

Subject(s)

Gene Expression Regulation, Neoplastic; Meningeal Neoplasms/genetics; Meningioma/genetics; Neoplasm Recurrence, Local/genetics; Transcriptome/genetics; Aged; DNA Methylation/genetics; Disease Progression; Female; Gene Expression Profiling; Genomics/methods; Humans; Male; Meningeal Neoplasms/mortality; Meningeal Neoplasms/pathology; Meningeal Neoplasms/surgery; Meningioma/mortality; Meningioma/pathology; Meningioma/surgery; Middle Aged; Neoplasm Grading; Neoplasm Recurrence, Local/mortality; Neoplasm Recurrence, Local/pathology; Prognosis; Survival Analysis; Whole Genome Sequencing

17.

Prevalence of sexual dimorphism in mammalian phenotypic traits.

Karp, Natasha A; Mason, Jeremy; Beaudet, Arthur L; Benjamini, Yoav; Bower, Lynette; Braun, Robert E; Brown, Steve D M; Chesler, Elissa J; Dickinson, Mary E; Flenniken, Ann M; Fuchs, Helmut; Angelis, Martin Hrabe de; Gao, Xiang; Guo, Shiying; Greenaway, Simon; Heller, Ruth; Herault, Yann; Justice, Monica J; Kurbatova, Natalja; Lelliott, Christopher J; Lloyd, K C Kent; Mallon, Ann-Marie; Mank, Judith E; Masuya, Hiroshi; McKerlie, Colin; Meehan, Terrence F; Mott, Richard F; Murray, Stephen A; Parkinson, Helen; Ramirez-Solis, Ramiro; Santos, Luis; Seavitt, John R; Smedley, Damian; Sorg, Tania; Speak, Anneliese O; Steel, Karen P; Svenson, Karen L; Wakana, Shigeharu; West, David; Wells, Sara; Westerberg, Henrik; Yaacoby, Shay; White, Jacqueline K.

Nat Commun ; 8: 15475, 2017 Jun 26.

Article in English | MEDLINE | ID: mdl-28650954

ABSTRACT

The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in some biological studies. Here, we used high-throughput phenotype data from 14,250 wildtype and 40,192 mutant mice (representing 2,186 knockout lines), analysed for up to 234 traits, and found a large proportion of mammalian traits both in wildtype and mutants are influenced by sex. This result has implications for interpreting disease phenotypes in animal models and humans.

Subject(s)

Mammals/physiology; Quantitative Trait, Heritable; Sex Characteristics; Animals; Body Weight; Female; Genes, Modifier; Genotype; Mice; Phenotype

18.

A large scale hearing loss screen reveals an extensive unexplored genetic landscape for auditory dysfunction.

Bowl, Michael R; Simon, Michelle M; Ingham, Neil J; Greenaway, Simon; Santos, Luis; Cater, Heather; Taylor, Sarah; Mason, Jeremy; Kurbatova, Natalja; Pearson, Selina; Bower, Lynette R; Clary, Dave A; Meziane, Hamid; Reilly, Patrick; Minowa, Osamu; Kelsey, Lois; Tocchini-Valentini, Glauco P; Gao, Xiang; Bradley, Allan; Skarnes, William C; Moore, Mark; Beaudet, Arthur L; Justice, Monica J; Seavitt, John; Dickinson, Mary E; Wurst, Wolfgang; de Angelis, Martin Hrabe; Herault, Yann; Wakana, Shigeharu; Nutter, Lauryl M J; Flenniken, Ann M; McKerlie, Colin; Murray, Stephen A; Svenson, Karen L; Braun, Robert E; West, David B; Lloyd, K C Kent; Adams, David J; White, Jacqui; Karp, Natasha; Flicek, Paul; Smedley, Damian; Meehan, Terrence F; Parkinson, Helen E; Teboul, Lydia M; Wells, Sara; Steel, Karen P; Mallon, Ann-Marie; Brown, Steve D M.

Nat Commun ; 8(1): 886, 2017 Oct 12.

Article in English | MEDLINE | ID: mdl-29026089

ABSTRACT

The developmental and physiological complexity of the auditory system is likely reflected in the underlying set of genes involved in auditory function. In humans, over 150 non-syndromic loci have been identified, and there are more than 400 human genetic syndromes with a hearing loss component. Over 100 non-syndromic hearing loss genes have been identified in mouse and human, but we remain ignorant of the full extent of the genetic landscape involved in auditory dysfunction. As part of the International Mouse Phenotyping Consortium, we undertook a hearing loss screen in a cohort of 3006 mouse knockout strains. In total, we identify 67 candidate hearing loss genes. We detect known hearing loss genes, but the vast majority, 52, of the candidate genes were novel. Our analysis reveals a large and unexplored genetic landscape involved with auditory function. The full extent of the genetic basis for hearing impairment is unknown. Here, as part of the International Mouse Phenotyping Consortium, the authors perform a hearing loss screen in 3006 mouse knockout strains and identify 52 new candidate genes for genetic hearing loss.

Subject(s)

Hearing Loss/genetics , Protein Interaction Maps/genetics , Animals , Datasets as Topic , Genetic Testing , Hearing Loss/epidemiology , Hearing Tests , Mice , Knockout Mice , Phenotype

19.

PhenStat: A Tool Kit for Standardized Analysis of High Throughput Phenotypic Data.

Kurbatova, Natalja; Mason, Jeremy C; Morgan, Hugh; Meehan, Terrence F; Karp, Natasha A.

PLoS One ; 10(7): e0131274, 2015.

Article in English | MEDLINE | ID: mdl-26147094

ABSTRACT

The lack of reproducibility in animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods, which prevents researchers from replicating results even when the original data are provided. Here we present PhenStat, a freely available R package that provides a variety of statistical methods for the identification of phenotypic associations. The methods have been developed for high throughput phenotyping pipelines implemented across various experimental designs with an emphasis on managing temporal variation. PhenStat is targeted to two user groups: small-scale users who wish to interact and test data from large resources and large-scale users who require an automated statistical analysis pipeline. The software provides guidance to the user for selecting appropriate analysis methods based on the dataset and is designed to allow for additions and modifications as needed. The package was tested on mouse and rat data and is used by the International Mouse Phenotyping Consortium (IMPC). By providing raw data and the version of PhenStat used, resources like the IMPC give users the ability to replicate and explore results within their own computing environment.

Subject(s)

High-Throughput Screening Assays/standards , Phenotype , Reproducibility of Results , Software , Animals , Datasets as Topic/standards , Datasets as Topic/statistics & numerical data , Female , High-Throughput Screening Assays/methods , High-Throughput Screening Assays/statistics & numerical data , Linear Models , Male , Mice , Rats , Reference Standards

20.

IsoCleft Finder - a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities.

Kurbatova, Natalja; Chartier, Matthieu; Zylber, María Inés; Najmanovich, Rafael.

F1000Res ; 2: 117, 2013.

Article in English | MEDLINE | ID: mdl-24555058

ABSTRACT

IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder.
