ABSTRACT
Progress in mass spectrometry lipidomics has led to a rapid proliferation of studies across biology and biomedicine. These generate extremely large raw datasets requiring sophisticated solutions to support automated data processing. To address this, numerous software tools have been developed and tailored for specific tasks. However, for researchers, deciding which approach best suits their application relies on ad hoc testing, which is inefficient and time consuming. Here we first review the data processing pipeline, summarizing the scope of available tools. Next, to support researchers, LIPID MAPS provides an interactive online portal listing open-access tools with a graphical user interface. This guides users towards appropriate solutions within major areas in data processing, including (1) lipid-oriented databases, (2) mass spectrometry data repositories, (3) analysis of targeted lipidomics datasets, (4) lipid identification and (5) quantification from untargeted lipidomics datasets, (6) statistical analysis and visualization, and (7) data integration solutions. Detailed descriptions of functions and requirements are provided to guide customized data analysis workflows.
Subject(s)
Computational Biology , Lipidomics , Computational Biology/methods , Software , Informatics , Lipids/chemistry
ABSTRACT
The Eukaryotic Pathogen, Vector and Host Informatics Resource (VEuPathDB, https://veupathdb.org) is a Bioinformatics Resource Center funded by the National Institutes of Health with additional funding from the Wellcome Trust. VEuPathDB supports >600 organisms that comprise invertebrate vectors, eukaryotic pathogens (protists and fungi) and relevant free-living or non-pathogenic species or hosts. Since 2004, VEuPathDB has analyzed omics data from the public domain using contemporary bioinformatic workflows, including orthology predictions via OrthoMCL, and integrated the analysis results with analysis tools, visualizations, and advanced search capabilities. The unique data mining platform coupled with >3000 pre-analyzed data sets facilitates the exploration of pertinent omics data in support of hypothesis driven research. Comparisons are easily made across data sets, data types and organisms. A Galaxy workspace offers the opportunity for the analysis of private large-scale datasets and for porting to VEuPathDB for comparisons with integrated data. The MapVEu tool provides a platform for exploration of spatially resolved data such as vector surveillance and insecticide resistance monitoring. To address the growing body of omics data and advances in laboratory techniques, VEuPathDB has added several new data types, searches and features, improved the Galaxy workspace environment, redesigned the MapVEu interface and updated the infrastructure to accommodate these changes.
Subject(s)
Computational Biology , Eukaryota , Animals , Computational Biology/methods , Invertebrates , Databases, Factual
ABSTRACT
Ensembl (https://www.ensembl.org) is unique in its flexible infrastructure for access to genomic data and annotation. It has been designed to efficiently deliver annotation at scale for all eukaryotic life, and it also provides deep comprehensive annotation for key species. Genomes representing a greater diversity of species are increasingly being sequenced. In response, we have focussed our recent efforts on expediting the annotation of new assemblies. Here, we report the release of the greatest annual number of newly annotated genomes in the history of Ensembl via our dedicated Ensembl Rapid Release platform (http://rapid.ensembl.org). We have also developed a new method to generate comparative analyses at scale for these assemblies and, for the first time, we have annotated non-vertebrate eukaryotes. Meanwhile, we continually improve, extend and update the annotation for our high-value reference vertebrate genomes and report the details here. We have a range of specific software tools for specific tasks, such as the Ensembl Variant Effect Predictor (VEP) and the newly developed interface for the Variant Recoder. All Ensembl data, software and tools are freely available for download and are accessible programmatically.
Subject(s)
Databases, Genetic , Genome/genetics , Molecular Sequence Annotation , Software , Animals , Computational Biology/classification , Humans
ABSTRACT
The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.
Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Genomics/methods , SARS-CoV-2/genetics , Vertebrates/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Humans , Internet , Molecular Sequence Annotation/methods , Pandemics , Vertebrates/classification
ABSTRACT
The lipid envelope of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an essential component of the virus; however, its molecular composition is undetermined. Addressing this knowledge gap could support the design of antiviral agents as well as further our understanding of viral-host protein interactions, infectivity, pathogenicity, and innate immune system clearance. Lipidomics revealed that the virus envelope comprised mainly phospholipids (PLs), with some cholesterol and sphingolipids, and with a cholesterol/phospholipid ratio similar to that of lysosomes. Unlike cellular membranes, procoagulant amino-PLs were present on the external side of the viral envelope at levels exceeding those on activated platelets. Accordingly, virions directly promoted blood coagulation. To investigate whether these differences could enable selective targeting of the viral envelope in vivo, we tested whether oral rinses containing lipid-disrupting chemicals could reduce infectivity. Products containing PL-disrupting surfactants (such as cetylpyridinium chloride) met European virucidal standards in vitro; however, components that altered the critical micelle concentration reduced efficacy, and products containing essential oils, povidone-iodine, or chlorhexidine were ineffective. This result was recapitulated in vivo, where a 30-s oral rinse with cetylpyridinium chloride mouthwash eliminated live virus in the oral cavity of patients with coronavirus disease 2019 (COVID-19) for at least 1 h, whereas povidone-iodine and saline mouthwashes were ineffective. We conclude that the SARS-CoV-2 lipid envelope i) is distinct from the host plasma membrane, which may enable design of selective antiviral approaches; ii) contains exposed phosphatidylethanolamine and phosphatidylserine, which may influence thrombosis, pathogenicity, and inflammation; and iii) can be selectively targeted in vivo by specific oral rinses.
Subject(s)
COVID-19 , Mouthwashes , Antiviral Agents , Cetylpyridinium , Humans , Lipids , Mouthwashes/pharmacology , Povidone-Iodine , RNA, Viral , SARS-CoV-2
ABSTRACT
SUMMARY: We present LipidFinder 2.0, which incorporates four new modules that apply artefact filters and remove lipid and contaminant stacks, in-source fragments and salt clusters, together with a new isotope deletion method that is significantly more sensitive than available open-access alternatives. We also incorporate a novel false discovery rate method, utilizing a target-decoy strategy, which allows users to assess data quality. A renewed lipid profiling method is introduced that searches three different LIPID MAPS databases and returns bulk lipid structures only, along with a lipid category scatter plot that uses a color-blind-friendly palette. An API link to XCMS Online is available in LipidFinder's online version. We show using real data that LipidFinder 2.0 significantly improves non-lipid metabolite filtering and lipid profiling compared with available tools. AVAILABILITY AND IMPLEMENTATION: LipidFinder 2.0 is freely available at https://github.com/ODonnell-Lipidomics/LipidFinder and http://lipidmaps.org/resources/tools/lipidfinder. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
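As a rough illustration of the target-decoy idea mentioned in this abstract, the sketch below estimates a false discovery rate at a score threshold as the ratio of decoy matches to target matches. It is a generic textbook formulation, not LipidFinder's implementation; the function name `estimate_fdr`, the score lists and the threshold are all hypothetical.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR at a score threshold as (#decoy hits) / (#target hits).

    Generic target-decoy estimate; NOT taken from LipidFinder's source code.
    """
    target_hits = sum(1 for s in target_scores if s >= threshold)
    decoy_hits = sum(1 for s in decoy_scores if s >= threshold)
    if target_hits == 0:
        # No accepted target matches, so there is nothing to be falsely discovered.
        return 0.0
    # Cap at 1.0 because a rate above 100% is not meaningful.
    return min(1.0, decoy_hits / target_hits)


# Hypothetical match scores against a real (target) and a shuffled (decoy) database.
target_scores = [0.95, 0.90, 0.80, 0.60, 0.40]
decoy_scores = [0.70, 0.50, 0.30, 0.20, 0.10]

fdr_loose = estimate_fdr(target_scores, decoy_scores, threshold=0.5)
fdr_strict = estimate_fdr(target_scores, decoy_scores, threshold=0.75)
print(fdr_loose, fdr_strict)
```

Raising the threshold trades accepted matches for a lower estimated error rate, which is how such an estimate can help a user judge data quality.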
Subject(s)
Lipidomics , Software , Databases, Factual , Lipids
ABSTRACT
Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotation pipeline is capable of integrating experimental and reference data from multiple providers into a single integrated resource. Here, we present 94 newly annotated and re-annotated genomes, bringing the total number of genomes offered by Ensembl to 227. This represents the single largest expansion of the resource since its inception. We also detail our continued efforts to improve human annotation, developments in our epigenome analysis and display, a new tool for imputing causal genes from genome-wide association studies and visualisation of variation within a 3D protein model. Finally, we present information on our new website. Both software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license), with data updates released four times a year.
Subject(s)
Computational Biology/methods , Databases, Genetic , Epigenome , Molecular Sequence Annotation , Algorithms , Animals , Computer Graphics , Databases, Protein , Genetic Variation , Genome-Wide Association Study , Genomics , Histones/metabolism , Humans , Imaging, Three-Dimensional , Internet , Ligands , Search Engine , Software , Species Specificity , Transcriptome , User-Computer Interface , Web Browser
ABSTRACT
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of interfaces to genomic data across the tree of life, including reference genome sequence, gene models, transcriptional data, genetic variation and comparative analysis. Data may be accessed via our website, online tools platform and programmatic interfaces, with updates made four times per year (in synchrony with Ensembl). Here, we provide an overview of Ensembl Genomes, with a focus on recent developments. These include the continued growth, more robust and reproducible sets of orthologues and paralogues, and enriched views of gene expression and gene function in plants. Finally, we report on our continued deeper integration with the Ensembl project, which forms a key part of our future strategy for dealing with the increasing quantity of available genome-scale data across the tree of life.
Subject(s)
Computational Biology/methods , Databases, Genetic , Genetic Variation , Genome, Bacterial , Genome, Fungal , Genome, Plant , Algorithms , Animals , Caenorhabditis elegans/genetics , Genomics , Internet , Molecular Sequence Annotation , Phenotype , Plants/genetics , Reference Values , Software , User-Computer Interface
ABSTRACT
Abdominal aortic aneurysm (AAA) is an inflammatory vascular disease with high mortality and limited treatment options. How blood lipids regulate AAA development is unknown. Here lipidomics and genetic models demonstrate a central role for procoagulant enzymatically oxidized phospholipids (eoxPL) in regulating AAA. Specifically, through activating coagulation, eoxPL either promoted or inhibited AAA depending on tissue localization. Ang II administration to ApoE-/- mice increased intravascular coagulation during AAA development. Lipidomics revealed large numbers of eoxPL formed within mouse and human AAA lesions. Deletion of eoxPL-generating enzymes (Alox12 or Alox15) or administration of the factor Xa inhibitor rivaroxaban significantly reduced AAA. Alox-deficient mice displayed constitutively dysregulated hemostasis, including a consumptive coagulopathy, characterized by a compensatory increase in prothrombotic aminophospholipids (aPL) in circulating cell membranes. Intravenously administered procoagulant PL caused clotting factor activation and depletion, induced a bleeding defect, and significantly reduced AAA development. These data suggest that Alox deletion reduces AAA through diverting coagulation away from the vessel wall due to eoxPL deficiency, instead activating clotting factor consumption and depletion in the circulation. In mouse whole blood, ~44 eoxPL molecular species formed within minutes of clot initiation. These were significantly elevated with ApoE-/- deletion, and many were absent in Alox-/- mice, identifying specific eoxPL that modulate AAA. Correlation networks demonstrated that eoxPL belonged to subfamilies defined by oxylipin composition. Thus, procoagulant PL regulate AAA development through complex interactions with clotting factors. Modulating the delicate balance between bleeding and thrombosis, within either the vessel wall or the circulation, can either drive or prevent disease development.
Subject(s)
Aorta, Abdominal/physiopathology , Aortic Aneurysm, Abdominal , Phospholipids , Angiotensins/metabolism , Animals , Aortic Aneurysm, Abdominal/genetics , Aortic Aneurysm, Abdominal/metabolism , Aortic Aneurysm, Abdominal/physiopathology , Blood Coagulation Factors/genetics , Blood Coagulation Factors/metabolism , Disease Models, Animal , Female , Lipoxygenase/genetics , Lipoxygenase/metabolism , Male , Mice , Mice, Knockout, ApoE , Phospholipids/genetics , Phospholipids/metabolism
ABSTRACT
SUMMARY: We present LipidFinder online, hosted on the LIPID MAPS website, as a liquid chromatography/mass spectrometry (LC/MS) workflow comprising peak filtering, MS searching and statistical analysis components, highly customized for interrogating lipidomic data. The online interface of LipidFinder includes several innovations such as comprehensive parameter tuning, an MS search engine employing in-house customized, curated and computationally generated databases, and multiple reporting/display options. It also includes a set of integrated statistical analysis tools that enable users to identify features that are significantly altered under the selected experimental conditions, thereby greatly reducing the complexity of the peaklist prior to MS searching. LipidFinder is presented as a highly flexible, extensible, user-friendly online workflow which leverages the lipidomics knowledge base and resources of the LIPID MAPS website, long recognized as a leading global lipidomics portal. AVAILABILITY AND IMPLEMENTATION: LipidFinder on LIPID MAPS is available at: http://www.lipidmaps.org/data/LF.
Subject(s)
Databases, Factual , Lipids/analysis , Software , Chromatography, Liquid , Computational Biology , Knowledge Bases , Mass Spectrometry , Workflow
ABSTRACT
BACKGROUND: Several methods have been developed to predict the pathogenicity of missense mutations, but none has been specifically designed to classify variants in mtDNA-encoded polypeptides. Moreover, no curated dataset of neutral and damaging mtDNA missense variants is available to test the accuracy of predictors. Because mtDNA sequencing of patients with mitochondrial diseases is revealing many missense mutations, candidate substitutions need to be prioritized for further confirmation. Predictors can be useful as screening tools, but their performance must be improved. RESULTS: We have developed an SVM classifier (Mitoclass.1) specific for mtDNA missense variants. The model was trained and validated with 2,835 damaging and neutral mtDNA amino acid substitutions, previously curated using a set of rigorous pathogenicity criteria with high specificity. Each instance is described by a set of three attributes based on evolutionary conservation in Eukaryota of wildtype and mutant amino acids, as well as coevolution and a novel evolutionary analysis of specific substitutions belonging to the same domain of mitochondrial polypeptides. Our classifier performed better than the other web-available predictors tested. We checked the performance of three widely used predictors on all the mutations in our curated dataset. PolyPhen-2 showed the best results for screening purposes, with good sensitivity. Nevertheless, its number of false positive predictions was too high. Our method has improved sensitivity and better specificity relative to PolyPhen-2. We also publish predictions for the complete set of 24,201 possible missense variants in the 13 human mtDNA-encoded polypeptides. CONCLUSIONS: Mitoclass.1 allows a better selection of candidate damaging missense variants from mtDNA. A careful search for discriminatory attributes and a training step based on a curated dataset of amino acid substitutions belonging exclusively to human mtDNA genes allow improved performance. Mitoclass.1 accuracy could be improved in the future, when more mtDNA missense substitutions become available for updating the attributes and retraining the model.
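The comparison above rests on sensitivity and specificity. As a minimal sketch of how these two figures are derived from a predictor's calls, the example below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). The labels are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = damaging, 0 = neutral)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # damaging, called damaging
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # damaging, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # neutral, called neutral
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # neutral, false alarm
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy example: 4 truly damaging and 4 truly neutral variants (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A predictor with high sensitivity but many false positives, as described for PolyPhen-2, would show a high first value and a low second value in such a calculation.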
Subject(s)
DNA Mutational Analysis/methods , DNA, Mitochondrial , Machine Learning , Mitochondria/metabolism , Mutation, Missense , Peptides/genetics , Computational Biology/methods , Humans , Mitochondria/genetics , Sensitivity and Specificity
ABSTRACT
BACKGROUND: Molecular evolution studies involve many different hard computational problems that are solved, in most cases, with heuristic algorithms that provide a nearly optimal solution. Hence, diverse software tools exist for the different stages involved in a molecular evolution workflow. RESULTS: We present MEvoLib, the first molecular evolution library for Python, providing a framework to work with the different tools and methods involved in the common tasks of molecular evolution workflows. In contrast with existing bioinformatics libraries, MEvoLib is focused on the stages involved in molecular evolution studies, enclosing the set of tools with a common purpose in a single high-level interface with fast access to their frequent parameterizations. Gene clustering from partial or complete sequences has been improved with a new method that integrates accessible external information (e.g. GenBank's features data). Moreover, MEvoLib adjusts the fetching process from NCBI databases to optimize download bandwidth usage. In addition, it has been implemented using parallelization techniques to cope with even large-scale scenarios. CONCLUSIONS: MEvoLib is the first library for Python designed to facilitate molecular evolution research for both expert and novice users. It offers a single interface for each common task, comprising several tools with their most used parameterizations. It also includes a method that takes advantage of biological knowledge to improve the gene partition of sequence datasets. Additionally, its implementation incorporates parallelization techniques to reduce computational costs when handling very large input datasets.
Subject(s)
Evolution, Molecular , Gene Library , Software , Algorithms , Base Sequence , Computational Biology/methods , DNA, Mitochondrial/genetics , Genes , Humans
ABSTRACT
Tissue factor (TF) is critical for the activation of blood coagulation. TF function is regulated by the amount of externalised phosphatidylserine (PS) and phosphatidylethanolamine (PE) on the surface of the cell in which it is expressed. We investigated the role of PS and PE in fibroblast TF function. Fibroblasts expressed 6-9 × 10^4 TF molecules/cell but had low specific activity for FXa generation. We confirmed that this was associated with minimal externalised PS and PE and characterised for the first time the molecular species of PS/PE, demonstrating that these differed from those found in platelets. Mechanical damage of fibroblasts, used to simulate vascular injury, increased externalised PS/PE and led to a 7-fold increase in FXa generation that was inhibited by annexin V and an anti-TF antibody. Platelet-derived extracellular vesicles (EVs), which did not express TF, supported minimal FVIIa-dependent FXa generation but substantially increased fibroblast TF activity. This enhancement in fibroblast TF activity could also be achieved using synthetic liposomes comprising 10% PS without TF. In conclusion, despite high levels of surface TF expression, healthy fibroblasts express low levels of external-facing PS and PE, limiting their ability to generate FXa. Addition of platelet-derived TF-negative EVs or artificial liposomes enhanced fibroblast TF activity in a PS-dependent manner. These findings contribute information about the mechanisms that control TF function in the fibroblast membrane.
Subject(s)
Extracellular Vesicles/metabolism , Fibroblasts/metabolism , Thromboplastin/metabolism , Blood Coagulation , Blood Platelets/metabolism , Cell Line , Humans , Liposomes/metabolism , Phosphatidylethanolamines/metabolism , Phosphatidylserines/metabolism , Thromboplastin/genetics
ABSTRACT
BACKGROUND: Common chromosome 9p21 single nucleotide polymorphisms (SNPs) increase coronary heart disease risk, independent of traditional lipid risk factors. However, lipids comprise large numbers of structurally related molecules not measured in traditional risk measurements, and many have inflammatory bioactivities. Here, we applied lipidomic and genomic approaches to 3 model systems to characterize lipid metabolic changes in common Chr9p21 SNPs, which confer ≈30% elevated coronary heart disease risk associated with altered expression of ANRIL, a long ncRNA. METHODS: Untargeted and targeted lipidomics were applied to plasma from NPHSII (Northwick Park Heart Study II) homozygotes for AA or GG in rs10757274, followed by correlation and network analysis. To identify candidate genes, transcriptomic data from shRNA downregulation of ANRIL in HEK-293 cells were mined. Transcriptional data from vascular smooth muscle cells differentiated from induced pluripotent stem cells of individuals with/without Chr9p21 risk, nonrisk alleles, and corresponding knockout isogenic lines were next examined. Last, an in-silico analysis of miRNAs was conducted to identify how ANRIL might control lysoPL (lysophospholipid)/lysoPA (lysophosphatidic acid) genes. RESULTS: Elevated risk GG correlated with reduced lysoPLs, lysoPA, and ATX (autotaxin). Five other risk SNPs did not show this phenotype. LysoPL-lysoPA interconversion was uncoupled from ATX in GG plasma, suggesting metabolic dysregulation. Significantly altered expression of several lysoPL/lysoPA metabolizing enzymes was found in HEK cells lacking ANRIL. In the vascular smooth muscle cell data set, the presence of risk alleles was associated with altered expression of several lysoPL/lysoPA enzymes. Deletion of the risk locus reversed the expression of several lysoPL/lysoPA genes to nonrisk haplotype levels. Genes that were altered across both cell data sets were DGKA, MBOAT2, PLPP1, and LPL.
The in-silico analysis identified 4 ANRIL-regulated miRNAs that control lysoPL genes as miR-186-3p, miR-34a-3p, miR-122-5p, and miR-34a-5p. CONCLUSIONS: A Chr9p21 risk SNP associates with complex alterations in immune-bioactive phospholipids and their metabolism. Lipid metabolites and genomic pathways associated with coronary heart disease pathogenesis in Chr9p21 and ANRIL-associated disease are demonstrated.
Subject(s)
Chromosomes, Human, Pair 9/genetics , Coronary Disease , Lysophospholipids , Phosphoric Diester Hydrolases , Polymorphism, Single Nucleotide , Chromosomes, Human, Pair 9/metabolism , Coronary Disease/genetics , Coronary Disease/metabolism , HEK293 Cells , Humans , Lysophospholipids/genetics , Lysophospholipids/metabolism , Male , Middle Aged , Phosphoric Diester Hydrolases/genetics , Phosphoric Diester Hydrolases/metabolism
ABSTRACT
The life cycle of spirochetes of the genus Borrelia includes complex networks of vertebrates and ticks. The tripartite association of Borrelia-vertebrate-tick has proved ecologically successful for these bacteria, which have become some of the most prominent tick-borne pathogens in the northern hemisphere. To keep evolutionary pace with its double-host life history, Borrelia must adapt to the evolutionary pressures exerted by both sets of hosts. In this review, we attempt to reconcile functional, phylogenetic, and ecological perspectives to propose a coherent scenario of Borrelia evolution. Available empirical information supports that the association of Borrelia with ticks is very old. The major split between the tick families Argasidae-Ixodidae (dated some 230-290 Mya) resulted in most relapsing fever (Rf) species being restricted to Argasidae and few associated with Ixodidae. A further key event produced the diversification of the Lyme borreliosis (Lb) species: the radiation of ticks of the genus Ixodes from the primitive stock of Ixodidae (around 217 Mya). The ecological interactions of Borrelia demonstrate that Argasidae-transmitted Rf species remain restricted to small niches of one tick species and few vertebrates. The evolutionary pressures on this group are consequently low, and speciation processes seem to be driven by geographical isolation. In contrast to Rf, Lb species circulate in nested networks of dozens of tick species and hundreds of vertebrate species. This greater variety confers a remarkably variable pool of evolutionary pressures, resulting in large speciation of the Lb group, where different species adapt to circulate through different groups of vertebrates. Available data, based on ospA and multilocus sequence typing (including eight concatenated in-house genes) phylogenetic trees, suggest that ticks could constitute a secondary bottleneck that contributes to Lb specialization. 
Both sets of adaptive pressures contribute to the resilience of highly adaptable meta-populations of bacteria.