Search | BVS Ecuador Search Portal

1.

Pangenome graph layout by Path-Guided Stochastic Gradient Descent.

Heumos, Simon; Guarracino, Andrea; Schmelzle, Jan-Niklas M; Li, Jiajie; Zhang, Zhiru; Hagmann, Jörg; Nahnsen, Sven; Prins, Pjotr; Garrison, Erik.

Bioinformatics ; 40(7)2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38960860

ABSTRACT

MOTIVATION: The increasing availability of complete genomes demands models to study genomic variability within entire populations. Pangenome graphs capture the full genomic similarity and diversity between multiple genomes. In order to understand them, we need to see them. For visualization, we need a human-readable graph layout: a graph embedding in a low-dimensional (e.g. two-dimensional) depiction. Due to a pangenome graph's potentially excessive size, this is a significant challenge. RESULTS: In response, we introduce a novel graph layout algorithm: the Path-Guided Stochastic Gradient Descent (PG-SGD). PG-SGD uses the genomes, represented in the pangenome graph as paths, as an embedded positional system to sample genomic distances between pairs of nodes. This avoids the quadratic cost seen in previous versions of graph drawing by SGD. We show that our implementation efficiently computes the low-dimensional layouts of gigabase-scale pangenome graphs, unveiling their biological features. AVAILABILITY AND IMPLEMENTATION: We integrated PG-SGD in ODGI, which is released as free software under the MIT open source license. Source code is available at https://github.com/pangenome/odgi.
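The idea of sampling node pairs along paths can be illustrated with a small sketch. The code below is a toy under stated assumptions, not the ODGI implementation: the function name `path_guided_sgd`, its input shapes, the learning-rate schedule, and the uniform pair sampling are all hypothetical choices made for illustration. It assumes every node has a positive length and every path has at least two steps. Each update picks a path, picks two steps on it, takes their distance along the path as the target, and nudges the two nodes so that their layout distance moves toward that target.

```python
import math
import random


def path_guided_sgd(paths, node_len, iters=50, samples_per_iter=200, seed=0):
    """Toy 2D layout in which layout distance tracks distance along paths.

    paths:    list of node-id lists, one per genome (hypothetical input shape)
    node_len: dict mapping node id -> sequence length in nucleotides
    """
    rng = random.Random(seed)

    # Offset of every path step, in nucleotides from the start of its path.
    offsets = []
    for path in paths:
        acc, pos = 0, []
        for node in path:
            pos.append(acc)
            acc += node_len[node]
        offsets.append(pos)

    # Random initial coordinates for every node.
    xy = {n: [rng.random(), rng.random()] for n in node_len}

    # Learning rate decays exponentially from d_max^2 down to a small value.
    d_max = max(pos[-1] for pos in offsets)
    eta_max, eta_min = float(d_max) ** 2, 0.01
    decay = math.log(eta_max / eta_min) / max(iters - 1, 1)

    for t in range(iters):
        eta = eta_max * math.exp(-decay * t)
        for _ in range(samples_per_iter):
            # Sample a path, then two distinct steps on that path.
            p = rng.randrange(len(paths))
            i, j = rng.sample(range(len(paths[p])), 2)
            a, b = paths[p][i], paths[p][j]
            if a == b:
                continue  # the path revisits the same node
            target = abs(offsets[p][i] - offsets[p][j])
            dx = xy[a][0] - xy[b][0]
            dy = xy[a][1] - xy[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Step size is capped at 1 so one update never overshoots the target.
            mu = min(eta / (target * target), 1.0)
            r = mu * (dist - target) / (2.0 * dist)
            xy[a][0] -= r * dx
            xy[a][1] -= r * dy
            xy[b][0] += r * dx
            xy[b][1] += r * dy
    return xy


# One genome walking five 10-bp nodes in order.
layout = path_guided_sgd([[0, 1, 2, 3, 4]], {n: 10 for n in range(5)})
end_to_end = math.hypot(layout[0][0] - layout[4][0], layout[0][1] - layout[4][1])
print(round(end_to_end, 1))  # compare with 40, the distance along the path
```

Because each update touches only one sampled pair, the work per iteration is set by the number of samples rather than by the number of node pairs, which is the property the abstract contrasts with the quadratic cost of earlier approaches.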

Subject(s)

Algorithms , Software , Humans , Genomics/methods , Computer Graphics , Genome

2.

mlf-core: a framework for deterministic machine learning.

Heumos, Lukas; Ehmele, Philipp; Kuhn Cuellar, Luis; Menden, Kevin; Miller, Edmund; Lemke, Steffen; Gabernet, Gisela; Nahnsen, Sven.

Bioinformatics ; 39(4)2023 04 03.

Article in English | MEDLINE | ID: mdl-37004171

ABSTRACT

MOTIVATION: Machine learning has shown extensive growth in recent years and is now routinely applied to sensitive areas. To allow appropriate verification of predictive models before deployment, models must be deterministic. Solely fixing all random seeds is not sufficient for deterministic machine learning, as major machine learning libraries default to the usage of nondeterministic algorithms based on atomic operations. RESULTS: Various machine learning libraries released deterministic counterparts to the nondeterministic algorithms. We evaluated the effect of these algorithms on determinism and runtime. Based on these results, we formulated a set of requirements for deterministic machine learning and developed a new software solution, the mlf-core ecosystem, which aids machine learning projects to meet and keep these requirements. We applied mlf-core to develop deterministic models in various biomedical fields including a single-cell autoencoder with TensorFlow, a PyTorch-based U-Net model for liver-tumor segmentation in computed tomography scans, and a liver cancer classifier based on gene expression profiles with XGBoost. AVAILABILITY AND IMPLEMENTATION: The complete data together with the implementations of the mlf-core ecosystem and use case models are available at https://github.com/mlf-core.
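Why fixed seeds alone are not enough can be shown with a small sketch. Floating-point addition is not associative, so when parallel atomic additions land in a different order from run to run, the accumulated value can differ in its last bits even though every input is identical. The helper `reduce_in_order` below is a hypothetical stand-in for such an accumulation; it is not part of mlf-core or of any machine learning library.

```python
def reduce_in_order(values, order):
    """Sum `values` by visiting indices in `order`, like atomic adds landing in that order."""
    total = 0.0
    for idx in order:
        total += values[idx]
    return total


values = [0.1, 0.2, 0.3]

# Two possible arrival orders for the same three contributions.
forward = reduce_in_order(values, [0, 1, 2])
backward = reduce_in_order(values, [2, 1, 0])
print(forward == backward)  # False: 0.6000000000000001 vs 0.6

# A deterministic counterpart pins the order, so repeated runs agree bit for bit.
fixed_order = sorted(range(len(values)))
print(reduce_in_order(values, fixed_order) == reduce_in_order(values, fixed_order))  # True
```

No random number generator appears in this example, which is the point: the discrepancy comes from the order of accumulation, something a seed does not control.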

Subject(s)

Ecosystem , Software , Machine Learning , Algorithms , X-Ray Computed Tomography

3.

ODGI: understanding pangenome graphs.

Guarracino, Andrea; Heumos, Simon; Nahnsen, Sven; Prins, Pjotr; Garrison, Erik.

Bioinformatics ; 38(13): 3319-3326, 2022 06 27.

Article in English | MEDLINE | ID: mdl-35552372

ABSTRACT

MOTIVATION: Pangenome graphs provide a complete representation of the mutual alignment of collections of genomes. These models offer the opportunity to study the entire genomic diversity of a population, including structurally complex regions. Nevertheless, analyzing hundreds of gigabase-scale genomes using pangenome graphs is difficult as it is not well-supported by existing tools. Hence, fast and versatile software is required to ask advanced questions to such data in an efficient way. RESULTS: We wrote Optimized Dynamic Genome/Graph Implementation (ODGI), a novel suite of tools that implements scalable algorithms and has an efficient in-memory representation of DNA pangenome graphs in the form of variation graphs. ODGI supports pre-built graphs in the Graphical Fragment Assembly format. ODGI includes tools for detecting complex regions, extracting pangenomic loci, removing artifacts, exploratory analysis, manipulation, validation and visualization. Its fast parallel execution facilitates routine pangenomic tasks, as well as pipelines that can quickly answer complex biological questions of gigabase-scale pangenome graphs. AVAILABILITY AND IMPLEMENTATION: ODGI is published as free software under the MIT open source license. Source code can be downloaded from https://github.com/pangenome/odgi and documentation is available at https://odgi.readthedocs.io. ODGI can be installed via Bioconda https://bioconda.github.io/recipes/odgi/README.html or GNU Guix https://github.com/pangenome/odgi/blob/master/guix.scm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
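To make the input format concrete, here is a minimal sketch of reading a variation graph from Graphical Fragment Assembly (GFA v1) text. It is an illustration only and does not use or reproduce ODGI's code: the functions `parse_gfa` and `spell_path` and the toy graph are hypothetical. The sketch relies only on the tab-separated `S` (segment), `L` (link), and `P` (path) records of GFA v1 and ignores optional tags and overlaps.

```python
def parse_gfa(text):
    """Collect segments (S), links (L), and paths (P) from GFA v1 text."""
    segments, links, paths = {}, [], {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # (from, from_orientation, to, to_orientation)
            links.append((fields[1], fields[2], fields[3], fields[4]))
        elif fields[0] == "P":
            # Each step is a segment name followed by '+' or '-'.
            paths[fields[1]] = [(s[:-1], s[-1]) for s in fields[2].split(",")]
    return segments, links, paths


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def spell_path(segments, steps):
    """Concatenate segment sequences along a path, reverse-complementing '-' steps."""
    out = []
    for name, orient in steps:
        seq = segments[name]
        out.append(seq if orient == "+" else seq.translate(_COMPLEMENT)[::-1])
    return "".join(out)


# A two-genome toy graph with one SNP bubble (made-up data for illustration).
rows = [
    ["H", "VN:Z:1.0"],
    ["S", "1", "ACG"], ["S", "2", "T"], ["S", "3", "A"], ["S", "4", "GGC"],
    ["L", "1", "+", "2", "+", "0M"], ["L", "1", "+", "3", "+", "0M"],
    ["L", "2", "+", "4", "+", "0M"], ["L", "3", "+", "4", "+", "0M"],
    ["P", "ref", "1+,2+,4+", "*"], ["P", "alt", "1+,3+,4+", "*"],
]
gfa_text = "\n".join("\t".join(r) for r in rows)

segments, links, paths = parse_gfa(gfa_text)
print(spell_path(segments, paths["ref"]))  # ACGTGGC
print(spell_path(segments, paths["alt"]))  # ACGAGGC
```

The two paths share segments 1 and 4 and differ only in the middle segment, which is how a variation graph stores both genomes without duplicating their common sequence.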

Subject(s)

Genome , Software , Genomics , Algorithms , Documentation

4.

A data management infrastructure for the integration of imaging and omics data in life sciences.

Kuhn Cuellar, Luis; Friedrich, Andreas; Gabernet, Gisela; de la Garza, Luis; Fillinger, Sven; Seyboldt, Adrian; Koch, Tobias; Zur Oven-Krockhaus, Sven; Wanke, Friederike; Richter, Sandra; Thaiss, Wolfgang M; Horger, Marius; Malek, Nisar; Harter, Klaus; Bitzer, Michael; Nahnsen, Sven.

BMC Bioinformatics ; 23(1): 61, 2022 Feb 07.

Article in English | MEDLINE | ID: mdl-35130839

ABSTRACT

BACKGROUND: As technical developments in omics and biomedical imaging increase the throughput of data generation in life sciences, the need for information systems capable of managing heterogeneous digital assets is increasing. In particular, there is a need for systems that support the findability, accessibility, interoperability, and reusability (FAIR) principles of scientific data management. RESULTS: We propose a Service Oriented Architecture approach for integrated management and analysis of multi-omics and biomedical imaging data. Our architecture introduces an image management system into a FAIR-supporting, web-based platform for omics data management. Interoperable metadata models and middleware components implement the required data management operations. The resulting architecture allows for FAIR management of omics and imaging data, facilitating metadata queries from software applications. The applicability of the proposed architecture is demonstrated using two technical proofs of concept and a use case, aimed at molecular plant biology and clinical liver cancer research, which integrate various imaging and omics modalities. CONCLUSIONS: We describe a data management architecture for integrated, FAIR-supporting management of omics and biomedical imaging data, and exemplify its applicability for basic biology research and clinical studies. We anticipate that FAIR data management systems for multi-modal data repositories will play a pivotal role in data-driven research, including studies which leverage advanced machine learning methods, as the joint analysis of omics and imaging data, in conjunction with phenotypic metadata, becomes not only desirable but necessary to derive novel insights into biological processes.

Subject(s)

Biological Science Disciplines , Data Management , Information Management , Metadata , Software

5.

Transcriptome Profiling Identifies TIGIT as a Marker of T-Cell Exhaustion in Liver Cancer.

Ostroumov, Dmitrij; Duong, Steven; Wingerath, Jessica; Woller, Norman; Manns, Michael P; Timrott, Kai; Kleine, Moritz; Ramackers, Wolf; Roessler, Stephanie; Nahnsen, Sven; Czemmel, Stefan; Dittrich-Breiholz, Oliver; Eggert, Tobias; Kühnel, Florian; Wirth, Thomas C.

Hepatology ; 73(4): 1399-1418, 2021 04.

Article in English | MEDLINE | ID: mdl-32716559

ABSTRACT

BACKGROUND AND AIMS: Programmed death 1 (PD-1) checkpoint inhibition has shown promising results in patients with hepatocellular carcinoma, inducing objective responses in approximately 20% of treated patients. The roles of other coinhibitory molecules and their individual contributions to T-cell dysfunction in liver cancer, however, remain largely elusive. APPROACH AND RESULTS: We performed a comprehensive mRNA profiling of cluster of differentiation 8 (CD8) T cells in a murine model of autochthonous liver cancer by comparing the transcriptome of naive, functional effector, and exhausted, tumor-specific CD8 T cells. Subsequently, we functionally validated the role of identified genes in T-cell exhaustion. Our results reveal a unique transcriptome signature of exhausted T cells and demonstrate that up-regulation of the inhibitory immune receptor T-cell immunoreceptor with immunoglobulin and immunoreceptor tyrosine-based inhibitor motif domains (TIGIT) represents a hallmark in the process of T-cell exhaustion in liver cancer. Compared to PD-1, expression of TIGIT more reliably identified exhausted CD8 T cells at different stages of their differentiation. In combination with PD-1 inhibition, targeting of TIGIT with antagonistic antibodies resulted in synergistic inhibition of liver cancer growth in immunocompetent mice. Finally, we demonstrate expression of TIGIT on tumor-infiltrating CD8 T cells in tissue samples of patients with hepatocellular carcinoma and intrahepatic cholangiocarcinoma and identify two subsets of patients based on differential expression of TIGIT on tumor-specific T cells. CONCLUSIONS: Our transcriptome analysis provides a valuable resource for the identification of key pathways involved in T-cell exhaustion in patients with liver cancer and identifies TIGIT as a potential target in checkpoint combination therapies.

Subject(s)

Bile Duct Neoplasms/genetics , Bile Duct Neoplasms/immunology , CD8-Positive T-Lymphocytes/immunology , Hepatocellular Carcinoma/genetics , Hepatocellular Carcinoma/immunology , Cholangiocarcinoma/genetics , Cholangiocarcinoma/immunology , Liver Neoplasms/genetics , Liver Neoplasms/immunology , Immunologic Receptors/genetics , Transcriptome , Aged , Animals , Bile Duct Neoplasms/pathology , Tumor Biomarkers/genetics , Hepatocellular Carcinoma/drug therapy , Hepatocellular Carcinoma/pathology , Tumor Cell Line , Cholangiocarcinoma/pathology , Animal Disease Models , Combination Drug Therapy , Female , Gene Expression Profiling/methods , Humans , Immune Checkpoint Inhibitors/therapeutic use , Liver Neoplasms/drug therapy , Liver Neoplasms/pathology , Tumor-Infiltrating Lymphocytes/immunology , Male , Mice , Inbred C57BL Mice , Middle Aged , Programmed Cell Death 1 Receptor/antagonists & inhibitors , Immunologic Receptors/antagonists & inhibitors , Treatment Outcome , Tumor Burden/drug effects

6.

Multiomics surface receptor profiling of the NCI-60 tumor cell panel uncovers novel theranostics for cancer immunotherapy.

Heumos, Simon; Dehn, Sandra; Bräutigam, Konstantin; Codrea, Marius C; Schürch, Christian M; Lauer, Ulrich M; Nahnsen, Sven; Schindler, Michael.

Cancer Cell Int ; 22(1): 311, 2022 Oct 11.

Article in English | MEDLINE | ID: mdl-36221114

ABSTRACT

BACKGROUND: Immunotherapy with immune checkpoint inhibitors (ICI) has revolutionized cancer therapy. However, therapeutic targeting of inhibitory T cell receptors such as PD-1 not only initiates a broad immune response against tumors, but also causes severe adverse effects. An ideal future stratified immunotherapy would interfere with cancer-specific cell surface receptors only. METHODS: To identify such candidates, we profiled the surface receptors of the NCI-60 tumor cell panel via flow cytometry. The resulting surface receptor expression data were integrated into proteomic and transcriptomic NCI-60 datasets applying a sophisticated multiomics multiple co-inertia analysis (MCIA). This allowed us to identify surface profiles for skin, brain, colon, kidney, and bone marrow derived cell lines and cancer entity-specific cell surface receptor biomarkers for colon and renal cancer. RESULTS: For colon cancer, identified biomarkers are CD15, CD104, CD324, CD326, CD49f, and for renal cancer, CD24, CD26, CD106 (VCAM1), EGFR, SSEA-3 (B3GALT5), SSEA-4 (TMCC1), TIM1 (HAVCR1), and TRA-1-60R (PODXL). Further data mining revealed that CD106 (VCAM1) in particular is a promising novel immunotherapeutic target for the treatment of renal cancer. CONCLUSION: Altogether, our innovative multiomics analysis of the NCI-60 panel represents a highly valuable resource for uncovering surface receptors that could be further exploited for diagnostic and therapeutic purposes in the context of cancer immunotherapy.

7.

Downregulation of TGR5 (GPBAR1) in biliary epithelial cells contributes to the pathogenesis of sclerosing cholangitis.

Reich, Maria; Spomer, Lina; Klindt, Caroline; Fuchs, Katharina; Stindt, Jan; Deutschmann, Kathleen; Höhne, Johanna; Liaskou, Evaggelia; Hov, Johannes R; Karlsen, Tom H; Beuers, Ulrich; Verheij, Joanne; Ferreira-Gonzalez, Sofia; Hirschfield, Gideon; Forbes, Stuart J; Schramm, Christoph; Esposito, Irene; Nierhoff, Dirk; Fickert, Peter; Fuchs, Claudia Daniela; Trauner, Michael; García-Beccaria, María; Gabernet, Gisela; Nahnsen, Sven; Mallm, Jan-Philipp; Vogel, Marina; Schoonjans, Kristina; Lautwein, Tobias; Köhrer, Karl; Häussinger, Dieter; Luedde, Tom; Heikenwalder, Mathias; Keitel, Verena.

J Hepatol ; 75(3): 634-646, 2021 09.

Article in English | MEDLINE | ID: mdl-33872692

ABSTRACT

BACKGROUND & AIMS: Primary sclerosing cholangitis (PSC) is characterized by chronic inflammation and progressive fibrosis of the biliary tree. The bile acid receptor TGR5 (GPBAR1) is found on biliary epithelial cells (BECs), where it promotes secretion, proliferation and tight junction integrity. Thus, we speculated that changes in TGR5-expression in BECs may contribute to PSC pathogenesis. METHODS: TGR5-expression and -localization were analyzed in PSC livers and liver tissue, isolated bile ducts and BECs from Abcb4-/-, Abcb4-/-/Tgr5Tg and ursodeoxycholic acid (UDCA)- or 24-norursodeoxycholic acid (norUDCA)-fed Abcb4-/- mice. The effects of IL8/IL8 homologues on TGR5 mRNA and protein levels were studied. BEC gene expression was analyzed by single-cell transcriptomics (scRNA-seq) from distinct mouse models. RESULTS: TGR5 mRNA expression and immunofluorescence staining intensity were reduced in BECs of PSC and Abcb4-/- livers, in Abcb4-/- extrahepatic bile ducts, but not in intrahepatic macrophages. No changes in TGR5 BEC fluorescence intensity were detected in liver tissue of other liver diseases, including primary biliary cholangitis. Incubation of BECs with IL8/IL8 homologues, but not with other cytokines, reduced TGR5 mRNA and protein levels. BECs from Abcb4-/- mice had lower levels of phosphorylated Erk and higher expression levels of Icam1, Vcam1 and Tgfβ2. Overexpression of Tgr5 abolished the activated inflammatory phenotype characteristic of Abcb4-/- BECs. NorUDCA-feeding restored TGR5-expression levels in BECs in Abcb4-/- livers. CONCLUSIONS: Reduced TGR5 levels in BECs from patients with PSC and Abcb4-/- mice promote development of a reactive BEC phenotype, aggravate biliary injury and thus contribute to the pathogenesis of sclerosing cholangitis. Restoration of biliary TGR5-expression levels represents a previously unknown mechanism of action of norUDCA.
LAY SUMMARY: Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease associated with progressive inflammation of the bile duct, leading to fibrosis and end-stage liver disease. Bile acid (BA) toxicity may contribute to the development and disease progression of PSC. TGR5 is a membrane-bound receptor for BAs, which is found on bile ducts and protects bile ducts from BA toxicity. In this study, we show that TGR5 levels were reduced in bile ducts from PSC livers and in bile ducts from a genetic mouse model of PSC. Our investigations indicate that lower levels of TGR5 in bile ducts may contribute to PSC development and progression. Furthermore, treatment with norUDCA, a drug currently being tested in a phase III trial for PSC, restored TGR5 levels in biliary epithelial cells.

Subject(s)

Biliary Tract/drug effects , Sclerosing Cholangitis/genetics , Down-Regulation/drug effects , G-Protein-Coupled Receptors/drug effects , Animals , Biliary Tract/metabolism , Sclerosing Cholangitis/drug therapy , Sclerosing Cholangitis/physiopathology , Animal Disease Models , Down-Regulation/genetics , Down-Regulation/physiology , Epithelial Cells/drug effects , Epithelial Cells/metabolism , Epithelial Cells/physiology , Liver/drug effects , Liver/pathology , Mice , G-Protein-Coupled Receptors/metabolism , Virulence Factors

8.

Genetic evolution of in situ follicular neoplasia to aggressive B-cell lymphoma of germinal center subtype.

Vogelsberg, Antonio; Steinhilber, Julia; Mankel, Barbara; Federmann, Birgit; Schmidt, Janine; Montes-Mojarro, Ivonne A; Hüttl, Katrin; Rodriguez-Pinilla, Maria; Baskaran, Praveen; Nahnsen, Sven; Piris, Miguel A; Ott, German; Quintanilla-Martinez, Leticia; Bonzheim, Irina; Fend, Falko.

Haematologica ; 106(10): 2673-2681, 2021 10 01.

Article in English | MEDLINE | ID: mdl-32855278

ABSTRACT

In situ follicular neoplasia (ISFN) is the earliest morphologically identifiable precursor of follicular lymphoma (FL). Although it is genetically less complex than FL and has low risk for progression, ISFN already harbors secondary genetic alterations, in addition to the defining t(14;18)(q32;q21) translocation. FL, in turn, frequently progresses to diffuse large B-cell lymphoma (DLBCL) or high-grade B-cell lymphoma (HGBL). By BCL2 staining of available reactive lymphoid tissue obtained at any time point in patients with aggressive B-cell lymphoma (BCL), we identified ten paired cases of ISFN and DLBCL/HGBL, including six de novo tumors and four tumors transformed from FL as an intermediate step, and investigated their clonal evolution using microdissection and next-generation sequencing. A clonal relationship between ISFN and aggressive BCL was established by immunoglobulin and/or BCL2 rearrangements and/or the demonstration of shared somatic mutations for all ten cases. Targeted sequencing revealed CREBBP, KMT2D, EZH2, TNFRSF14 and BCL2 as the genes most frequently mutated already in ISFN. Based on the distribution of private and shared mutations, two patterns of clonal evolution were evident. In most cases, the aggressive lymphoma, ISFN and, when present, FL revealed divergent evolution from a common progenitor, whereas linear evolution with sequential accumulation of mutations was less frequent. In conclusion, we demonstrate for the first time that t(14;18)+ aggressive BCL can arise from ISFN without clinically evident FL as an intermediate step and that during this progression, branched evolution is common.

Subject(s)

Follicular Lymphoma , Diffuse Large B-Cell Lymphoma , Molecular Evolution , Germinal Center , Humans , Follicular Lymphoma/genetics , Genetic Translocation

9.

Ten simple rules for providing effective bioinformatics research support.

Kumuthini, Judit; Chimenti, Michael; Nahnsen, Sven; Peltzer, Alexander; Meraba, Rebone; McFadyen, Ross; Wells, Gordon; Taylor, Deanne; Maienschein-Cline, Mark; Li, Jian-Liang; Thimmapuram, Jyothi; Murthy-Karuturi, Radha; Zass, Lyndon.

PLoS Comput Biol ; 16(3): e1007531, 2020 03.

Article in English | MEDLINE | ID: mdl-32214318

ABSTRACT

Life scientists are increasingly turning to high-throughput sequencing technologies in their research programs, owing to the enormous potential of these methods. In parallel, the number of core facilities that provide bioinformatics support is also increasing. Notably, the generation of complex large datasets has necessitated the development of bioinformatics support core facilities that aid laboratory scientists with cost-effective and efficient data management, analysis, and interpretation. In this article, we address the challenges (related to communication, good laboratory practice, and data handling) that may be encountered in core support facilities when providing bioinformatics support, drawing on our own experiences working as support bioinformaticians on multidisciplinary research projects. Most importantly, the article proposes a list of guidelines that outline how these challenges can be preemptively avoided and effectively managed to increase the value of outputs to the end user, covering the entire research project lifecycle, including experimental design, data analysis, and management (i.e., sharing and storage). In addition, we highlight the importance of clear and transparent communication, comprehensive preparation, appropriate handling of samples and data using monitoring systems, and the employment of appropriate tools and standard operating procedures to provide effective bioinformatics support.

Subject(s)

Computational Biology/economics , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Biomedical Research/economics , Biomedical Research/methods , Communication , Computational Biology/standards , High-Throughput Nucleotide Sequencing/economics , High-Throughput Nucleotide Sequencing/standards , Humans , Research Design/standards

10.

Ring1b-dependent epigenetic remodelling is an essential prerequisite for pancreatic carcinogenesis.

Benitz, Simone; Straub, Tobias; Mahajan, Ujjwal Mukund; Mutter, Jurik; Czemmel, Stefan; Unruh, Tatjana; Wingerath, Britta; Deubler, Sabrina; Fahr, Lisa; Cheng, Tao; Nahnsen, Sven; Bruns, Philipp; Kong, Bo; Raulefs, Susanne; Ceyhan, Güralp O; Mayerle, Julia; Steiger, Katja; Esposito, Irene; Kleeff, Jörg; Michalski, Christoph W; Regel, Ivonne.

Gut ; 68(11): 2007-2018, 2019 11.

Article in English | MEDLINE | ID: mdl-30954952

ABSTRACT

BACKGROUND AND AIMS: Besides well-defined genetic alterations, the dedifferentiation of mature acinar cells is an important prerequisite for pancreatic carcinogenesis. Acinar-specific genes controlling cell homeostasis are extensively downregulated during cancer development; however, the underlying mechanisms are poorly understood. Now, we devised a novel in vitro strategy to determine genome-wide dynamics in the epigenetic landscape in pancreatic carcinogenesis. DESIGN: With our in vitro carcinogenic sequence, we performed global gene expression analysis and ChIP sequencing for the histone modifications H3K4me3, H3K27me3 and H2AK119ub. Followed by a comprehensive bioinformatic approach, we captured gene clusters with extensive epigenetic and transcriptional remodelling. Relevance of Ring1b-catalysed H2AK119ub in acinar cell reprogramming was studied in an inducible Ring1b knockout mouse model. CRISPR/Cas9-mediated Ring1b ablation as well as drug-induced Ring1b inhibition were functionally characterised in pancreatic cancer cells. RESULTS: The epigenome is vigorously modified during pancreatic carcinogenesis, defining cellular identity. Particularly, regulatory acinar cell transcription factors are epigenetically silenced by the Ring1b-catalysed histone modification H2AK119ub in acinar-to-ductal metaplasia and pancreatic cancer cells. Ring1b knockout mice showed greatly impaired acinar cell dedifferentiation and pancreatic tumour formation due to a retained expression of acinar differentiation genes. Depletion or drug-induced inhibition of Ring1b promoted tumour cell reprogramming towards a less aggressive phenotype. CONCLUSIONS: Our data provide substantial evidence that the epigenetic silencing of acinar cell fate genes is a mandatory event in the development and progression of pancreatic cancer. Targeting the epigenetic repressor Ring1b could offer new therapeutic options.

Subject(s)

Acinar Cells/pathology , Genetic Epigenesis/physiology , Pancreatic Neoplasms/etiology , Pancreatic Neoplasms/pathology , Polycomb Repressive Complex 1/physiology , Ubiquitin-Protein Ligases/physiology , Animals , Carcinogenesis , Cell Culture Techniques , Animal Disease Models , Mice , Knockout Mice

11.

OpenMS: a flexible open-source software platform for mass spectrometry data analysis.

Röst, Hannes L; Sachsenberg, Timo; Aiche, Stephan; Bielow, Chris; Weisser, Hendrik; Aicheler, Fabian; Andreotti, Sandro; Ehrlich, Hans-Christian; Gutenbrunner, Petra; Kenar, Erhan; Liang, Xiao; Nahnsen, Sven; Nilse, Lars; Pfeuffer, Julianus; Rosenberger, George; Rurik, Marc; Schmitt, Uwe; Veit, Johannes; Walzer, Mathias; Wojnar, David; Wolski, Witold E; Schilling, Oliver; Choudhary, Jyoti S; Malmström, Lars; Aebersold, Ruedi; Reinert, Knut; Kohlbacher, Oliver.

Nat Methods ; 13(9): 741-8, 2016 08 30.

Article in English | MEDLINE | ID: mdl-27575624

ABSTRACT

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.

Subject(s)

Computational Biology/methods , Electronic Data Processing , Mass Spectrometry/methods , Proteomics/methods , Software , Aging/blood , Blood Proteins/chemistry , Humans , Molecular Sequence Annotation , Proteogenomics/methods , Workflow

12.

Challenges of big data integration in the life sciences.

Fillinger, Sven; de la Garza, Luis; Peltzer, Alexander; Kohlbacher, Oliver; Nahnsen, Sven.

Anal Bioanal Chem ; 411(26): 6791-6800, 2019 Oct.

Article in English | MEDLINE | ID: mdl-31463515

ABSTRACT

Big data has been reported to be revolutionizing many areas of life, including science. It summarizes data that is unprecedentedly large, rapidly generated, heterogeneous, and hard to accurately interpret. This availability has also brought new challenges: How to properly annotate data to make it searchable? What are the legal and ethical hurdles when sharing data? How to store data securely, preventing loss and corruption? The life sciences are not the only disciplines that must align themselves with big data requirements to keep up with the latest developments. The large hadron collider, for instance, generates research data at a pace beyond any current biomedical research center. There are three recent major coinciding events that explain the emergence of big data in the context of research: the technological revolution for data generation, the development of tools for data analysis, and a conceptual change towards open science and data. The true potential of big data lies in pattern discovery in large datasets, as well as the formulation of new models and hypotheses. Confirmation of the existence of the Higgs boson, for instance, is one of the most recent triumphs of big data analysis in physics. Digital representations of biological systems have become more comprehensive. This, in combination with advances in machine learning, creates exciting new research possibilities. In this paper, we review the state of big data in bioanalytical research and provide an overview of the guidelines for its proper usage.

Subject(s)

Big Data , Computational Biology , Animals , Biological Science Disciplines , Biomedical Research , Humans , Information Dissemination , Information Storage and Retrieval , Machine Learning , Automated Pattern Recognition

13.

Mass-Spectrometry-Based Proteomics Reveals Organ-Specific Expression Patterns To Be Used as Forensic Evidence.

Dammeier, Sascha; Nahnsen, Sven; Veit, Johannes; Wehner, Frank; Ueffing, Marius; Kohlbacher, Oliver.

J Proteome Res ; 15(1): 182-92, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26593679

ABSTRACT

Standard forensic procedures to examine bullets after an exchange of fire include a mechanical or ballistic reconstruction of the event. While it is routine to identify which projectile hit a subject by DNA analysis of biological material on the surface of the projectile, it is rather difficult to determine which projectile caused the lethal injury, which is often the crucial point with regard to legal proceedings. With respect to fundamental law it is the duty of the public authority to make every endeavor to solve every homicide case. To improve forensic examinations, we present a forensic proteomic method to investigate biological material from a projectile's surface and determine the tissues traversed by it. To obtain a range of relevant samples, different major bovine organs were penetrated with projectiles experimentally. After tryptic "on-surface" digestion, mass-spectrometry-based proteome analysis, and statistical data analysis, we were able to achieve a cross-validated organ classification accuracy of >99%. Different types of anticipated external variables exhibited no prominent influence on the findings. In addition, shooting experiments were performed to validate the results. Finally, we show that these concepts could be applied to a real case of murder to substantially improve the forensic reconstruction.

Subject(s)

Proteome/chemistry , Gunshot Wounds/diagnosis , Animals , Brain Injuries/diagnosis , Cattle , High Pressure Liquid Chromatography , Fatal Outcome , Female , Forensic Ballistics , Homicide/legislation & jurisprudence , Humans , Male , Middle Aged , Organ Specificity , Proteome/isolation & purification , Proteomics/methods , Tandem Mass Spectrometry

14.

Personalized peptide vaccine-induced immune response associated with long-term survival of a metastatic cholangiocarcinoma patient.

Löffler, Markus W; Chandran, P Anoop; Laske, Karoline; Schroeder, Christopher; Bonzheim, Irina; Walzer, Mathias; Hilke, Franz J; Trautwein, Nico; Kowalewski, Daniel J; Schuster, Heiko; Günder, Marc; Carcamo Yañez, Viviana A; Mohr, Christopher; Sturm, Marc; Nguyen, Huu-Phuc; Riess, Olaf; Bauer, Peter; Nahnsen, Sven; Nadalin, Silvio; Zieker, Derek; Glatzle, Jörg; Thiel, Karolin; Schneiderhan-Marra, Nicole; Clasen, Stephan; Bösmüller, Hans; Fend, Falko; Kohlbacher, Oliver; Gouttefangeas, Cécile; Stevanovic, Stefan; Königsrainer, Alfred; Rammensee, Hans-Georg.

J Hepatol ; 65(4): 849-855, 2016 10.

Article in English | MEDLINE | ID: mdl-27397612

ABSTRACT

BACKGROUND & AIMS: We report a novel experimental immunotherapeutic approach in a patient with metastatic intrahepatic cholangiocarcinoma. In the 5-year course of the disease, the initial tumor mass, two local recurrences and a lung metastasis were surgically removed. Lacking alternative treatment options, aiming at the induction of anti-tumor T cell responses, we initiated a personalized multi-peptide vaccination, based on in-depth analysis of tumor antigens (immunopeptidome) and sequencing. METHODS: Tumors were characterized by immunohistochemistry, next-generation sequencing and mass spectrometry of HLA ligands. RESULTS: Although several tumor-specific neo-epitopes were predicted in silico, none could be validated by mass spectrometry. Instead, a personalized multi-peptide vaccine containing non-mutated tumor-associated epitopes was designed and applied. Immunomonitoring showed vaccine-induced T cell responses to three out of seven peptides administered. The pulmonary metastasis resected after start of vaccination showed strong immune cell infiltration and perforin positivity, in contrast to the previous lesions. The patient remains clinically healthy, without any radiologically detectable tumors since March 2013 and the vaccination is continued. CONCLUSIONS: This remarkable clinical course encourages formal clinical studies on adjuvant personalized peptide vaccination in cholangiocarcinoma. LAY SUMMARY: Metastatic cholangiocarcinomas, cancers that originate from the liver bile ducts, have very limited treatment options and a fatal prognosis. We describe a novel therapeutic approach in such a patient using a personalized multi-peptide vaccine. This vaccine, developed based on the characterization of the patient's tumor, evoked detectable anti-tumor immune responses, associated with long-term tumor-free survival.

Subject(s)

Cholangiocarcinoma , Bile Duct Neoplasms , Cancer Vaccines , Humans , Local Neoplasm Recurrence , Subunit Vaccines

15.

qcML: an exchange format for quality control metrics from mass spectrometry experiments.

Walzer, Mathias; Pernas, Lucia Espona; Nasso, Sara; Bittremieux, Wout; Nahnsen, Sven; Kelchtermans, Pieter; Pichler, Peter; van den Toorn, Henk W P; Staes, An; Vandenbussche, Jonathan; Mazanek, Michael; Taus, Thomas; Scheltema, Richard A; Kelstrup, Christian D; Gatto, Laurent; van Breukelen, Bas; Aiche, Stephan; Valkenborg, Dirk; Laukens, Kris; Lilley, Kathryn S; Olsen, Jesper V; Heck, Albert J R; Mechtler, Karl; Aebersold, Ruedi; Gevaert, Kris; Vizcaíno, Juan Antonio; Hermjakob, Henning; Kohlbacher, Oliver; Martens, Lennart.

Mol Cell Proteomics ; 13(8): 1905-13, 2014 Aug.

Article in English | MEDLINE | ID: mdl-24760958

ABSTRACT

Quality control is increasingly recognized as a crucial aspect of mass spectrometry-based proteomics. Several recent papers discuss relevant parameters for quality control and present applications to extract these from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we also provide tools for the calculation of a wide range of quality metrics as well as a database format and interconversion tools, so that existing LIMS systems can easily add relational storage of the quality control data to their existing schema. Here we describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.

Subject(s)

Mass Spectrometry/standards , Software , Protein Databases , Programming Languages , Proteomics/standards , Quality Control

16.

Platforms and Pipelines for Proteomics Data Analysis and Management.

Codrea, Marius Cosmin; Nahnsen, Sven.

Adv Exp Med Biol ; 919: 203-215, 2016.

Article in English | MEDLINE | ID: mdl-27975218

ABSTRACT

Since mass spectrometry was introduced as the core technology for large-scale analysis of the proteome, the speed of data acquisition, dynamic ranges of measurements, and data quality are continuously improving. These improvements are triggered by regular launches of new methodologies and instruments.

Subject(s)

Computational Biology/methods , Data Mining/methods , Protein Databases , Mass Spectrometry/methods , Proteins/analysis , Proteome , Proteomics/methods , Algorithms , Animals , High-Throughput Screening Assays , Humans , Software , Workflow

17.

Tools for label-free peptide quantification.

Nahnsen, Sven; Bielow, Chris; Reinert, Knut; Kohlbacher, Oliver.

Mol Cell Proteomics ; 12(3): 549-56, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23250051

ABSTRACT

The increasing scale and complexity of quantitative proteomics studies complicate subsequent analysis of the acquired data. Untargeted label-free quantification, based either on feature intensities or on spectral counting, is a method that scales particularly well with respect to the number of samples. It is thus an excellent alternative to labeling techniques. In order to profit from this scalability, however, data analysis has to cope with large amounts of data, process them automatically, and do a thorough statistical analysis in order to achieve reliable results. We review the state of the art with respect to computational tools for label-free quantification in untargeted proteomics. The two fundamental approaches are feature-based quantification, relying on the summed-up mass spectrometric intensity of peptides, and spectral counting, which relies on the number of MS/MS spectra acquired for a certain protein. We review the current algorithmic approaches underlying some widely used software packages and briefly discuss the statistical strategies for analyzing the data.

Subject(s)

Peptides/analysis , Proteome/analysis , Proteomics/methods , Tandem Mass Spectrometry/methods , Algorithms , Animals , Humans , Reproducibility of Results , Software

18.

How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis.

Perelo, Louisa Wessels; Gabernet, Gisela; Straub, Daniel; Nahnsen, Sven.

NAR Genom Bioinform ; 6(1): lqae020, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38456178

ABSTRACT

Data analysis tools are continuously changed and improved over time. In order to test how these changes influence the comparability between analyses, the outputs of different workflow options of the nf-core/rnaseq pipeline were compared. Five different pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins of the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used for comparative analyses of the different tool and version settings of the pipeline. An overlap of 85% for differential gene classification between pipelines could be shown. Genes interpreted with a bias were mostly those present at lower concentration. Also, the number of isoforms and exons per gene were determinants. Previous pipeline versions using featureCounts showed a higher sensitivity to detect one-isoform genes like ERCC. To ensure data comparability in long-term analysis series, it is advisable either to stay with the pipeline version the series was initialized with or to run both versions during a transition time in order to ensure that the target genes are addressed the same way.

19.

nf-core/airrflow: an adaptive immune receptor repertoire analysis workflow employing the Immcantation framework.

Gabernet, Gisela; Marquez, Susanna; Bjornson, Robert; Peltzer, Alexander; Meng, Hailong; Aron, Edel; Lee, Noah Yann; Jensen, Cole; Ladd, David; Hanssen, Friederike; Heumos, Simon; Yaari, Gur; Kowarik, Markus C; Nahnsen, Sven; Kleinstein, Steven H.

bioRxiv ; 2024 Jan 28.

Article in English | MEDLINE | ID: mdl-38293151

ABSTRACT

Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) is a valuable experimental tool to study the immune state in health and following immune challenges such as infectious diseases, (auto)immune diseases, and cancer. Several tools have been developed to reconstruct B cell and T cell receptor sequences from AIRR-seq data and infer B and T cell clonal relationships. However, currently available tools offer limited parallelization across samples, scalability or portability to high-performance computing infrastructures. To address this need, we developed nf-core/airrflow, an end-to-end bulk and single-cell AIRR-seq processing workflow which integrates the Immcantation Framework following BCR and TCR sequencing data analysis best practices. The Immcantation Framework is a comprehensive toolset, which allows the processing of bulk and single-cell AIRR-seq data from raw read processing to clonal inference. nf-core/airrflow is written in Nextflow and is part of the nf-core project, which collects community contributed and curated Nextflow workflows for a wide variety of analysis tasks. We assessed the performance of nf-core/airrflow on simulated sequencing data with sequencing errors and show example results with real datasets. To demonstrate the applicability of nf-core/airrflow to the high-throughput processing of large AIRR-seq datasets, we validated and extended previously reported findings of convergent antibody responses to SARS-CoV-2 by analyzing 97 COVID-19 infected individuals and 99 healthy controls, including a mixture of bulk and single-cell sequencing datasets. Using this dataset, we extended the convergence findings to 20 additional subjects, highlighting the applicability of nf-core/airrflow to validate findings in small in-house cohorts with reanalysis of large publicly available AIRR datasets. nf-core/airrflow is available free of charge, under the MIT license on GitHub (https://github.com/nf-core/airrflow). Detailed documentation and example results are available on the nf-core website at (https://nf-co.re/airrflow).

20.

Scalable and efficient DNA sequencing analysis on different compute infrastructures aiding variant discovery.

Hanssen, Friederike; Garcia, Maxime U; Folkersen, Lasse; Pedersen, Anders Sune; Lescai, Francesco; Jodoin, Susanne; Miller, Edmund; Seybold, Matthias; Wacker, Oskar; Smith, Nicholas; Gabernet, Gisela; Nahnsen, Sven.

NAR Genom Bioinform ; 6(2): lqae031, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38666213

ABSTRACT

DNA variation analysis has become indispensable in many aspects of modern biomedicine, most prominently in the comparison of normal and tumor samples. Thousands of samples are collected in local sequencing efforts and public databases, requiring highly scalable, portable, and automated workflows for streamlined processing. Here, we present nf-core/sarek 3, a well-established, comprehensive variant calling and annotation pipeline for germline and somatic samples. It is suitable for any genome with a known reference. We present a full rewrite of the original pipeline, showing a significant reduction of storage requirements by using the CRAM format and of runtime by increasing intra-sample parallelization. Together, these lead to a 70% cost reduction in commercial clouds, enabling users to do large-scale and cross-platform data analysis while keeping costs and CO2 emissions low. The code is available at https://nf-co.re/sarek.
