Search | BVS España Search Portal

Quality control stress test for deep learning-based diagnostic model in digital pathology.

Schömig-Markiefka, Birgid; Pryalukhin, Alexey; Hulla, Wolfgang; Bychkov, Andrey; Fukuoka, Junya; Madabhushi, Anant; Achter, Viktor; Nieroda, Lech; Büttner, Reinhard; Quaas, Alexander; Tolkach, Yuri.

Mod Pathol ; 34(12): 2098-2108, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34168282

ABSTRACT

Digital pathology enables computational analysis of histological slides and automation of routine pathological tasks. Histological slides are very heterogeneous with respect to staining, section thickness, and artifacts arising during tissue processing, cutting, staining, and digitization. In this study, we digitally reproduce major types of artifacts. Using six datasets from four different institutions digitized by different scanner systems, we systematically explore the influence of artifacts on the accuracy of a pre-trained, validated, deep learning-based model for prostate cancer detection in histological slides. We provide evidence that any histological artifact, depending on its severity, can lead to a substantial loss in model performance. Strategies for preventing losses in diagnostic model accuracy in the presence of artifacts are warranted. Stress-testing of diagnostic models using synthetically generated artifacts might be an essential step during clinical validation of deep learning-based algorithms.

Subject(s)

Artifacts; Deep Learning; Image Processing, Computer-Assisted; Neural Networks, Computer; Pathology, Clinical/methods; Prostatic Neoplasms/diagnosis; Quality Control; Humans; Male; Prostatic Neoplasms/classification; Reproducibility of Results
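The stress-testing idea in the abstract above (apply a synthetic artifact at increasing severity and track model accuracy) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' code: `add_noise` stands in for one artifact type, `toy_model` is a hypothetical threshold classifier rather than a deep learning model, and images are small grayscale grids with values in [0, 1].

```python
import random


def add_noise(image, severity, rng):
    """Hypothetical artifact: add Gaussian noise (std dev = severity), clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, severity))) for px in row]
        for row in image
    ]


def toy_model(image):
    """Hypothetical stand-in classifier: predicts 1 if the mean intensity exceeds 0.5."""
    total = sum(sum(row) for row in image)
    count = sum(len(row) for row in image)
    return 1 if total / count > 0.5 else 0


def stress_test(model, images, labels, artifact, severities, seed=0):
    """Return {severity: accuracy} for a model evaluated on artifact-degraded images."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    results = {}
    for severity in severities:
        correct = sum(
            model(artifact(img, severity, rng)) == label
            for img, label in zip(images, labels)
        )
        results[severity] = correct / len(images)
    return results


if __name__ == "__main__":
    # Toy data: "dark" images labelled 0 and "bright" images labelled 1.
    images = [[[0.45] * 8 for _ in range(8)] for _ in range(20)]
    images += [[[0.55] * 8 for _ in range(8)] for _ in range(20)]
    labels = [0] * 20 + [1] * 20
    for severity, acc in stress_test(
        toy_model, images, labels, add_noise, [0.0, 0.1, 0.3, 0.6]
    ).items():
        print(f"severity={severity:.1f} accuracy={acc:.2f}")
```

In a real validation setting, the artifact function would be replaced by realistic simulations of each artifact type and the toy classifier by the diagnostic model under test; the loop over severities is the only part this sketch is meant to convey.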

iRODS metadata management for a cancer genome analysis workflow.

Nieroda, Lech; Maas, Lukas; Thiebes, Scott; Lang, Ulrich; Sunyaev, Ali; Achter, Viktor; Peifer, Martin.

BMC Bioinformatics ; 20(1): 29, 2019 Jan 15.

Article in English | MEDLINE | ID: mdl-30646845

ABSTRACT

BACKGROUND: The massive amounts of data from next generation sequencing (NGS) methods pose various challenges with respect to data security, storage and metadata management. While there is a broad range of data analysis pipelines, these challenges remain largely unaddressed to date. RESULTS: We describe the integration of the open-source metadata management system iRODS (Integrated Rule-Oriented Data System) with a cancer genome analysis pipeline in a high performance computing environment. The system allows for customized metadata attributes as well as fine-grained protection rules and is augmented by a user-friendly front-end for metadata input. This results in a robust, efficient end-to-end workflow that takes data security, central storage and unified metadata information into account. CONCLUSIONS: Integrating iRODS with an NGS data analysis pipeline is a suitable method for addressing the challenges of data security, storage and metadata management in NGS environments.

Subject(s)

Computing Methodologies; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Metadata; Neoplasms/genetics; Sequence Analysis, DNA/methods; Software; Computer Security; Humans; Polymorphism, Genetic; Workflow

Leveraging the power of high performance computing for next generation sequencing data analysis: tricks and twists from a high throughput exome workflow.

Kawalia, Amit; Motameny, Susanne; Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter.

PLoS One ; 10(5): e0126321, 2015.

Article in English | MEDLINE | ID: mdl-25942438

ABSTRACT

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in a rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. To run these analyses quickly, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files.

Subject(s)

Computational Biology/methods; Computing Methodologies; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Electronic Data Processing/methods; Humans; Workflow
