1.
Clin Epidemiol ; 14: 59-70, 2022.
Article in English | MEDLINE | ID: mdl-35082531

ABSTRACT

BACKGROUND: The International Society of Urological Pathology (ISUP) revised the Gleason system in 2005 and 2014. The impact of these changes on prostate cancer (PCa) prognostication remains unclear. OBJECTIVE: To evaluate if the ISUP 2014 Gleason score (GS) predicts PCa death better than the pre-2005 GS, and if additional histopathological information can further improve PCa death prediction. PATIENTS AND METHODS: We conducted a case-control study nested among men in the National Prostate Cancer Register of Sweden diagnosed with non-metastatic PCa 1998-2015. We included 369 men who died from PCa (cases) and 369 men who did not (controls). Two uro-pathologists centrally re-reviewed biopsy ISUP 2014 Gleason grading, poorly formed glands, cribriform pattern, comedonecrosis, perineural invasion, intraductal, ductal and mucinous carcinoma, percentage Gleason 4, inflammation, high-grade prostatic intraepithelial neoplasia (HGPIN) and post-atrophic hyperplasia. Pre-2005 GS was back-transformed using i) information on cribriform pattern and/or poorly formed glands and ii) the diagnostic GS from the registry. Models were developed using Firth logistic regression and compared in terms of discrimination (AUC). RESULTS: The ISUP 2014 GS (AUC = 0.808) performed better than the pre-2005 GS when back-transformed using only cribriform pattern (AUC = 0.785) or both cribriform and poorly formed glands (AUC = 0.792), but not when back-transformed using only poorly formed glands (AUC = 0.800). Similarly, the ISUP 2014 GS performed better than the diagnostic GS (AUC = 0.808 vs 0.781). Comedonecrosis (AUC = 0.811), HGPIN (AUC = 0.810) and number of cores with ≥50% cancer (AUC = 0.810) predicted PCa death independently of the ISUP 2014 GS. CONCLUSION: The Gleason Grading revisions have improved PCa death prediction, likely due to classifying cribriform patterns, rather than poorly formed glands, as Gleason 4. Comedonecrosis, HGPIN and number of cores with ≥50% cancer further improve PCa death discrimination slightly.
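
As an illustration of the discrimination measure reported in this abstract, the following Python sketch computes an AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counting one half. The risk scores are invented for the example and the function name `auc` is hypothetical; the study derived its predictions from Firth logistic regression models, which are not reproduced here.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: share of case/control pairs in which the case
    has the higher score (a tie counts as half a pair)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    favourable = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                favourable += 1.0
            elif case == control:
                favourable += 0.5
    return favourable / (len(case_scores) * len(control_scores))


# Made-up predicted risks, NOT data from the study.
example_cases = [0.9, 0.8, 0.6, 0.4]
example_controls = [0.7, 0.5, 0.3, 0.2]
example_auc = auc(example_cases, example_controls)
```

Comparing two grading systems in this way amounts to computing one such value per model on the same set of cases and controls.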

2.
Stud Health Technol Inform ; 281: 113-117, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042716

ABSTRACT

The FAIR Principles are a set of recommendations that aim to underpin knowledge discovery and integration by making the research outcomes Findable, Accessible, Interoperable and Reusable. These guidelines encourage the accurate recording and exchange of data, coupled with contextual information about their creation, expressed in domain-specific standards and machine-readable formats. This paper analyses the potential support to FAIRness of the openEHR specifications and reference implementation, by theoretically assessing their compliance with each of the 15 FAIR principles. Our study highlights how the openEHR approach, thanks to its computable semantics-oriented design, is inherently FAIR-enabling and is a promising implementation strategy for creating FAIR-compliant Clinical Data Repositories (CDRs).


Subject(s)
Electronic Health Records , Semantics
3.
Sci Rep ; 11(1): 3257, 2021 02 05.
Article in English | MEDLINE | ID: mdl-33547336

ABSTRACT

Virtual microscopy (VM) holds promise to reduce subjectivity as well as intra- and inter-observer variability for the histopathological evaluation of prostate cancer. We evaluated (i) the repeatability (intra-observer agreement) and reproducibility (inter-observer agreement) of the 2014 Gleason grading system and other selected features using standard light microscopy (LM) and an internally developed VM system, and (ii) the interchangeability of LM and VM. Two uro-pathologists reviewed 413 cores from 60 Swedish men diagnosed with non-metastatic prostate cancer 1998-2014. Reviewer 1 performed two reviews using both LM and VM. Reviewer 2 performed one review using both methods. The intra- and inter-observer agreement within and between LM and VM were assessed using Cohen's kappa and Bland and Altman's limits of agreement. We found good repeatability and reproducibility for both LM and VM, as well as interchangeability between LM and VM, for primary and secondary Gleason pattern, Gleason Grade Groups, poorly formed glands, cribriform pattern and comedonecrosis, but not for the percentage of Gleason pattern 4. Our findings confirm the non-inferiority of VM compared to LM. The repeatability and reproducibility of the percentage of Gleason pattern 4 were poor regardless of the method used, warranting further investigation and improvement before it is used in clinical practice.


Subject(s)
Prostate/pathology , Prostatic Neoplasms/pathology , Biopsy , Humans , Male , Microscopy , Neoplasm Grading , Neoplasm Staging , Observer Variation , Reproducibility of Results
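
The agreement statistic named in the abstract of entry 3, Cohen's kappa, can be sketched in a few lines of Python. This is a minimal unweighted version under stated assumptions: the function name `cohens_kappa` and the two rating lists are hypothetical, and the abstract does not say whether the study used a weighted variant.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(ratings_a)
    # Observed agreement: share of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented grade-group readings for eight cores, NOT data from the study.
reviewer_1 = [1, 1, 2, 2, 3, 3, 1, 2]
reviewer_2 = [1, 1, 2, 3, 3, 3, 1, 1]
example_kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Here the reviewers agree on 6 of 8 cores (0.75), chance agreement is 21/64, and kappa works out to 27/43, roughly 0.63.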
4.
Stud Health Technol Inform ; 270: 443-447, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570423

ABSTRACT

Current high-throughput sequencing technologies allow us to acquire entire genomes in a very short time and at a relatively sustainable cost, resulting in an increasing diffusion of genetic testing capabilities in specialized clinical laboratories and research centers. In contrast, the impact of genomic information on clinical decisions is still limited, as effective interpretation is a challenging task. From the technological point of view, genomic data are large, have a complex granular nature and depend strongly on the computational steps of the generation and processing workflows. This article introduces our work to create the openEHR Genomic Project and the set of genomic information models we developed to capture this complex structure and to preserve data provenance efficiently in a machine-readable format. The models support clinical actionability of data by improving their quality, fostering interoperability and laying the basis for reusability.


Subject(s)
Electronic Health Records , Genomics , Genetic Testing , Workflow
5.
Am J Epidemiol ; 188(6): 1165-1173, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30976789

ABSTRACT

In this paper, we describe the Prognostic Factors for Mortality in Prostate Cancer (ProMort) study and use it to demonstrate how the weighted likelihood method can be used in nested case-control studies to estimate both relative and absolute risks in the competing-risks setting. ProMort is a case-control study nested within the National Prostate Cancer Register (NPCR) of Sweden, comprising 1,710 men diagnosed with low- or intermediate-risk prostate cancer between 1998 and 2011 who died from prostate cancer (cases) and 1,710 matched controls. Cause-specific hazard ratios and cumulative incidence functions (CIFs) for prostate cancer death were estimated in ProMort using weighted flexible parametric models and compared with the corresponding estimates from the NPCR cohort. We further drew 1,500 random nested case-control subsamples of the NPCR cohort and quantified the bias in the hazard ratio and CIF estimates. Finally, we compared the ProMort estimates with those obtained by augmenting competing-risks cases and by augmenting both competing-risks cases and controls. The hazard ratios for prostate cancer death estimated in ProMort were comparable to those in the NPCR. The hazard ratios for dying from other causes were biased, which introduced bias in the CIFs estimated in the competing-risks setting. When augmenting both competing-risks cases and controls, the bias was reduced.


Subject(s)
Prostatic Neoplasms/mortality , Age Factors , Aged , Case-Control Studies , Humans , Male , Middle Aged , Neoplasm Grading , Neoplasm Staging , Prognosis , Proportional Hazards Models , Prostate-Specific Antigen , Prostatic Neoplasms/therapy , Risk Assessment , Risk Factors , Sweden/epidemiology
6.
Bioinformatics ; 35(19): 3752-3760, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30851093

ABSTRACT

MOTIVATION: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. RESULTS: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and the development of scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens the way to new types of large-scale integrative science. AVAILABILITY AND IMPLEMENTATION: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Analysis , Metabolomics , Computational Biology , Software , Workflow
7.
Int J Med Inform ; 120: 147-156, 2018 12.
Article in English | MEDLINE | ID: mdl-30409340

ABSTRACT

PURPOSE: The increasing usage of high-throughput sequencing in personalized medicine brings new challenges to the realm of healthcare informatics. Patient records need to accommodate data of unprecedented size and complexity as well as keep track of their production process. In this work we present a solution for integrating genomic data into electronic health records via openEHR archetypes. METHODS: We use the popular Variant Call Format as the base format to represent genetic test results within openEHR. We evaluate existing openEHR archetypes to determine what can be extended or specialized and what needs to be developed ex novo. RESULTS: Eleven new archetypes have been developed, while an existing one has been specialized to represent genomic data. We show their applicability to rare genetic diseases and compare our approach to HL7 FHIR. CONCLUSION: The proposed model allows genetic test results to be represented in health records in a structured format. It supports different levels of abstraction, allowing both automated processing and clinical decision support. It is extensible via external references, making it possible to keep track of data provenance and adapt to future domain changes.


Subject(s)
Decision Support Systems, Clinical , Electronic Health Records/statistics & numerical data , Genetic Variation , Genomics/methods , Medical Informatics Applications , Models, Theoretical , Rare Diseases/genetics , Electronic Health Records/organization & administration , Genetic Testing , Genome, Human , Humans , Rare Diseases/diagnosis , Rare Diseases/therapy , Systems Integration
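
To make concrete what the Variant Call Format mentioned in the abstract of entry 7 provides as a base format, the sketch below splits one VCF data line into its eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It is only an illustration: the function name `parse_vcf_record` and the resulting dictionary layout are assumptions made for this example and do not reflect the structure of the openEHR archetypes described in the paper. The sample line follows the style of the examples in the VCF specification.

```python
VCF_FIXED_COLUMNS = ("chrom", "pos", "id", "ref", "alt", "qual", "filter", "info")


def parse_vcf_record(line):
    """Turn one tab-separated VCF data line into a plain dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_FIXED_COLUMNS):
        raise ValueError("a VCF data line needs at least 8 tab-separated fields")
    # Genotype columns beyond the eight fixed ones are ignored in this sketch.
    record = dict(zip(VCF_FIXED_COLUMNS, fields))
    record["pos"] = int(record["pos"])
    record["alt"] = record["alt"].split(",")  # several ALT alleles are comma-separated
    info = {}
    if record["info"] != ".":
        for item in record["info"].split(";"):
            key, separator, value = item.partition("=")
            # INFO entries without '=' are flags; store them as True.
            info[key] = value if separator else True
    record["info"] = info
    return record


example_line = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
example_record = parse_vcf_record(example_line)
```

A structured record of this kind is the sort of input that a health-record model can then map onto its own fields.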
8.
IEEE Trans Biomed Eng ; 65(12): 2713-2719, 2018 12.
Article in English | MEDLINE | ID: mdl-29993423

ABSTRACT

OBJECTIVE: Electroencephalography (EEG) is widely employed in the study of sleep disorders. This paper exploits the identification of cyclic alternating patterns (CAPs), a periodic ubiquitous phenomenon nested in the sleep stages, to analyze the EEG spectral coherence in subjects affected by nocturnal frontal lobe epilepsy (NFLE) and healthy controls. METHODS: For each EEG recording, we extracted several 4-s CAP A1 subtype time series. We analyzed the coherence between each pair of electrodes for each individual to obtain its distribution for each frequency range of interest and to investigate differences between cases and controls. In addition, the imaginary and real parts of the spectral coherence were calculated and plotted to assess their likelihood of segregation into different classes and anatomical regions. RESULTS: The results of this study suggest a relevant frontal-temporal neural circuitry difference between individuals affected by epilepsy and controls. CONCLUSION: This supports the observation that, though highly variable, a broad range of executive, cognitive and attentional deficits observed in subjects affected by NFLE might depend on frontal-temporal altered networking. SIGNIFICANCE: The investigation of EEG activity in the domain of the complex sleep architecture represents a challenging topic in neurophysiology and needs new methods to explore the manifold aspects of sleep. This work aims to provide a simple method to distinguish NFLE from healthy subjects from a functional connectivity point of view and to explore the possibility of using a smaller EEG channel set to support diagnosis.


Subject(s)
Electroencephalography/methods , Epilepsy/diagnosis , Polysomnography/methods , Signal Processing, Computer-Assisted , Sleep Stages/physiology , Adult , Humans
9.
Biopreserv Biobank ; 16(2): 97-105, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29359962

ABSTRACT

The known challenge of underutilization of data and biological material from biorepositories as potential resources for medical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability, entitled FAIR Principles (Findability, Accessibility, Interoperability, and Reusability), are likely to address only parts of the problem. In this article, we argue that biological material and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks. We propose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and their reuse on all levels, and (3) privacy-respecting approaches for working with the human material and data. These FAIR-Health principles should then be applied to both the biological material and data. We also propose the development of common guidelines for cloud architectures, due to the unprecedented growth of volume and breadth of medical data generation, as well as the associated need to process the data efficiently.


Subject(s)
Biological Specimen Banks , Confidentiality/standards , Databases, Factual/standards , Information Dissemination/methods , Biological Specimen Banks/organization & administration , Biological Specimen Banks/standards , Guidelines as Topic , Humans
10.
F1000Res ; 6, 2017.
Article in English | MEDLINE | ID: mdl-29043062

ABSTRACT

Metabolomics, the youngest of the major omics technologies, is supported by an active community of researchers and infrastructure developers across Europe. To coordinate and focus efforts around infrastructure building for metabolomics within Europe, a workshop on the "Future of metabolomics in ELIXIR" was organised at Frankfurt Airport in Germany. This one-day strategic workshop involved representatives of ELIXIR Nodes, members of the PhenoMeNal consortium developing an e-infrastructure that supports workflow-based metabolomics analysis pipelines, and experts from the international metabolomics community. The workshop established metabolite identification as the critical area, where a maximal impact of computational metabolomics and data management on other fields could be achieved. In particular, the existing four ELIXIR Use Cases, where the metabolomics community - both industry and academia - would benefit most, and which could be exhaustively mapped onto the current five ELIXIR Platforms were discussed. This opinion article is a call for support for a new ELIXIR metabolomics Use Case, which aligns with and complements the existing and planned ELIXIR Platforms and Use Cases.

11.
Bioinformatics ; 33(23): 3805-3807, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29036536

ABSTRACT

MOTIVATION: Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. RESULTS: With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container, the latter reducing installation effort to a minimum. AVAILABILITY AND IMPLEMENTATION: Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. CONTACT: marcoenrico.piras@crs4.it.


Subject(s)
Computational Biology/methods , Software , Workflow , Automation
12.
Phys Rev E ; 95(3-1): 030108, 2017 Mar.
Article in English | MEDLINE | ID: mdl-28415372

ABSTRACT

We model a set of point-to-point transports on a network as a system of polydisperse interacting self-avoiding walks (SAWs) over a finite square lattice. The ends of each SAW may be located either at random, uniformly distributed positions or with one end fixed at a lattice corner. The total energy of the system is computed as the sum over all SAWs, which may represent either the time needed to complete the transport over the network, or the resources needed to build the networking infrastructure. We focus especially on the second aspect by assigning a concave cost function to each site to encourage path overlap. A simulated annealing optimization, based on a modified Berg-Foerster-Aragao de Carvalho-Caracciolo-Froehlich (BFACF) algorithm developed for polymers, is used to probe the complex conformational substate structure at zero temperature. We characterize the average cost gains (and path-length variations) for increasing polymer density with respect to a Dijkstra routing and find a nonmonotonic behavior as recently found for random networks. We observe the emergence of ergodicity breaking and of nontrivial overlap distributions among replicas when switching from a convex to a concave cost function (e.g., x^{γ}, where x represents the node overlap). Finally, we show that the space of ground states for γ<1 is compatible with an ultrametric structure, as seen in many complex systems such as some spin glasses.
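
The effect of a concave site cost can be shown with a small Python sketch. It assumes, purely for illustration, that the total cost is the sum of x^γ over visited lattice sites, with x the number of paths through a site; the function name `total_cost`, the two path layouts and the γ values are invented and are not taken from the paper, whose energy definition and BFACF-based annealing are not reproduced here.

```python
from collections import Counter


def total_cost(paths, gamma):
    """Sum x**gamma over visited sites, where x counts the paths using a site."""
    overlap = Counter(site for path in paths for site in set(path))
    return sum(x ** gamma for x in overlap.values())


# Two transports routed on separate rows of a small square lattice.
separate = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 1), (1, 1), (2, 1)],
]
# The second transport takes a longer route in order to share three sites.
merged = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 1), (0, 0), (1, 0), (2, 0), (2, 1)],
]

concave_separate = total_cost(separate, 0.2)  # six sites, each used once
concave_merged = total_cost(merged, 0.2)      # 3 * 2**0.2 + 2
convex_merged = total_cost(merged, 2.0)       # 3 * 2**2 + 2
```

With γ = 0.2 the merged layout is cheaper than the separate one even though one path is longer, whereas with γ = 2 sharing sites is penalized. This is the trade-off between path length and overlap that the optimization explores.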

13.
PLoS One ; 11(12): e0168004, 2016.
Article in English | MEDLINE | ID: mdl-27936191

ABSTRACT

This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts openEHR formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.


Subject(s)
Computational Biology , Database Management Systems , Information Storage and Retrieval
14.
Genome Med ; 6(9): 67, 2014.
Article in English | MEDLINE | ID: mdl-25342980

ABSTRACT

The analysis of the genomic distribution of viral vector genomic integration sites is a key step in hematopoietic stem cell-based gene therapy applications, making it possible to assess both the safety and the efficacy of the treatment and to study the basic aspects of hematopoiesis and stem cell biology. Identifying vector integration sites requires ad-hoc bioinformatics tools with stringent requirements in terms of computational efficiency, flexibility, and usability. We developed VISPA (Vector Integration Site Parallel Analysis), a pipeline for automated integration site identification and annotation based on a distributed environment with a simple Galaxy web interface. VISPA was successfully used for the bioinformatics analysis of the follow-up of two lentiviral vector-based hematopoietic stem-cell gene therapy clinical trials. Our pipeline provides a reliable and efficient tool to assess the safety and efficacy of integrating vectors in clinical settings.

15.
Bioinformatics ; 30(19): 2816-7, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24928211

ABSTRACT

SUMMARY: BioBlend.objects is a new component of the BioBlend package, adding an object-oriented interface for the Galaxy REST-based application programming interface. It improves support for metacomputing on Galaxy entities by providing higher-level functionality and allowing users to more easily create programs to explore, query and create Galaxy datasets and workflows. AVAILABILITY AND IMPLEMENTATION: BioBlend.objects is available online at https://github.com/afgane/bioblend. The new object-oriented API is implemented by the galaxy/objects subpackage.


Subject(s)
Computational Biology/methods , Algorithms , Automation , Computer Graphics , Computer Systems , Programming Languages , Software , User-Computer Interface
16.
Bioinformatics ; 30(13): 1928-9, 2014 Jul 01.
Article in English | MEDLINE | ID: mdl-24618473

ABSTRACT

UNLABELLED: End-to-end next-generation sequencing microbiology data analysis requires a diversity of tools covering bacterial resequencing, de novo assembly, scaffolding, bacterial RNA-Seq, gene annotation and metagenomics. However, the construction of computational pipelines that use different software packages is difficult owing to a lack of interoperability, reproducibility and transparency. To overcome these limitations we present Orione, a Galaxy-based framework consisting of publicly available research software and specifically designed pipelines to build complex, reproducible workflows for next-generation sequencing microbiology data analysis. Enabling microbiology researchers to conduct their own custom analysis and data manipulation without software installation or programming, Orione provides new opportunities for data-intensive computational analyses in microbiology and metagenomics. AVAILABILITY AND IMPLEMENTATION: Orione is available online at http://orione.crs4.it.


Subject(s)
Software , High-Throughput Nucleotide Sequencing , Internet , Metagenomics , Microbiological Techniques , Reproducibility of Results
17.
Bioinformatics ; 30(1): 119-20, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24149054

ABSTRACT

SUMMARY: Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig's scalability over many computing nodes and illustrate its use with example scripts. AVAILABILITY AND IMPLEMENTATION: Available under the open source MIT license at http://sourceforge.net/projects/seqpig/


Subject(s)
High-Throughput Screening Assays/methods , Software Design
18.
Science ; 341(6148): 1233158, 2013 Aug 23.
Article in English | MEDLINE | ID: mdl-23845948

ABSTRACT

Metachromatic leukodystrophy (MLD) is an inherited lysosomal storage disease caused by arylsulfatase A (ARSA) deficiency. Patients with MLD exhibit progressive motor and cognitive impairment and die within a few years of symptom onset. We used a lentiviral vector to transfer a functional ARSA gene into hematopoietic stem cells (HSCs) from three presymptomatic patients who showed genetic, biochemical, and neurophysiological evidence of late infantile MLD. After reinfusion of the gene-corrected HSCs, the patients showed extensive and stable ARSA gene replacement, which led to high enzyme expression throughout hematopoietic lineages and in cerebrospinal fluid. Analyses of vector integrations revealed no evidence of aberrant clonal behavior. The disease did not manifest or progress in the three patients 7 to 21 months beyond the predicted age of symptom onset. These findings indicate that extensive genetic engineering of human hematopoiesis can be achieved with lentiviral vectors and that this approach may offer therapeutic benefit for MLD patients.


Subject(s)
Cerebroside-Sulfatase/genetics , Genetic Therapy/methods , Hematopoietic Stem Cell Transplantation , Hematopoietic Stem Cells/metabolism , Leukodystrophy, Metachromatic/therapy , Brain/pathology , DNA Damage , Follow-Up Studies , Genetic Engineering , Genetic Vectors/toxicity , Humans , Lentivirus , Leukodystrophy, Metachromatic/pathology , Magnetic Resonance Imaging , Transduction, Genetic , Treatment Outcome , Virus Integration
19.
Nat Methods ; 9(3): 245-53, 2012 Feb 28.
Article in English | MEDLINE | ID: mdl-22373911

ABSTRACT

Data-intensive research depends on tools that manage multidimensional, heterogeneous datasets. We built OME Remote Objects (OMERO), a software platform that enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light-microscopy, high-content-screening, electron-microscopy and even non-image-genotype data. OMERO is open-source software, available at http://openmicroscopy.org/.


Subject(s)
Database Management Systems , Databases, Factual , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Models, Biological , Software , User-Computer Interface , Animals , Biology/methods , Computer Simulation , Humans
20.
Bioinformatics ; 27(15): 2159-60, 2011 Aug 01.
Article in English | MEDLINE | ID: mdl-21697132

ABSTRACT

SUMMARY: SEAL is a scalable tool for short read pair mapping and duplicate removal. It computes mappings that are consistent with those produced by BWA and removes duplicates according to the same criteria employed by Picard MarkDuplicates. On a 16-node Hadoop cluster, it is capable of processing about 13 GB per hour in map+rmdup mode, while reaching a throughput of 19 GB per hour in mapping-only mode. AVAILABILITY: SEAL is available online at http://biodoop-seal.sourceforge.net/.


Subject(s)
Computational Biology/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Sequence Alignment/methods