Results 1 - 20 of 23
1.
Opt Express ; 27(6): 8395-8413, 2019 Mar 18.
Article in English | MEDLINE | ID: mdl-31052658

ABSTRACT

An advanced transmit remote opto-antenna unit is proposed that accomplishes impedance matching between a photodetector and a low-profile antenna in a specified frequency bandwidth, without requiring an area-consuming matching network. This results in a highly compact design, which also avoids the losses and spurious radiation by such an electrically large matching circuit. Instead, the photodetector is almost directly connected to the antenna, which is designed as a conjugate load, such that the extracted and radiated power are optimized. The required input impedance for the antenna is obtained by adopting a half-mode air-filled substrate-integrated-waveguide topology, which also exhibits excellent radiation efficiency. The proposed unit omits electrical amplifiers and is, therefore, completely driven by the signal supplied by an optical fiber when deployed in an analog optical link, except for an externally supplied photodetector bias voltage. Such a highly cost-effective, power-efficient and reliable unit is an important step in making innovative wireless communication systems, which deploy extremely dense attocells of 15 cm × 15 cm, technically and economically feasible. As a validation, a prototype, operating in the Unlicensed National Information Infrastructure radio bands (5.15 GHz-5.85 GHz), is constructed and its radiation properties are characterized in free-space conditions. After normalizing with respect to the optical source's slope efficiency, a maximum boresight gain of 12.0 dBi and a -3 dB gain bandwidth of 1020 MHz (18.6 %) are observed.
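The conjugate-load idea in this abstract can be illustrated with the standard power-transfer mismatch factor. This is only a sketch: the detector impedance below is a hypothetical value chosen for illustration, not a figure from the paper, and `mismatch_factor` is an illustrative helper name.

```python
def mismatch_factor(z_source: complex, z_load: complex) -> float:
    """Fraction (0..1) of the source's available power that reaches the load.

    Standard result for a source with internal impedance z_source driving
    z_load: 4 * Rs * Rl / |Zs + Zl|^2, which equals 1 when Zl = conj(Zs).
    """
    return 4.0 * z_source.real * z_load.real / abs(z_source + z_load) ** 2


# Hypothetical photodetector output impedance (ohms); NOT taken from the paper.
z_detector = complex(10.0, -40.0)

# Antenna designed as the conjugate load versus a plain 50-ohm antenna.
matched = mismatch_factor(z_detector, z_detector.conjugate())
fifty_ohm = mismatch_factor(z_detector, complex(50.0, 0.0))

print(f"conjugate load: {matched:.3f} of available power")
print(f"50-ohm load:    {fifty_ohm:.3f} of available power")
```

With these assumed numbers, the conjugate load receives all of the available power, while the unmatched 50-ohm load receives well under half, which is the gap a matching network would otherwise have to close.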

2.
Bioinformatics ; 33(3): 461-463, 2017 Feb 01.
Article in English | MEDLINE | ID: mdl-28158465

ABSTRACT

Summary: We present a Cytoscape app for the ISMAGS algorithm, which can enumerate all instances of a motif in a graph, making optimal use of the motif's symmetries to make the search more efficient. The Cytoscape app provides a handy interface for this algorithm, which allows more efficient network analysis. Availability and Implementation: The Cytoscape app for ISMAGS can be freely downloaded from the Cytoscape App store http://apps.cytoscape.org/apps/ismags. Source code and documentation for ISMAGS are available at https://github.com/biointec/ismags. Source code and documentation for the Cytoscape app are available at https://gitlab.psb.ugent.be/thpar/ISMAGS_Cytoscape. Contacts: Pieter.Audenaert@intec.ugent.be or Yves.VanDePeer@psb.vib-ugent.be Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Software , Algorithms , Computer Graphics
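To illustrate why motif symmetries matter when enumerating instances, here is a minimal brute-force sketch. It is not the ISMAGS algorithm: ISMAGS uses symmetry-breaking constraints to prune the search itself, whereas this sketch enumerates every mapping and then keeps one canonical representative per symmetry class. The function names and the example graph are hypothetical.

```python
from itertools import permutations


def motif_automorphisms(nodes, edges):
    """All relabelings of the motif's nodes that preserve its edge set."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for perm in permutations(nodes):
        relabel = dict(zip(nodes, perm))
        mapped = {frozenset((relabel[u], relabel[v])) for u, v in edges}
        if mapped == edge_set:
            result.append(relabel)
    return result


def find_instances(graph_edges, motif_nodes, motif_edges):
    """Return each motif instance exactly once (non-induced matching)."""
    graph_edge_set = {frozenset(e) for e in graph_edges}
    graph_nodes = sorted({n for e in graph_edges for n in e})
    autos = motif_automorphisms(motif_nodes, motif_edges)
    instances = []
    for images in permutations(graph_nodes, len(motif_nodes)):
        mapping = dict(zip(motif_nodes, images))
        if not all(frozenset((mapping[u], mapping[v])) in graph_edge_set
                   for u, v in motif_edges):
            continue
        # Every automorphism yields a symmetric copy of the same instance;
        # keep only the lexicographically smallest copy.
        variants = [tuple(mapping[a[n]] for n in motif_nodes) for a in autos]
        if images == min(variants):
            instances.append(images)
    return instances


# Triangle motif searched in a complete graph on four nodes.
triangle_nodes = ["a", "b", "c"]
triangle_edges = [("a", "b"), ("b", "c"), ("a", "c")]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(find_instances(k4, triangle_nodes, triangle_edges))
```

The triangle has six automorphisms, so a search that ignores symmetry would report each of the four triangles six times.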
3.
BMC Bioinformatics ; 18(1): 374, 2017 Aug 18.
Article in English | MEDLINE | ID: mdl-28821237

ABSTRACT

BACKGROUND: Recently, many standalone applications have been proposed to correct sequencing errors in Illumina data. The key idea is that downstream analysis tools such as de novo genome assemblers benefit from a reduced error rate in the input data. Surprisingly, a systematic validation of this assumption using state-of-the-art assembly methods is lacking, even for recently published methods. RESULTS: For twelve recent Illumina error correction tools (EC tools) we evaluated both their ability to correct sequencing errors and their ability to improve de novo genome assembly in terms of contig size and accuracy. CONCLUSIONS: We confirm that most EC tools reduce the number of errors in sequencing data without introducing many new errors. However, we found that many EC tools suffer from poor performance in certain sequence contexts such as regions with low coverage or regions that contain short repeated or low-complexity sequences. Reads overlapping such regions are often ill-corrected in an inconsistent manner, leading to breakpoints in the resulting assemblies that are not present in assemblies obtained from uncorrected data. Resolving this systematic flaw in future EC tools could greatly improve the applicability of such tools.


Subject(s)
Genome , High-Throughput Nucleotide Sequencing , Algorithms , Animals , Bacteria/genetics , Caenorhabditis elegans/genetics , DNA/chemistry , DNA/metabolism , Drosophila/genetics , Humans , Sequence Alignment , DNA Sequence Analysis
4.
Bioelectromagnetics ; 38(4): 295-306, 2017 May.
Article in English | MEDLINE | ID: mdl-28240792

ABSTRACT

In the future, wireless radiofrequency (RF) telecommunications networks will provide users with gigabit-per-second data rates. Therefore, these networks are evolving toward hybrid networks, which will include commonly used macro- and microcells in combination with local ultra-high density access networks consisting of so-called attocells. The use of attocells requires a proper compliance assessment of exposure to RF electromagnetic radiation. This paper presents, for the first time, such a compliance assessment of an attocell operating at 3.5 GHz with an input power of 1 mW, based on both root-mean-squared electric field strength (Erms) and peak 10 g-averaged specific absorption rate (SAR10g) values. The Erms values near the attocell were determined using finite-difference time-domain (FDTD) simulations and measurements by a tri-axial probe. They were compared to the International Commission on Non-Ionizing Radiation Protection's (ICNIRP) reference levels. All measured and simulated Erms values above the attocell were below 5.9 V/m and lower than reference levels. The SAR10g values were obtained both from measurements in a homogeneous phantom, which resulted in an SAR10g of 9.7 mW/kg, and from FDTD simulations, which resulted in an SAR10g of 7.2 mW/kg. FDTD simulations of realistic exposure situations were executed using a heterogeneous phantom, which yielded SAR10g values lower than 2.8 mW/kg. The studied dosimetric quantities were in compliance with ICNIRP guidelines when the attocell was fed an input power <1 mW. The deployment of attocells is thus a feasible solution for providing broadband data transmission without drastically increasing personal RF exposure.


Subject(s)
Computer Communication Networks , Radiation Exposure/analysis , Radio Waves , Radiation Absorption , Humans , Theoretical Models , Imaging Phantoms , Wireless Technology
5.
Sensors (Basel) ; 17(7), 2017 Jul 11.
Article in English | MEDLINE | ID: mdl-28696393

ABSTRACT

As the IoT continues to grow over the coming years, resource-constrained devices and networks will see an increase in traffic as everything is connected in an open Web of Things (WoT). Performance- and function-enhancing features are difficult to provide in resource-constrained environments, but will gain importance if the WoT is to be scaled up successfully. For example, scalable open standards-based authentication and authorization will be important to manage access to the limited resources of constrained devices and networks. Additionally, features such as caching and virtualization may help further reduce the load on these constrained systems. This work presents the Secure Service Proxy (SSP): a constrained-network edge proxy with the goal of improving the performance and functionality of constrained RESTful environments. Our evaluations show that the proposed design reaches its goal by reducing the load on constrained devices while implementing a wide range of features as different adapters. Specifically, the results show that the SSP leads to significant savings in processing, network traffic, network delay and packet loss rates for constrained devices. As a result, the SSP helps to guarantee the proper operation of constrained networks as these networks form an ever-expanding Web of Things.

6.
BMC Bioinformatics ; 17: 76, 2016 Feb 09.
Article in English | MEDLINE | ID: mdl-26862054

ABSTRACT

BACKGROUND: Many algorithms have been developed to infer the topology of gene regulatory networks from gene expression data. These methods typically produce a ranking of links between genes with associated confidence scores, after which a certain threshold is chosen to produce the inferred topology. However, the structural properties of the predicted network do not resemble those typical for a gene regulatory network, as most algorithms only take into account connections found in the data and do not include known graph properties in their inference process. This lowers the prediction accuracy of these methods, limiting their usability in practice. RESULTS: We propose a post-processing algorithm, Netter, which is applicable to any confidence ranking of regulatory interactions obtained from a network inference method and which can use, inter alia, graphlets and several graph-invariant properties to re-rank the links into a more accurate prediction. To demonstrate the potential of our approach, we re-rank predictions of six different state-of-the-art algorithms using three simple network properties as optimization criteria and show that Netter can improve the predictions made on both artificially generated data as well as the DREAM4 and DREAM5 benchmarks. Additionally, the DREAM5 E. coli community prediction inferred from real expression data is further improved. Furthermore, Netter compares favorably to other post-processing algorithms and is not restricted to correlation-like predictions. Lastly, we demonstrate that the performance increase is robust for a wide range of parameter settings. Netter is available at http://bioinformatics.intec.ugent.be. CONCLUSIONS: Network inference from high-throughput data is a long-standing challenge. In this work, we present Netter, which can further refine network predictions based on a set of user-defined graph properties. Netter is a flexible system which can be applied in unison with any method producing a ranking from omics data. It can be tailored to specific prior knowledge by expert users but can also be applied in general use cases. In conclusion, we believe that Netter is an interesting second step in the network inference process to further increase the quality of prediction.


Subject(s)
Algorithms , Computational Biology/methods , Escherichia coli/genetics , Gene Regulatory Networks , Benchmarking , Gene Expression Regulation , Humans
7.
Bioinformatics ; 31(23): 3758-66, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-26254488

ABSTRACT

MOTIVATION: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. RESULTS: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. AVAILABILITY AND IMPLEMENTATION: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller CONTACT: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Plant Genome , Genetic Promoter Regions , DNA Sequence Analysis/methods , Base Sequence , Binding Sites , Conserved Sequence , Plant DNA/chemistry , Nucleotide Motifs , Sequence Alignment , Software , Transcription Factors/metabolism
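The idea of screening words over the IUPAC alphabet for conservation can be sketched as follows. This is a simplified illustration, not BLSSpeller itself (which is written in Java and scores conservation with the branch length score over a phylogenetic tree). Here conservation is reduced to listing the species whose promoter contains a match; the species names and sequences are made up.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def motif_positions(motif, sequence):
    """Start positions where the degenerate motif matches the sequence."""
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        if all(base in IUPAC[code] for code, base in zip(motif, window)):
            hits.append(start)
    return hits


def conserved_in(motif, promoters):
    """Names of the species whose promoter contains at least one match."""
    return sorted(name for name, seq in promoters.items()
                  if motif_positions(motif, seq))


# Hypothetical orthologous promoter fragments (illustration only).
promoters = {
    "species_a": "TTACGTAA",
    "species_b": "GGACATCC",
    "species_c": "TTTTTTTT",
}
print(motif_positions("ACRT", promoters["species_a"]))
print(conserved_in("ACRT", promoters))
```

Because no alignment is required, a motif that sits at different offsets in different promoters (position 2 in both fragments here, but it could be anywhere) is still detected as conserved.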
8.
Sensors (Basel) ; 16(7), 2016 Jul 21.
Article in English | MEDLINE | ID: mdl-27455262

ABSTRACT

The Internet of Things (IoT) is expanding rapidly to new domains in which embedded devices play a key role and gradually outnumber traditionally-connected devices. These devices are often constrained in their resources and are thus unable to run standard Internet protocols. The Constrained Application Protocol (CoAP) is a new alternative standard protocol that implements the same principles as the Hypertext Transfer Protocol (HTTP), but is tailored towards constrained devices. In many IoT application domains, devices need to be addressed in groups in addition to being addressable individually. Two main approaches are currently being proposed in the IoT community for CoAP-based group communication. The main difference between the two approaches lies in the underlying communication type: multicast versus unicast. In this article, we experimentally evaluate those two approaches using two wireless sensor testbeds and under different test conditions. We highlight the pros and cons of each of them and propose combining these approaches in a hybrid solution to better suit certain use case requirements. Additionally, we provide a solution for multicast-based group membership management using CoAP.

9.
Sensors (Basel) ; 16(8), 2016 Aug 02.
Article in English | MEDLINE | ID: mdl-27490554

ABSTRACT

Sensors and actuators are becoming important components of Internet of Things (IoT) applications. Today, several approaches exist to facilitate communication of sensors and actuators in IoT applications. Most communications go through often proprietary gateways, requiring availability of the gateway for each and every interaction between sensors and actuators. Sometimes, the gateway does some processing of the sensor data before triggering actuators. Other approaches put this processing logic further in the cloud. These approaches introduce significant latencies and an increased number of packets. In this paper, we introduce a CoAP-based mechanism for direct binding of sensors and actuators. This flexible binding solution is utilized further to build IoT applications through RESTlets. RESTlets are defined to accept inputs and produce outputs after performing some processing tasks. Sensors and actuators could be associated with RESTlets (which can be hosted on any device) through the flexible binding mechanism we introduced. This approach facilitates decentralized IoT application development by placing all or part of the processing logic in Low power and Lossy Networks (LLNs). We ran several tests to compare the performance of our solution with existing solutions and found that our solution reduces communication delay and the number of packets in the LLN.

10.
Cytometry A ; 87(7): 636-45, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25573116

ABSTRACT

The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers and therefore, relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.


Subject(s)
Algorithms , Computational Biology/methods , Flow Cytometry/methods , Biomarkers/analysis , Cluster Analysis , Graft vs Host Disease/diagnosis , Hematopoietic Stem Cell Transplantation , Humans , B-Cell Lymphoma/diagnosis , West Nile Fever/diagnosis
11.
Bioinformatics ; 29(10): 1308-16, 2013 May 15.
Article in English | MEDLINE | ID: mdl-23595663

ABSTRACT

MOTIVATION: When genomic data are associated with gene expression data, the resulting expression quantitative trait loci (eQTL) will likely span multiple genes. eQTL prioritization techniques can be used to select the most likely causal gene affecting the expression of a target gene from a list of candidates. As an input, these techniques use physical interaction networks that often contain highly connected genes and unreliable or irrelevant interactions that can interfere with the prioritization process. We present EPSILON, an extendable framework for eQTL prioritization, which mitigates the effect of highly connected genes and unreliable interactions by constructing a local network before a network-based similarity measure is applied to select the true causal gene. RESULTS: We tested the new method on three eQTL datasets derived from yeast data using three different association techniques. A physical interaction network was constructed, and each eQTL in each dataset was prioritized using the EPSILON approach: first, a local network was constructed using a k-trials shortest path algorithm, followed by the calculation of a network-based similarity measure. Three similarity measures were evaluated: random walks, the Laplacian Exponential Diffusion kernel and the Regularized Commute-Time kernel. The aim was to predict knockout interactions from a yeast knockout compendium. EPSILON outperformed two reference prioritization methods, random assignment and shortest path prioritization. Next, we found that using a local network significantly increased prioritization performance in terms of predicted knockout pairs when compared with using exactly the same network similarity measures on the global network, with an average increase in prioritization performance of 8 percentage points (P < 10^-5). AVAILABILITY: The physical interaction network and the source code (Matlab/C++) of our implementation can be downloaded from http://bioinformatics.intec.ugent.be/epsilon. CONTACT: lieven.verbeke@intec.ugent.be, kamar@psb.ugent.be, jan.fostier@intec.ugent.be SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Quantitative Trait Loci , Saccharomyces cerevisiae/genetics , Software , Algorithms , Gene Expression , Gene Knockout Techniques , Mutation
12.
Nucleic Acids Res ; 40(2): e11, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22102584

ABSTRACT

Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases, the development of software to perform accurate cross-species analyses becomes indispensable. However, many implementations that have the ability to compare multiple genomes exhibit unfavorable computational and memory requirements, limiting the number of genomes that can be analyzed in one run. Here, we present a software package to unveil genomic homology based on the identification of conservation of gene content and gene order (collinearity), i-ADHoRe 3.0, and its application to eukaryotic genomes. The use of efficient algorithms and support for parallel computing enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe can process the Ensembl data set, containing 49 species, in 1 h. Furthermore, the profile search is more sensitive to detect degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. From ultra-conserved collinear regions between mammals and birds, by integrating coexpression information and protein-protein interactions, we identified more than 400 regions in the human genome showing significant functional coherence. The different algorithmic improvements ensure that i-ADHoRe 3.0 will remain a powerful tool to study genome evolution.


Subject(s)
Genomics/methods , Software , Synteny , Algorithms , Animals , Gene Order , Genes , Human Genome , Humans , Sequence Alignment/methods
13.
Sensors (Basel) ; 14(6): 9833-77, 2014 Jun 04.
Article in English | MEDLINE | ID: mdl-24901978

ABSTRACT

Smart embedded objects will become an important part of what is called the Internet of Things. Applications often require concurrent interactions with several of these objects and their resources. Existing solutions have several limitations in terms of reliability, flexibility and manageability of such groups of objects. To overcome these limitations, we propose an intermediate level of intelligence to easily manipulate a group of resources across multiple smart objects, building upon the Constrained Application Protocol (CoAP). We describe the design of our solution to create and manipulate a group of CoAP resources using a single client request. Furthermore, we introduce the concept of profiles for the created groups. The use of profiles allows the client to specify in more detail how the group should behave. We have implemented our solution and demonstrate that it covers the complete group life-cycle, i.e., creation, validation, flexible usage and deletion. Finally, we quantitatively analyze the performance of our solution and compare it against multicast-based CoAP group communication. The results show that our solution improves reliability and flexibility with a trade-off in increased communication overhead.

14.
Sci Rep ; 13(1): 20560, 2023 Nov 23.
Article in English | MEDLINE | ID: mdl-37996612

ABSTRACT

To address the rising demand for high-speed wireless data links, communication systems operating at frequencies beyond [Formula: see text] are being targeted. A key enabling technology in the development of these wireless systems is the phased antenna array. Yet, the design and implementation of such steerable antenna arrays at frequencies over [Formula: see text] comes with a multitude of challenges. In particular, the cointegration of active electronics at each antenna element poses a major hurdle due to the inherent space constraints in the array. This article proposes a novel scalable concept for opto-electronic phased antenna arrays operating at 140 GHz. It details the system architecture of a transmitter that enables the implementation of large scale, wideband, 2D steerable phased antenna arrays and presents the design and measurement of a compact SiGe power amplifier (PA) chip to be used as one of its key building blocks. The amplifier achieves a gain of 20 dB at 135 GHz, features a [Formula: see text] of 14.6 dBm and can support data rates up to 45 Gbps in a limited footprint of only 540 µm × 550 µm. This makes it one of the fastest, most powerful D-band power amplifiers in literature with a footprint compatible with [Formula: see text]-spaced phased array integration.

15.
Bioinformatics ; 27(6): 749-56, 2011 Mar 15.
Article in English | MEDLINE | ID: mdl-21216775

ABSTRACT

MOTIVATION: Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such cases, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. RESULTS: Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozen eukaryotic genomes. AVAILABILITY: http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.


Subject(s)
Algorithms , Genomics/methods , Sequence Alignment/methods , Arabidopsis/genetics , Computational Biology/methods , Genome , Software
16.
Bioinformatics ; 27(11): 1587-8, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-21478195

ABSTRACT

SUMMARY: Network motifs in integrated molecular networks represent functional relationships between distinct data types. They aggregate to form dense topological structures corresponding to functional modules which cannot be detected by traditional graph clustering algorithms. We developed CyClus3D, a Cytoscape plugin for clustering composite three-node network motifs using a 3D spectral clustering algorithm. AVAILABILITY: Via the Cytoscape plugin manager or http://bioinformatics.psb.ugent.be/software/details/CyClus3D.


Subject(s)
Algorithms , Gene Regulatory Networks , Cluster Analysis , Biological Models , Protein Interaction Mapping , Signal Transduction , Software
17.
Opt Express ; 20(26): B630-1, 2012 Dec 10.
Article in English | MEDLINE | ID: mdl-23262912

ABSTRACT

We introduce the Optics Express special issue from the 38th European Conference on Optical Communication and Exhibition (ECOC). A total of 134 expanded papers from ECOC 2012 are included in this special issue.

18.
Opt Express ; 20(26): B52-63, 2012 Dec 10.
Article in English | MEDLINE | ID: mdl-23262897

ABSTRACT

The optical network unit (ONU), installed at a customer's premises, accounts for about 60% of power in current fiber-to-the-home (FTTH) networks. We propose a power consumption model for the ONU and evaluate the ONU power consumption in various next generation optical access (NGOA) architectures. Further, we study the impact of the power savings of the ONU in various low power modes such as power shedding, doze and sleep.
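The effect of low power modes on average ONU consumption can be sketched with a time-weighted average. This is only an illustration under stated assumptions: the per-mode power figures and time shares below are hypothetical, the abstract does not give the paper's actual model, and the sketch ignores wake-up and transition overhead.

```python
def average_power(mode_power_w, time_fraction):
    """Time-weighted average power (watts) over the listed operating modes."""
    if abs(sum(time_fraction.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(mode_power_w[mode] * share
               for mode, share in time_fraction.items())


# Hypothetical per-mode power draw in watts; NOT values from the paper.
MODE_POWER_W = {"active": 5.0, "doze": 2.0, "sleep": 0.5}

always_on = average_power(MODE_POWER_W, {"active": 1.0})
with_low_power = average_power(
    MODE_POWER_W, {"active": 0.5, "doze": 0.25, "sleep": 0.25}
)
saving = 1.0 - with_low_power / always_on

print(f"always on:      {always_on:.3f} W")
print(f"with low power: {with_low_power:.3f} W")
print(f"relative saving: {saving:.1%}")
```

A fuller model would add the energy spent switching between modes, which limits how much of the idle time can usefully be spent asleep.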

19.
Radiat Prot Dosimetry ; 183(3): 326-331, 2019 May 01.
Article in English | MEDLINE | ID: mdl-30085262

ABSTRACT

In this article, we study human electromagnetic exposure to the radiation of an ultra dense network of nodes integrated in a floor denoted as ATTO-cell floor, or ATTO-floor. ATTO-cells are a prospective 5G wireless networking technology, in which humans are exposed to several interfering sources. To numerically estimate this exposure, we propose a statistical approach based on a set of finite-difference time-domain simulations. It accounts for variations of antenna phases and makes use of a large number of exposure evaluations, based on a relatively low number of required simulations. The exposure was expressed in peak-spatial 10-g SAR average (psSAR10g). The results show an average exposure level of ~4.9 mW/kg, reaching 7.6 mW/kg in 5% of cases. The maximum psSAR10g value found in the studied numerical setup is around 21.2 mW/kg. The influence of the simulated ATTO-floor size on the resulting exposure was examined. All obtained exposure levels are far below the 4 W/kg ICNIRP basic restriction for the general public in limbs (and the 20 W/kg basic restriction for occupational exposure), which makes the ATTO-floor a potential low-exposure 5G candidate.


Subject(s)
Electromagnetic Radiation , Floors and Floorcoverings , Radiation Exposure/analysis , Robotics , Wireless Technology , Radiation Absorption , Computer Communication Networks , Humans , Statistical Models , Radiation Scattering
20.
Algorithms Mol Biol ; 11: 10, 2016.
Article in English | MEDLINE | ID: mdl-27148393

ABSTRACT

BACKGROUND: Third generation sequencing platforms produce longer reads with higher error rates than second generation technologies. While the improved read length can provide useful information for downstream analysis, underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads appear to be attractive to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data, and suffer from large runtimes. Recently, a new hybrid error correcting method has been proposed, where the second generation data is first assembled into a de Bruijn graph, on which the long reads are then aligned. RESULTS: In this context we present Jabba, a hybrid method to correct long third generation reads by mapping them on a corrected de Bruijn graph that was constructed from second generation data. Unique to our method is the use of a pseudo alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, certain theoretical results concerning the possibilities and limitations of the use of MEMs in the context of third generation reads are presented. CONCLUSION: Jabba produces highly reliable corrected reads: almost all corrected reads align to the reference, and these alignments have a very high identity. Many of the aligned reads are error-free. Additionally, Jabba corrects reads using a very low amount of CPU time. From this we conclude that pseudo alignment with MEMs is a fast and reliable method to map long highly erroneous sequences on a de Bruijn graph.
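The notion of a maximal exact match (MEM) used as a seed can be illustrated with a naive sketch. It is not Jabba's implementation: Jabba maps reads onto a de Bruijn graph, whereas this sketch compares a read against a plain linear reference string with a quadratic scan, and the sequences are made up.

```python
def maximal_exact_matches(read, ref, min_len):
    """Return (read_pos, ref_pos, length) for every exact match of at least
    min_len characters that cannot be extended to the left or to the right."""
    mems = []
    for i in range(len(read)):
        for j in range(len(ref)):
            if read[i] != ref[j]:
                continue
            # Left-extendable matches are part of a longer match found earlier.
            if i > 0 and j > 0 and read[i - 1] == ref[j - 1]:
                continue
            length = 0
            while (i + length < len(read) and j + length < len(ref)
                   and read[i + length] == ref[j + length]):
                length += 1
            if length >= min_len:
                mems.append((i, j, length))
    return mems


# Hypothetical reference and a read with one substitution at position 4.
reference = "ACGTACGGTCA"
noisy_read = "ACGTTCGGTCA"
print(maximal_exact_matches(noisy_read, reference, min_len=4))
```

The single error splits the read into two seeds, one on each side of the mismatch; a seed-and-extend method would then bridge the gap between them.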
