Search | Portal Regional da BVS

Correction: Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP.

Navarro-Torres, Agustín; Alastruey-Benedé, Jesús; Ibáñez-Marín, Pablo; Viñals-Yúfera, Víctor.

PLoS One ; 19(5): e0303712, 2024.

Article in English | MEDLINE | ID: mdl-38722938

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0220135.].

Compressed Sparse FM-Index: Fast Sequence Alignment Using Large K-Steps.

Langarita, Ruben; Armejach, Adria; Setoain, Javier; Ibanez-Marin, Pablo; Alastruey-Benede, Jesus; Moreto, Miquel.

IEEE/ACM Trans Comput Biol Bioinform ; 19(1): 355-368, 2022.

Article in English | MEDLINE | ID: mdl-32750858

ABSTRACT

The FM-index is a data structure used in genomics for exact search of input sequences over large reference genomes. Algorithms based on the FM-index show an irregular memory access pattern, resulting in a memory-bound problem. We analyze a recent implementation of the FM-index and highlight existing throughput-memory trade-offs, showing that memory requirements limit the implementation of large k-steps. We propose COFI, a COmpressed FM-Index for large K-steps. COFI enables a 15-step FM-index using less than 16 GB for a human genome reference of 3 giga base pairs. An algorithm based on this new layout is evaluated on both a Knights Landing (KNL) and a Skylake-based system (SKX). We achieve average speed-ups of 1.46× and 1.39×, respectively, with respect to a state-of-the-art FM-index implementation that is already well optimized.
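For readers unfamiliar with the data structure, the exact search mentioned above can be illustrated with a minimal, unoptimized sketch of the textbook 1-step backward search. This is not the COFI layout or the authors' implementation: the function names (`build_fm_index`, `occ`, `backward_search`, `find_all`) are hypothetical, and the occurrence count is computed naively instead of with sampled counters. The two `occ` lookups per pattern symbol land at data-dependent positions, which is the source of the irregular memory access pattern.

```python
from collections import Counter


def build_fm_index(text):
    """Build a naive FM-index: BWT string, C table, and full suffix array."""
    text += "$"  # sentinel; assumed absent from text and smaller than any symbol
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # symbol preceding each sorted suffix
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total  # number of symbols strictly smaller than ch
        total += counts[ch]
    return bwt, c_table, sa


def occ(bwt, ch, i):
    """Occurrences of ch in bwt[:i] (naive scan, for illustration only)."""
    return bwt[:i].count(ch)


def backward_search(bwt, c_table, pattern):
    """Return the half-open suffix-array interval [lo, hi) matching pattern."""
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):  # one step per symbol, right to left
        if ch not in c_table:
            return 0, 0
        lo = c_table[ch] + occ(bwt, ch, lo)
        hi = c_table[ch] + occ(bwt, ch, hi)
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_all(text, pattern):
    """Return the sorted start positions of pattern in text."""
    bwt, c_table, sa = build_fm_index(text)
    lo, hi = backward_search(bwt, c_table, pattern)
    return sorted(sa[lo:hi])
```

A k-step variant consumes k symbols per iteration, trading fewer lookups for larger tables; that trade-off is what the abstract refers to as the memory requirement of large k-steps.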

Subjects

Genomics , High-Throughput Nucleotide Sequencing , Algorithms , Genome, Human , Humans , Sequence Alignment , Sequence Analysis, DNA , Software

Accelerating Sequence Alignments Based on FM-Index Using the Intel KNL Processor.

Herruzo, Jose M; Gonzalez-Navarro, Sonia; Ibanez-Marin, Pablo; Vinals-Yufera, Victor; Alastruey-Benede, Jesus; Plata, Oscar.

IEEE/ACM Trans Comput Biol Bioinform ; 17(4): 1093-1104, 2020.

Article in English | MEDLINE | ID: mdl-30530369

ABSTRACT

The FM-index is a compact data structure suitable for fast matching of short reads to large reference genomes. The matching algorithm using this index exhibits irregular memory access patterns that cause frequent cache misses, resulting in a memory-bound problem. This paper analyzes different FM-index versions presented in the literature, focusing on the computing aspects related to data access. As a result of the analysis, we propose a new organization of the FM-index that minimizes the demand for memory bandwidth, allowing a great improvement in performance on processors with high-bandwidth memory, such as the second-generation Intel Xeon Phi (Knights Landing, or KNL), which integrates ultra-high-bandwidth stacked memory technology. As the roofline model shows, our implementation reaches 95 percent of the peak random access bandwidth limit when executed on the KNL and almost all of the available bandwidth when executed on other Intel Xeon architectures with conventional DDR memory. In addition, the throughput obtained on KNL is much higher than the results reported for GPUs in the literature.
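The roofline bound referred to above can be sketched in a few lines: attainable throughput is the minimum of the compute peak and the memory bandwidth multiplied by the operational intensity. The function names and every number in the usage note are hypothetical placeholders, not measurements from the paper, and the bandwidth argument is assumed to be whichever ceiling applies (here, a random-access bandwidth limit).

```python
def roofline_attainable(peak_ops_per_s, peak_bytes_per_s, ops_per_byte):
    """Upper bound on throughput under the roofline model.

    peak_ops_per_s:   compute ceiling (operations per second)
    peak_bytes_per_s: memory bandwidth ceiling (bytes per second)
    ops_per_byte:     operational intensity of the kernel
    """
    return min(peak_ops_per_s, peak_bytes_per_s * ops_per_byte)


def fraction_of_roof(measured_ops_per_s, peak_ops_per_s,
                     peak_bytes_per_s, ops_per_byte):
    """Measured throughput as a fraction of the roofline bound."""
    roof = roofline_attainable(peak_ops_per_s, peak_bytes_per_s, ops_per_byte)
    return measured_ops_per_s / roof
```

With placeholder values, `fraction_of_roof(190.0, 1000.0, 100.0, 2.0)` describes a kernel whose bound is set by bandwidth (100 × 2 = 200 operations per second) and which reaches 95 percent of that bound.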

Subjects

Genomics , Sequence Alignment , Algorithms , Computers , DNA/genetics , Genome, Human/genetics , Genomics/instrumentation , Genomics/methods , Humans , Sequence Alignment/instrumentation , Sequence Alignment/methods

Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP.

Navarro-Torres, Agustín; Alastruey-Benedé, Jesús; Ibáñez-Marín, Pablo; Viñals-Yúfera, Víctor.

PLoS One ; 14(8): e0220135, 2019.

Article in English | MEDLINE | ID: mdl-31369592

ABSTRACT

SPEC CPU is one of the most common benchmark suites used in computer architecture research. CPU2017 has recently been released to replace CPU2006. In this paper, we present a detailed evaluation of the memory hierarchy performance for both the CPU2006 and single-threaded CPU2017 benchmarks. The experiments were executed on an Intel Xeon Skylake-SP, which is the first Intel processor to implement a mostly non-inclusive last-level cache (LLC). We present a classification of the benchmarks according to their memory pressure and analyze the performance impact of different LLC sizes. We also test all the hardware prefetchers, showing that they improve performance in most of the benchmarks. After comprehensive experimentation, we can highlight the following conclusions: i) almost half of the SPEC CPU benchmarks have very low miss ratios in the second and third level caches, even with small LLC sizes and without hardware prefetching, ii) overall, the SPEC CPU2017 benchmarks demand even fewer memory hierarchy resources than the SPEC CPU2006 ones, iii) hardware prefetching is very effective in reducing LLC misses for most benchmarks, even with the smallest LLC size, and iv) from the memory hierarchy standpoint, the methodologies commonly used to select benchmarks or simulation points do not guarantee representative workloads.
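As a small illustration of the metrics involved, the sketch below computes a cache miss ratio and misses per kilo-instruction (MPKI) from raw counter values. MPKI is a common companion metric and is an assumption here, since the abstract only names miss ratios. The function names and the counter values in the usage note are hypothetical and do not come from the paper.

```python
def miss_ratio(misses, accesses):
    """Fraction of cache accesses that miss; 0.0 when there are no accesses."""
    return misses / accesses if accesses else 0.0


def mpki(misses, instructions):
    """Cache misses per thousand retired instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions
```

For example, with placeholder counters of 50 misses over 1,000 accesses and 100,000 instructions, `miss_ratio(50, 1000)` gives a 5 percent miss ratio and `mpki(50, 100_000)` gives 0.5 misses per kilo-instruction.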

Subjects

Algorithms , Benchmarking , Computer Systems/standards , Computers/standards , Software
