Search | Biblioteca Virtual en Salud Odontología (Virtual Health Library, Dentistry). Uruguay

Neural network design for energy-autonomous artificial intelligence applications using temporal encoding.

Mileiko, Sergey; Bunnam, Thanasin; Xia, Fei; Shafik, Rishad; Yakovlev, Alex; Das, Shidhartha.

Philos Trans A Math Phys Eng Sci ; 378(2164): 20190166, 2020 Feb 07.

Article in English | MEDLINE | ID: mdl-31865878

ABSTRACT

Neural networks (NNs) are steering a new generation of artificial intelligence (AI) applications at the micro-edge. Examples include wireless sensors, wearables and cybernetic systems that collect data and process them to support real-world decisions and controls. For energy autonomy, these applications are typically powered by energy harvesters. As harvesters and other power sources which provide energy autonomy inevitably have power variations, the circuits need to robustly operate over a dynamic power envelope. In other words, the NN hardware needs to be able to function correctly under unpredictable and variable supply voltages. In this paper, we propose a novel NN design approach using the principle of pulse width modulation (PWM). PWM signals represent information with their duty cycle values which may be made independent of the voltages and frequencies of the carrier signals. We design a PWM-based perceptron which can serve as the fundamental building block for NNs, by using an entirely new method of realizing arithmetic in the PWM domain. We analyse the proposed approach building from a 3 × 3 perceptron circuit to a complex multi-layer NN. Using handwritten character recognition as an exemplar of AI applications, we demonstrate the power elasticity, resilience and efficiency of the proposed NN design in the presence of functional and parametric variations including large voltage variations in the power supply. This article is part of the theme issue 'Harmonizing energy-autonomous computing and intelligence'.
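The claim that a PWM duty cycle can be made independent of carrier voltage and frequency can be illustrated with a minimal software sketch. This is not the perceptron circuit from the paper; the function names `pwm_samples` and `measured_duty` and the half-peak threshold are assumptions made only for illustration.

```python
def pwm_samples(duty, amplitude, samples_per_period, n_periods):
    """Return a sampled square wave with the requested duty cycle.

    amplitude stands in for the supply voltage and samples_per_period
    for the inverse of the carrier frequency (both hypothetical knobs).
    """
    high = round(duty * samples_per_period)
    one_period = [amplitude] * high + [0.0] * (samples_per_period - high)
    return one_period * n_periods


def measured_duty(samples):
    """Recover the duty cycle as the fraction of samples above half the peak."""
    peak = max(samples)
    if peak <= 0:
        return 0.0  # a flat-zero signal carries a duty cycle of 0
    threshold = peak / 2
    return sum(1 for s in samples if s > threshold) / len(samples)


# Same encoded value, very different "supply voltage" and "frequency".
nominal = pwm_samples(0.25, amplitude=3.3, samples_per_period=100, n_periods=5)
starved = pwm_samples(0.25, amplitude=0.8, samples_per_period=40, n_periods=3)
print(measured_duty(nominal), measured_duty(starved))
```

Both waveforms decode to the same duty cycle because the value is carried by a ratio of times rather than by an absolute voltage or period.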

Learning automata based energy-efficient AI hardware design for IoT applications.

Wheeldon, Adrian; Shafik, Rishad; Rahman, Tousif; Lei, Jie; Yakovlev, Alex; Granmo, Ole-Christoffer.

Philos Trans A Math Phys Eng Sci ; 378(2182): 20190593, 2020 Oct 16.

Article in English | MEDLINE | ID: mdl-32921236

ABSTRACT

Energy efficiency continues to be the core design challenge for artificial intelligence (AI) hardware designers. In this paper, we propose a new AI hardware architecture targeting Internet of Things applications. The architecture is founded on the principle of learning automata, defined using propositional logic. The logic-based underpinning enables low-energy footprints as well as high learning accuracy during training and inference, which are crucial requirements for efficient AI with long operating life. We present the first insights into this new architecture in the form of a custom-designed integrated circuit for pervasive applications. Fundamental to this circuit is systematic encoding of binarized input data fed into maximally parallel logic blocks. The allocation of these blocks is optimized through a design exploration and automation flow using field programmable gate array-based fast prototypes and software simulations. The design flow allows for an expedited hyperparameter search for meeting the conflicting requirements of energy frugality and high accuracy. Extensive validations on the hardware implementation of the new architecture using single- and multi-class machine learning datasets show potential for significantly lower energy than the existing AI hardware architectures. In addition, we demonstrate test accuracy and robustness matching the software implementation, outperforming other state-of-the-art machine learning algorithms. This article is part of the theme issue 'Advanced electromagnetic non-destructive evaluation and smart monitoring'.

Harmonizing energy-autonomous computing and intelligence: an editorial introduction.

Shafik, Rishad; Yakovlev, Alex.

Philos Trans A Math Phys Eng Sci ; 378(2164): 20190594, 2020 Feb 07.

Article in English | MEDLINE | ID: mdl-31865879

REDRESS: Generating Compressed Models for Edge Inference Using Tsetlin Machines.

Maheshwari, Sidharth; Rahman, Tousif; Shafik, Rishad; Yakovlev, Alex; Rafiev, Ashur; Jiao, Lei; Granmo, Ole-Christoffer.

IEEE Trans Pattern Anal Mach Intell ; 45(9): 11152-11168, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37074898

ABSTRACT

Inference at-the-edge using embedded machine learning models is associated with challenging trade-offs between resource metrics, such as energy and memory footprint, and the performance metrics, such as computation time and accuracy. In this work, we go beyond the conventional Neural Network based approaches to explore Tsetlin Machine (TM), an emerging machine learning algorithm, that uses learning automata to create propositional logic for classification. We use algorithm-hardware co-design to propose a novel methodology for training and inference of TM. The methodology, called REDRESS, comprises independent TM training and inference techniques to reduce the memory footprint of the resulting automata to target low and ultra-low power applications. The array of Tsetlin Automata (TA) holds learned information in the binary form as bits: {0,1}, called excludes and includes, respectively. REDRESS proposes a lossless TA compression method, called the include-encoding, that stores only the information associated with includes to achieve over 99% compression. This is enabled by a novel computationally minimal training procedure, called the Tsetlin Automata Re-profiling, to improve the accuracy and increase the sparsity of TA to reduce the number of includes, hence, the memory footprint. Finally, REDRESS includes an inherently bit-parallel inference algorithm that operates on the optimally trained TA in the compressed domain, that does not require decompression during runtime, to obtain high speedups when compared with the state-of-the-art Binary Neural Network (BNN) models. In this work, we demonstrate that using REDRESS approach, TM outperforms BNN models on all design metrics for five benchmark datasets viz. MNIST, CIFAR2, KWS6, Fashion-MNIST and Kuzushiji-MNIST. When implemented on an STM32F746G-DISCO microcontroller, REDRESS obtained speedups and energy savings ranging 5-5700× compared with different BNN models.
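The idea of storing only the includes, and evaluating a clause without decompressing, can be sketched as follows. The exact REDRESS encoding is not given in the abstract, so this sketch assumes a plain list of include positions per clause; `encode_includes`, `decode_includes` and `clause_output` are hypothetical names, and the bit-parallel aspect of the real inference algorithm is not modelled.

```python
def encode_includes(ta_bits):
    """Keep only the positions of the 1 bits (includes); excludes are implied."""
    return [i for i, bit in enumerate(ta_bits) if bit == 1]


def decode_includes(include_positions, n_literals):
    """Rebuild the full include/exclude bit array, showing the encoding is lossless."""
    bits = [0] * n_literals
    for i in include_positions:
        bits[i] = 1
    return bits


def clause_output(include_positions, literals):
    """Evaluate a conjunctive clause directly on the compressed form.

    Only included literals are inspected. An empty clause evaluates to 1
    here, which is an assumption of this sketch.
    """
    return int(all(literals[i] == 1 for i in include_positions))


ta_bits = [0, 1, 0, 0, 0, 1, 0, 0]      # sparse: two includes out of eight
compressed = encode_includes(ta_bits)    # positions of the includes
literals = [0, 1, 1, 0, 1, 1, 0, 0]      # input features and their negations
print(compressed, clause_output(compressed, literals))
```

The sparser the trained automata, the shorter the position list, which is why reducing the number of includes during training reduces the memory footprint.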

An FPGA Based Energy-Efficient Read Mapper With Parallel Filtering and In-Situ Verification.

Gudur, Venkateshwarlu Yellaswamy; Maheshwari, Sidharth; Acharyya, Amit; Shafik, Rishad.

IEEE/ACM Trans Comput Biol Bioinform ; 19(5): 2697-2711, 2022.

Article in English | MEDLINE | ID: mdl-34415836

ABSTRACT

In the assembly pipeline of Whole Genome Sequencing (WGS), read mapping is a widely used method to re-assemble the genome. It employs approximate string matching and dynamic programming-based algorithms on a large volume of data and associated structures, making it a computationally intensive process. Currently, the state-of-the-art data centers for genome sequencing incur substantial setup and energy costs for maintaining hardware, data storage and cooling systems. To enable low-cost genomics, we propose an energy-efficient architectural methodology for read mapping using a single system-on-chip (SoC) platform. The proposed methodology is based on the q-gram lemma and designed using a novel architecture for filtering and verification. The filtering algorithm is designed using a parallel sorted q-gram lemma based method for the first time, and it is complemented by an in-situ verification routine using parallel Myers bit-vector algorithm. We have implemented our design on the Zynq Ultrascale+ XCZU9EG MPSoC platform. It is then extensively validated using real genomic data to demonstrate up to 7.8× energy reduction and up to 13.3× less resource utilization when compared with the state-of-the-art software and hardware approaches.
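The q-gram lemma that underlies the filtering stage can be sketched in a few lines. The lemma states that two strings within edit distance k share at least max(|A|, |B|) − q + 1 − k·q q-grams, so any candidate window sharing fewer can be discarded before the costly verification step. This is a sequential sketch of the lemma only, not the parallel sorted method of the paper; all function names are hypothetical.

```python
from collections import Counter


def qgrams(s, q):
    """Multiset of all overlapping substrings of length q."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_qgrams(a, b, q):
    """Number of q-grams common to both strings, counted with multiplicity."""
    return sum((qgrams(a, q) & qgrams(b, q)).values())


def qgram_threshold(length, q, k):
    """Minimum shared q-grams for strings within edit distance k."""
    return length - q + 1 - k * q


def passes_filter(read, window, q, k):
    """True if the window may match the read with at most k edits.

    A True result still needs verification; a False result is a safe reject.
    If the threshold is zero or negative, every window passes.
    """
    t = qgram_threshold(max(len(read), len(window)), q, k)
    return shared_qgrams(read, window, q) >= t


read = "ACGTACGTAC"
print(passes_filter(read, "ACGTTCGTAC", q=3, k=1))  # one substitution away
print(passes_filter(read, "TTTTTTTTTT", q=3, k=1))  # unrelated window
```

Larger q gives more selective q-grams but lowers the threshold faster as k grows, so q and k have to be chosen together.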

Subject(s)

Algorithms, Software, Genome, Genomics, Sequence Analysis, DNA/methods

Hardware-Algorithm Codesign for Fast and Energy Efficient Approximate String Matching on FPGA for Computational Biology.

Gudur, Venkateshwarlu Yellaswamy; Maheshwari, Sidharth; Bhardwaj, Swati; Acharyya, Amit; Shafik, Rishad.

Annu Int Conf IEEE Eng Med Biol Soc ; 2022: 87-90, 2022 Jul.

Article in English | MEDLINE | ID: mdl-36086088

ABSTRACT

Myers bit-vector algorithm for approximate string matching (ASM) is a dynamic programming based approach that takes advantage of bit-parallel operations. It is one of the fastest algorithms to find the edit distance between two strings. In computational biology, ASM is used at various stages of the computational pipeline, including proteomics and genomics. The computationally intensive nature of the underlying algorithms for ASM operating on the large volume of data necessitates the acceleration of these algorithms. In this paper, we propose a novel ASM architecture based on Myers bit-vector algorithm for parallel searching of multiple query patterns in the biological databases. The proposed parallel architecture uses multiple processing engines and hardware/software codesign for an accelerated and energy-efficient design of ASM algorithm on hardware. In comparison with related literature, the proposed design achieves 22× better performance with a demonstrative energy efficiency of ~500 × 10^9 cell updates per joule.
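As a rough software illustration of the bit-parallel idea, the sketch below computes the global edit distance between one pattern and one text, encoding a whole column of the dynamic programming matrix as vertical +1/−1 delta bit-vectors. It follows the commonly used formulation of Myers' recurrence and relies on Python's unbounded integers with explicit masking; it does not model the multi-pattern, multi-engine hardware design of the paper, and the function name is hypothetical.

```python
def myers_edit_distance(pattern, text):
    """Global (Levenshtein) edit distance using Myers' bit-vector recurrence."""
    m = len(pattern)
    if m == 0:
        return len(text)

    # peq[c] has bit i set when pattern[i] == c.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1   # keep only the m low bits (Python ints are unbounded)
    last = 1 << (m - 1)   # bit for the last pattern row
    pv, mv = mask, 0      # vertical +1 / -1 deltas of the current column
    score = m             # value in the last row of column 0

    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask   # horizontal +1 deltas
        mh = pv & xh                    # horizontal -1 deltas
        if ph & last:
            score += 1
        elif mh & last:
            score -= 1
        # Shift in a 1: the top row of the matrix grows by one per column.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
    return score


print(myers_edit_distance("kitten", "sitting"))
```

Each text character costs a constant number of word operations regardless of pattern length (up to the word size), which is what makes the recurrence attractive for hardware processing engines.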

Subject(s)

Computational Biology, Conservation of Energy Resources, Algorithms, Computers, Software

CORAL: Verification-Aware OpenCL Based Read Mapper for Heterogeneous Systems.

Maheshwari, Sidharth; Gudur, Venkateshwarlu Y; Shafik, Rishad; Wilson, Ian; Yakovlev, Alex; Acharyya, Amit.

IEEE/ACM Trans Comput Biol Bioinform ; 18(4): 1426-1438, 2021.

Article in English | MEDLINE | ID: mdl-31562102

ABSTRACT

Genomics has the potential to transform medicine from reactive to a personalized, predictive, preventive, and participatory (P4) form. Being a Big Data application with a continuously increasing rate of data production, the computational costs of genomics have become a daunting challenge. Most modern computing systems are heterogeneous, consisting of various combinations of computing resources, such as CPUs, GPUs, and FPGAs. They require platform-specific software and languages to program, making their simultaneous operation challenging. Existing read mappers and analysis tools in the whole genome sequencing (WGS) pipeline do not scale for such heterogeneity. Additionally, the computational cost of mapping reads is high due to expensive dynamic programming based verification, where optimized implementations are already available. Thus, improvement in filtration techniques is needed to reduce verification overhead. To address the aforementioned limitations with regard to the mapping element of the WGS pipeline, we propose a Cross-platfOrm Read mApper using opencL (CORAL). CORAL is capable of executing on heterogeneous devices/platforms simultaneously. It can reduce computational time by suitably distributing the workload without any additional programming effort. We showcase this on a quadcore Intel CPU along with two Nvidia GTX 590 GPUs, distributing the workload judiciously to achieve up to 2× speedup compared with using the CPUs alone. To reduce the verification overhead, CORAL dynamically adapts k-mer length during filtration. We demonstrate competitive timings in comparison with other mappers using real and simulated reads. CORAL is available at: https://github.com/nclaes/CORAL.

Subject(s)

Chromosome Mapping/methods, Genomics/methods, Whole Genome Sequencing/methods, Algorithms, Humans, Sequence Alignment
