1.
ACS Appl Mater Interfaces ; 15(47): 54602-54610, 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-37962420

ABSTRACT

The single-port ferroelectric FET (FeFET) performs write and read operations on the same electrical gate, which prevents its wide application in tunable analog electronics and causes read disturb, especially in the high-threshold voltage (VTH) state, as the retention energy barrier is reduced by the applied read bias. To address both issues, we propose to adopt a read disturb-free dual-port FeFET where the write is performed on the gate featuring a ferroelectric layer and the read is done on a separate gate featuring a nonferroelectric dielectric. Combining the unique structure and the separate read gate, read disturb is eliminated as the applied field is aligned with polarization in the high-VTH state, thus improving its stability, while it is screened by the channel inversion charge and exerts no negative impact on the low-VTH state stability. Comprehensive theoretical and experimental validation has been performed on fully depleted silicon-on-insulator (FDSOI) FeFETs integrated on a 22 nm platform, which intrinsically has dual ports with its buried oxide layer acting as the nonferroelectric dielectric. Novel applications that can exploit the proposed dual-port FeFET are proposed and experimentally demonstrated for the first time, including FPGA that harnesses its read disturb-free feature and tunable analog electronics (e.g., frequency tunable ring oscillator in this work) leveraging the separated write and read paths.

2.
Micromachines (Basel) ; 14(10)2023 Oct 07.
Article in English | MEDLINE | ID: mdl-37893350

ABSTRACT

The paper proposes two architectures for a dynamically scalable network-on-chip (NoC) for dynamically reconfigurable intellectual properties (IPs) to save power. The first architecture is a run-time scalable column-based NoC, where the columns of the NoC are scaled up and down at run-time depending on the demands to connect reconfigurable IPs. The second architecture is an extension of the first, where both the rows and columns of the NoC are dynamically scaled up and down on demand. A robust control manager is developed to control the IP and sub-NoC reconfigurations by optimizing the reconfiguration costs. The proposed architectures have been implemented and tested in actual prototypes on a Virtex 6 FPGA mounted on the ML605 board. The results show that dynamically scalable architectures are capable of significant power reduction as compared to traditional static architectures for the same size of the NoC. It is anticipated that the scalable NoC can be very useful for sharing the FPGA resources among IPs at runtime.

3.
Entropy (Basel) ; 24(9)2022 Aug 23.
Article in English | MEDLINE | ID: mdl-36141063

ABSTRACT

Entropy is one of the most fundamental notions for understanding complexity. Among the methods for calculating entropy, sample entropy (SampEn) is a practical and common way to estimate time-series complexity. Unfortunately, SampEn is time-consuming: its running time grows quadratically with the number of elements, which makes it unviable for large data series. In this work, we evaluate hardware SampEn architectures to offload the computational load, using improved SampEn algorithms and exploiting field-programmable gate arrays (FPGAs), a reconfigurable technology well known for its high performance and power efficiency. In addition to the basic straightforward SampEn (SF) calculation method, this study evaluates optimized strategies, such as bucket-assist (BA) SampEn and lightweight SampEn based on BubbleSort (BS-LW) and MergeSort (MS-LW), on an embedded CPU, a high-performance CPU, and an FPGA, using simulated data and real-world electrocardiograms (ECG) as input data. The irregular storage space and memory access of the enhanced algorithms are also studied and estimated in this work. These fast SampEn algorithms are evaluated and profiled using metrics such as execution time, resource use, power, and energy consumption as a function of input data length. Finally, although the FPGA implementation of fast SampEn is not significantly faster than versions running on a high-performance CPU, FPGA implementations consume one or two orders of magnitude less energy than a high-performance CPU.
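
The quadratic cost mentioned in this abstract is easiest to see in a direct implementation. The following is a minimal sketch of straightforward sample entropy, assuming the usual definition (template length m, absolute tolerance r, Chebyshev distance, self-matches excluded). The function name and the default values of m and r are illustrative assumptions and are not taken from the paper.

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Straightforward O(N^2) sample entropy.

    m is the template length and r the absolute tolerance; both are
    illustrative defaults, not values taken from the paper.
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")

    def count_matches(length):
        # Use the same n - m templates for both lengths, as in the
        # usual SampEn definition; self-matches are excluded (j > i).
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= r
                       for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # template pairs matching over m points
    a = count_matches(m + 1)  # template pairs matching over m + 1 points
    if a == 0 or b == 0:
        return float("inf")   # undefined; reported as infinity here
    return -math.log(a / b)
```

The nested loop over template pairs is the part that grows quadratically with the series length, which is the cost the optimized variants named in the abstract aim to reduce.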

4.
Front Neurosci ; 15: 728460, 2021.
Article in English | MEDLINE | ID: mdl-35126034

ABSTRACT

This article employs the new IBM INC-3000 prototype FPGA-based neural supercomputer to implement a widely used model of the cortical microcircuit. With approximately 80,000 neurons and 300 million synapses, this model has become a benchmark network for comparing simulation architectures with regard to performance. To the best of our knowledge, the achieved speed-up factor is 2.4 times larger than the highest speed-up factor reported in the literature and four times larger than biological real time, demonstrating the potential of FPGA systems for neural modeling. The work was performed at Jülich Research Centre in Germany and the INC-3000 was built at the IBM Almaden Research Center in San Jose, CA, United States. For the simulation of the microcircuit, only the programmable logic part of the FPGA nodes is used. All arithmetic is implemented in single-precision floating point. The original microcircuit network with linear LIF neurons and current-based exponential-decay-, alpha-function- as well as beta-function-shaped synapses was simulated using exact exponential integration as the ODE solver method. To demonstrate the flexibility of the approach, networks with non-linear neuron models (AdEx, Izhikevich) and conductance-based synapses were additionally simulated, applying Runge-Kutta and Parker-Sochacki solver methods. In all cases, the simulation-time speed-up factor did not decrease by more than a few percent. Finally, it turns out that the speed-up factor is essentially limited by the latency of the INC-3000 communication system.
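
To illustrate what "exact exponential integration" means for a linear LIF neuron with an exponentially decaying synaptic current, here is a minimal sketch of one subthreshold update step. It assumes the standard model dV/dt = -(V - E_L)/tau_m + I/C_m and dI/dt = -I/tau_s. The function name and every parameter default are hypothetical illustration values, not those of the paper, and spike threshold and reset are left out.

```python
import math

def lif_exact_step(v, i_syn, h, tau_m=10.0, tau_s=0.5, c_m=250.0, e_l=-65.0):
    """Advance membrane potential v (mV) and synaptic current i_syn (pA) by h ms.

    Parameter defaults are illustrative assumptions (ms, pF, mV).
    Spike threshold and reset are deliberately left out.
    """
    if tau_m == tau_s:
        raise ValueError("this closed form assumes tau_m != tau_s")
    decay_m = math.exp(-h / tau_m)
    decay_s = math.exp(-h / tau_s)
    # Coupling term: contribution of the decaying current to the voltage.
    p21 = (tau_m * tau_s / (tau_s - tau_m)) * (decay_s - decay_m) / c_m
    v_next = e_l + (v - e_l) * decay_m + i_syn * p21
    i_next = i_syn * decay_s
    return v_next, i_next
```

Because the update is the closed-form solution of the linear system, one step of size h gives the same result as two steps of size h/2, up to rounding. That property does not hold for approximate solvers such as Runge-Kutta.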

5.
Sensors (Basel) ; 20(13)2020 Jul 05.
Article in English | MEDLINE | ID: mdl-32635604

ABSTRACT

Visible light communications are considered a promising solution for inter-vehicle communications, which in turn can significantly enhance traffic safety and efficiency. However, the vehicular visible light communications (VLC) channel is highly dynamic, very unpredictable, and subject to many noise sources. Enhancing VLC systems with self-aware capabilities would maximize communication performance and efficiency, whatever the environmental conditions. Within this context, this letter proposes a novel signal-to-noise ratio (SNR)-adaptive visible light communication receiver architecture aimed at automotive applications. The novelty of this letter comes from an open-loop signal processing technique in which the signal treatment complexity is established based on a real-time SNR analysis. The receiver evaluates the SNR and, based on this assessment, reconfigures its structural design to ensure proper signal treatment while providing an optimal tradeoff between communication performance and computational resource usage. This approach, based on software reconfiguration, has the potential to provide the system with enhanced flexibility and enables its usage in resource-sharing applications. As far as we know, this approach has not been considered in vehicular VLC systems. The performance of the proposed architecture is demonstrated by simulations, which confirm the SNR-adaptive capacity and the optimized performance.

6.
J Supercomput ; 70(1): 284-300, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25309040

ABSTRACT

Compared to Beowulf clusters and shared-memory machines, GPUs and FPGAs are emerging alternative architectures that provide massive parallelism and great computational capabilities. These architectures can be utilized to run compute-intensive algorithms to analyze ever-enlarging datasets and provide scalability. In this paper, we present four implementations of the K-means data clustering algorithm for different high-performance computing platforms. These four implementations include a CUDA implementation for GPUs, a Mitrion C implementation for FPGAs, an MPI implementation for Beowulf compute clusters, and an OpenMP implementation for shared-memory machines. Comparative analyses of the cost of each platform, the difficulty of programming each platform, and the performance of each implementation are presented.
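
For reference, the algorithm being ported to the four platforms can be written serially in a few lines. The sketch below is a plain Lloyd-style K-means, assuming points given as tuples and caller-supplied initial centroids. It is not any of the paper's four implementations, and the function name and signature are illustrative.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on tuples of floats; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: mean of the members; empty clusters keep their centroid.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels
```

The assignment step is independent for each point, which is why the algorithm is a natural candidate for data-parallel platforms such as those compared in the abstract.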

7.
Article in English | MEDLINE | ID: mdl-26594666

ABSTRACT

The 3D FFT is critical in many physical simulations and image processing applications. On FPGAs, however, the 3D FFT was thought to be inefficient relative to other methods such as convolution-based implementations of multi-grid. We find the opposite: a simple design, operating at a conservative frequency, takes 4µs for 16³, 21µs for 32³, and 215µs for 64³ single precision data points. The first two of these compare favorably with the 25µs and 29µs obtained running on a current Nvidia GPU. Some broader significance is that this is a critical piece in implementing a large scale FPGA-based MD engine: even a single FPGA is capable of keeping the FFT off of the critical path for a large fraction of possible MD simulations.
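
The abstract does not describe the design itself. As background, a 3D FFT is commonly computed as three passes of 1D FFTs, one along each axis, and the sketch below shows that decomposition in plain Python. It is an assumption about the general technique, not the authors' FPGA architecture. The helper names `fft1d` and `fft3d` are illustrative, and sizes must be powers of two.

```python
import cmath

def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even, odd = fft1d(x[0::2]), fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def fft3d(vol):
    """3D FFT of vol[z][y][x] as three passes of 1D FFTs, one per axis."""
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    # Pass 1: along x (builds a new nested list, input is not modified).
    a = [[fft1d(vol[z][y]) for y in range(ny)] for z in range(nz)]
    # Pass 2: along y.
    for z in range(nz):
        for x in range(nx):
            col = fft1d([a[z][y][x] for y in range(ny)])
            for y in range(ny):
                a[z][y][x] = col[y]
    # Pass 3: along z.
    for y in range(ny):
        for x in range(nx):
            col = fft1d([a[z][y][x] for z in range(nz)])
            for z in range(nz):
                a[z][y][x] = col[z]
    return a
```

For a 16³ volume this amounts to 3 × 256 independent 16-point transforms, and the passes along y and z access memory with a stride, which is usually the part that needs care in a hardware design.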

8.
Sensors (Basel) ; 12(5): 6244-68, 2012.
Article in English | MEDLINE | ID: mdl-22778640

ABSTRACT

This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit to reduce area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented on a field-programmable gate array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design that attains both high speed and low area costs.
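
The weight update that such an architecture has to compute is compact. The sketch below shows one GHA step in its commonly stated form (Sanger's rule), where weight vector j is updated with the input minus its reconstruction from components 0 to j. The function name, the data layout and the learning rate are illustrative assumptions, not details from the paper.

```python
def gha_update(weights, x, lr=0.01):
    """One Generalized Hebbian Algorithm (Sanger's rule) step.

    weights: list of m weight vectors (each of length n); x: input of length n.
    Returns new weights; lr is an illustrative learning rate.
    """
    # Outputs y_j = w_j . x for every component.
    y = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    new_weights = []
    for j, w in enumerate(weights):
        # Reconstruction from components 0..j (the lower-triangular term).
        recon = [sum(y[k] * weights[k][i] for k in range(j + 1))
                 for i in range(len(x))]
        new_weights.append([w_i + lr * y[j] * (x_i - r_i)
                            for w_i, x_i, r_i in zip(w, x, recon)])
    return new_weights
```

Each weight vector goes through the same multiply-accumulate pattern with a different reconstruction depth, which is consistent with the abstract's point that the vectors can share a single circuit.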

9.
Sensors (Basel) ; 11(7): 6697-718, 2011.
Article in English | MEDLINE | ID: mdl-22163980

ABSTRACT

This paper presents a novel VLSI architecture for image segmentation. The architecture is based on the fuzzy c-means algorithm with a spatial constraint for reducing the misclassification rate. In the architecture, the usual iterative operations for updating the membership matrix and cluster centroids are merged into one single updating process to avoid the large storage requirement. In addition, an efficient pipelined circuit is used for the updating process to accelerate the computation. Experimental results show that the proposed circuit is an effective alternative for real-time image segmentation with low area cost and low misclassification rate.


Subjects
Computer Systems; Fuzzy Logic; Image Processing, Computer-Assisted; Algorithms
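
For context on what the architecture in entry 9 merges, the sketch below shows one conventional fuzzy c-means iteration with its two separate steps: the membership update followed by the centroid update. It is an assumption-laden illustration, not the paper's method. It works on scalar pixel values, omits the spatial constraint, and stores the full membership matrix that the merged hardware process is designed to avoid. The function name and the fuzzifier default are illustrative.

```python
def fcm_step(pixels, centroids, m=2.0):
    """One conventional fuzzy c-means iteration on scalar pixel values.

    Returns (memberships, new_centroids); memberships[k][i] is the degree
    to which pixel k belongs to cluster i. m > 1 is the fuzzifier.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in pixels:
        dists = [abs(p - c) for c in centroids]
        if 0.0 in dists:
            # Pixel coincides with a centroid: share membership among the
            # coinciding centroids only.
            zeros = dists.count(0.0)
            row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            inv = [d ** -power for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    new_centroids = []
    for i in range(len(centroids)):
        # Centroid = membership^m weighted mean of the pixel values.
        # Assumes at least one pixel has nonzero membership in cluster i.
        weights = [row[i] ** m for row in memberships]
        new_centroids.append(
            sum(w * p for w, p in zip(weights, pixels)) / sum(weights))
    return memberships, new_centroids
```

The `memberships` list has one row per pixel, so its size scales with the image. That storage is the cost the abstract refers to when it describes merging the two updates.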
10.
Sensors (Basel) ; 11(10): 9160-81, 2011.
Article in English | MEDLINE | ID: mdl-22163688

ABSTRACT

This paper presents a novel phase unwrapping architecture for accelerating digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is pipelined to maximize computational throughput. Moreover, the number of hardware multipliers and dividers is minimized to reduce hardware costs. The proposed architecture is used as custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture speeds up the computation while consuming few hardware resources, making it suitable for designing an embedded DHM system.


Subjects
Algorithms; Holography/methods; Image Processing, Computer-Assisted/methods; Microscopy/methods; Computers; Fourier Analysis; Signal Processing, Computer-Assisted; Software; Time Factors