SIMD Processors: Latest Research Papers

Transaction Accelerator for Blockchain Networks Based on Cryptonight Algorithm using Specialized Multicore Processor MALT

PROGRAMMNAYA INGENERIA ◽

10.17587/prin.12.295-301 ◽

2021 ◽

Vol 12 (6) ◽

pp. 295-301

Author(s):

A. A. Titova ◽

V. A. Roganov ◽

G. A. Lukyanchenko ◽

S. G. Elizarov ◽

...

Keyword(s):

Energy Efficiency ◽

Parallel Computing ◽

Data Clustering ◽

Memory Usage ◽

Algorithm Optimization ◽

Multicore Processor ◽

Local Memory ◽

SIMD Processors ◽

Clustering Data ◽

Data Prefetch

Cryptonight is one of the possible base algorithms for cryptocurrencies. It belongs to the group of memory-bound algorithms, designed to prevent mining on specialized processors and ASICs by using 2 MB of memory for each hash. It is therefore not easy to adapt for parallel computing. The aim of this work is to show, theoretically and experimentally, that the algorithm can still be optimized for a specialized multicore processor so that mining is more energy-efficient than on a CPU. This article describes the optimization process, which used the following methods: data clustering, storage of repeatedly used data in local memory, use of SIMD for parallel computing, and data prefetch. These methods are first explained, their expected effectiveness is analyzed, and they are then implemented. As a result, two optimization schemes were created. The first is based on MALT's slave cores, which compute hashes independently. Although memory-boundness creates multiple problems, we were able to increase efficiency by clustering data. The second scheme is more complicated: it uses SIMD processors for most cryptographic computations and also involves data prefetch, which becomes possible if more than one hash is calculated on one core at the same time. All results are presented in the paper, and they indicate that it is indeed possible to optimize Cryptonight for the specialized multicore processor MALT. The practical results show a fivefold increase in energy efficiency compared with a CPU.
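The point that prefetch becomes possible once several hashes are computed on one core can be illustrated with a small sketch. The code below is not Cryptonight and not the authors' MALT implementation: the mixing function, the scratchpad size, and all names (`make_scratchpad`, `step`, `hash_interleaved`) are hypothetical and chosen only for illustration. It shows that independent, data-dependent memory walks can be advanced round-robin without changing their results, which is the property that lets hardware fetch one hash's next address while the others are being computed. Python cannot issue prefetch instructions, so that part is only marked in a comment.

```python
MASK = 0xFFFFFFFF  # keep all values in 32 bits


def make_scratchpad(seed, size):
    """Fill a toy scratchpad with pseudo-random words (simple LCG, illustration only)."""
    pad, x = [], seed & MASK
    for _ in range(size):
        x = (x * 1664525 + 1013904223) & MASK
        pad.append(x)
    return pad


def step(pad, state):
    """One data-dependent read-modify-write: the address is derived from the state."""
    idx = state % len(pad)
    value = pad[idx]
    new_state = ((state ^ value) * 2654435761 + 1) & MASK
    pad[idx] = new_state  # write back, so later reads depend on earlier steps
    return new_state


def hash_sequential(seed, size=1024, rounds=5000):
    """Compute one toy hash from start to finish."""
    pad = make_scratchpad(seed, size)
    state = seed & MASK
    for _ in range(rounds):
        state = step(pad, state)
    return state


def hash_interleaved(seeds, size=1024, rounds=5000):
    """Advance several independent toy hashes round-robin on one core."""
    pads = [make_scratchpad(s, size) for s in seeds]
    states = [s & MASK for s in seeds]
    for _ in range(rounds):
        for lane in range(len(seeds)):
            # On real hardware, the next address of this lane is known as soon as
            # its state is updated, so it could be prefetched while the other
            # lanes are being processed. Python offers no such instruction.
            states[lane] = step(pads[lane], states[lane])
    return states
```

Because each lane owns its scratchpad, `hash_interleaved([1, 2, 3])` returns the same values as three separate calls to `hash_sequential`; only the order of memory accesses changes.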

Parallelized and Vectorized Tracking Using Kalman Filters with CMS Detector Geometry and Events

EPJ Web of Conferences ◽

10.1051/epjconf/201921402002 ◽

2019 ◽

Vol 214 ◽

pp. 02002 ◽

Cited By ~ 1

Author(s):

Giuseppe Cerati ◽

Peter Elmer ◽

Brian Gravelle ◽

Matti Kortelainen ◽

Vyacheslav Krutelyov ◽

...

Keyword(s):

Kalman Filter ◽

High Performance ◽

Hadron Collider ◽

Track Reconstruction ◽

Computational Performance ◽

Detector Geometry ◽

SIMD Processors ◽

High Performance Systems ◽

Physics Performance ◽

Many Core

The High-Luminosity Large Hadron Collider at CERN will be characterized by greater pileup of events and higher occupancy, making the track reconstruction even more computationally demanding. Existing algorithms at the LHC are based on Kalman filter techniques with proven excellent physics performance under a variety of conditions. Starting in 2014, we have been developing Kalman-filter-based methods for track finding and fitting adapted for many-core SIMD processors that are becoming dominant in high-performance systems. This paper summarizes the latest extensions to our software that allow it to run on the realistic CMS-2017 tracker geometry using CMSSW-generated events, including pileup. The reconstructed tracks can be validated against either the CMSSW simulation that generated the detector hits, or the CMSSW reconstruction of the tracks. In general, the code’s computational performance has continued to improve while the above capabilities were being added. We demonstrate that the present Kalman filter implementation is able to reconstruct events with comparable physics performance to CMSSW, while providing generally better computational performance. Further plans for advancing the software are discussed.
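As a rough illustration of why Kalman-filter tracking suits SIMD hardware, the sketch below applies the same predict and update arithmetic to many tracks in lockstep. It is a deliberately simplified assumption, not the authors' software: the state is a single scalar with a constant model, whereas real track fitting uses multi-dimensional states and matrix operations. The function names and the noise parameters `q` and `r` are hypothetical.

```python
def kalman_step(x, p, z, q=0.01, r=1.0):
    """One predict + update step of a scalar Kalman filter for a single track.

    x: state estimate, p: its variance, z: new measurement,
    q: process noise variance, r: measurement noise variance.
    """
    p_pred = p + q                # predict: state unchanged, uncertainty grows
    k = p_pred / (p_pred + r)     # Kalman gain
    x_new = x + k * (z - x)       # pull the estimate toward the measurement
    p_new = (1.0 - k) * p_pred    # uncertainty shrinks after the update
    return x_new, p_new


def kalman_step_batch(xs, ps, zs, q=0.01, r=1.0):
    """The same step for many tracks at once, one list element per 'lane'.

    Each line performs one arithmetic operation across all tracks, which is
    the access pattern a SIMD unit executes with a single vector instruction.
    """
    p_pred = [p + q for p in ps]
    gains = [pp / (pp + r) for pp in p_pred]
    xs_new = [x + k * (z - x) for x, k, z in zip(xs, gains, zs)]
    ps_new = [(1.0 - k) * pp for k, pp in zip(gains, p_pred)]
    return xs_new, ps_new
```

Storing each quantity in its own array (all `x` values together, all `p` values together) rather than one record per track is what keeps the lanes contiguous in memory; the lists here stand in for that layout.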

Efficient Strict-Binning Particle-in-Cell Algorithm for Multi-core SIMD Processors

Euro-Par 2018: Parallel Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-319-96983-1_53 ◽

2018 ◽

pp. 749-763 ◽

Cited By ~ 1

Author(s):

Yann Barsamian ◽

Arthur Charguéraud ◽

Sever A. Hirstoaga ◽

Michel Mehrenberger

Keyword(s):

Particle In Cell ◽

SIMD Processors

Design of Parallel BEM Analyses Framework for SIMD Processors

Lecture Notes in Computer Science - Computational Science – ICCS 2018 ◽

10.1007/978-3-319-93698-7_46 ◽

2018 ◽

pp. 601-613

Author(s):

Tetsuya Hoshino ◽

Akihiro Ida ◽

Toshihiro Hanawa ◽

Kengo Nakajima

Keyword(s):

SIMD Processors

Multiple Precision Floating-Point Arithmetic on SIMD Processors

2017 IEEE 24th Symposium on Computer Arithmetic (ARITH) ◽

10.1109/arith.2017.12 ◽

2017 ◽

Author(s):

Joris Van Der Hoeven

Keyword(s):

Floating Point ◽

Floating Point Arithmetic ◽

Multiple Precision ◽

SIMD Processors ◽

Point Arithmetic

Efficient Emulation of Floating-Point Arithmetic on Fixed-Point SIMD Processors

2016 IEEE International Workshop on Signal Processing Systems (SiPS) ◽

10.1109/sips.2016.52 ◽

2016 ◽

Cited By ~ 2

Author(s):

Lukas Gerlach ◽

Guillermo Paya-Vaya ◽

Holger Blume

Keyword(s):

Fixed Point ◽

Floating Point ◽

Floating Point Arithmetic ◽

SIMD Processors ◽

Point Arithmetic

Power optimizations for transport triggered SIMD processors

2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) ◽

10.1109/samos.2015.7363689 ◽

2015 ◽

Cited By ~ 2

Author(s):

Joonas Multanen ◽

Timo Viitanen ◽

Henry Linjamaki ◽

Heikki Kultala ◽

Pekka Jaaskelainen ◽

...

Keyword(s):

SIMD Processors

Vectorized Bloom filters for advanced SIMD processors

Proceedings of the Tenth International Workshop on Data Management on New Hardware - DaMoN '14 ◽

10.1145/2619228.2619234 ◽

2014 ◽

Cited By ~ 27

Author(s):

Orestis Polychroniou ◽

Kenneth A. Ross

Keyword(s):

Bloom Filters ◽

SIMD Processors

Customized MMRF: Efficient Matrix Operations on SIMD Processors

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.347-350.1727 ◽

2013 ◽

Vol 347-350 ◽

pp. 1727-1731 ◽

Cited By ~ 1

Author(s):

Kai Zhang ◽

Yao Hua Wang ◽

Shu Ming Chen ◽

Zhen Tao Li ◽

Liang Wen

Keyword(s):

Wireless Communication ◽

Performance Improvement ◽

Critical Path ◽

Experimental Results ◽

Path Delay ◽

Design Technology ◽

Matrix Size ◽

Matrix Operations ◽

Critical Path Delay ◽

SIMD Processors

Wireless communication and multimedia applications involve a large number of matrix operations on matrices of different sizes. These operations require accessing matrices in column order. This paper implements a Multi-Grained Matrix Register File (MMRF) that supports multi-grained parallel row-wise and column-wise access. We implement 4×4 MIMO decoding with the help of the MMRF to illustrate efficient matrix operations on SIMD processors. Experimental results show that, compared with the TMS320C64x+, our SIMD processor achieves about a 5.65x to 7.71x performance improvement by employing the MMRF. With customized design technology, we reduce the area and critical-path delay of the MMRF by 17.9% and 39.1%, respectively.
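The difference between row-wise and column-wise access mentioned above can be sketched in software. The example below assumes a matrix stored as a flat row-major list; the helper names are hypothetical, and the code only models the access pattern, not the MMRF hardware itself. A row is one contiguous block, while a column touches every `n_cols`-th element, which is the strided pattern that ordinary SIMD loads handle poorly.

```python
def get_row(flat, n_cols, r):
    """Row r of a row-major matrix: one contiguous slice."""
    start = r * n_cols
    return flat[start:start + n_cols]


def get_col(flat, n_cols, c):
    """Column c of a row-major matrix: a strided walk, one element per row."""
    return flat[c::n_cols]


def transpose(flat, n_rows, n_cols):
    """Software workaround: reorder the data so columns become contiguous rows."""
    return [flat[r * n_cols + c] for c in range(n_cols) for r in range(n_rows)]


# A 4x4 matrix holding 0..15 in row-major order.
matrix = list(range(16))
row_1 = get_row(matrix, 4, 1)
col_1 = get_col(matrix, 4, 1)
```

Without hardware support, column access typically costs either strided loads or an explicit transpose like the one above; a register file with native column-wise access avoids that extra step.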

Loop Transforming for Reducing Data Alignment on Multi-Core SIMD Processors

Journal of Signal Processing Systems ◽

10.1007/s11265-013-0754-2 ◽

2013 ◽

Vol 74 (2) ◽

pp. 137-150

Author(s):

Yi Wang ◽

Linfeng Pan ◽

Zili Shao ◽

Yong Guan ◽

Minyi Guo

Keyword(s):

Data Alignment ◽

SIMD Processors
