Regime Inference for Sound Floating-Point Optimizations

Robert Rabe; Anastasiia Izycheva; Eva Darulova

doi:10.1145/3477012

ACM Transactions on Embedded Computing Systems ◽

10.1145/3477012 ◽

2021 ◽

Vol 20 (5s) ◽

pp. 1-23

Author(s):

Robert Rabe ◽

Anastasiia Izycheva ◽

Eva Darulova

Keyword(s):

Floating Point ◽

Improve Performance ◽

Rounding Errors ◽

Proper Functioning ◽

Mixed Precision ◽

Input Domain ◽

Different Parts

Efficient numerical programs are required for proper functioning of many systems. Today’s tools offer a variety of optimizations to generate efficient floating-point implementations that are specific to a program’s input domain. However, sound optimizations are of an “all or nothing” fashion with respect to this input domain—if an optimizer cannot improve a program on the specified input domain, it will conclude that no optimization is possible. In general, though, different parts of the input domain exhibit different rounding errors and thus have different optimization potential. We present the first regime inference technique for sound optimizations that automatically infers an effective subdivision of a program’s input domain such that individual sub-domains can be optimized more aggressively. Our algorithm is general; we have instantiated it with mixed-precision tuning and rewriting optimizations to improve performance and accuracy, respectively. Our evaluation on a standard benchmark set shows that with our inferred regimes, we can, on average, improve performance by 65% and accuracy by 54% with respect to whole-domain optimizations.
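The claim that different parts of the input domain have different optimization potential can be illustrated with a small sketch. It is not the authors' algorithm: the expression `sqrt(x + 1) - sqrt(x)`, the fixed split points, the error target, and all function names are hypothetical choices. It also estimates errors by sampling against a high-precision reference, which gives no soundness guarantee, whereas the paper relies on sound error analysis and infers the subdivision automatically.

```python
import math
import struct
from decimal import Decimal, getcontext

getcontext().prec = 50  # high-precision reference arithmetic


def to_f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def f_f64(x):
    """Evaluate the example expression in binary64."""
    return math.sqrt(x + 1.0) - math.sqrt(x)


def f_f32(x):
    """Emulate binary32 evaluation by rounding after every operation."""
    x = to_f32(x)
    a = to_f32(math.sqrt(to_f32(x + 1.0)))
    b = to_f32(math.sqrt(x))
    return to_f32(a - b)


def f_ref(x):
    """Reference value computed with 50 significant decimal digits."""
    d = Decimal(x)  # exact conversion of the binary64 input
    return (d + 1).sqrt() - d.sqrt()


def sampled_error(impl, lo, hi, samples=257):
    """Largest absolute error seen on evenly spaced samples (an estimate, not a bound)."""
    worst = 0.0
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        worst = max(worst, float(abs(Decimal(impl(x)) - f_ref(x))))
    return worst


def infer_regimes(bounds, target):
    """Pick the cheapest precision per sub-domain and merge equal neighbours."""
    regimes = []
    for lo, hi in zip(bounds, bounds[1:]):
        if sampled_error(f_f32, lo, hi) <= target:
            choice = "float32"
        elif sampled_error(f_f64, lo, hi) <= target:
            choice = "float64"
        else:
            choice = "unsatisfied"
        if regimes and regimes[-1][2] == choice:
            regimes[-1] = (regimes[-1][0], hi, choice)
        else:
            regimes.append((lo, hi, choice))
    return regimes


regimes = infer_regimes([1.0, 16.0, 256.0, 4096.0, 65536.0], target=1e-6)
for lo, hi, choice in regimes:
    print(f"[{lo:g}, {hi:g}] -> {choice}")
```

Under these assumptions, a whole-domain decision would have to use binary64 everywhere, because the binary32 error near the upper end of the domain exceeds the target. Splitting the domain lets the low end run in binary32.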

Rigorous floating-point mixed-precision tuning

Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages - POPL 2017 ◽

10.1145/3009837.3009846 ◽

2017 ◽

Cited By ~ 38

Author(s):

Wei-Fan Chiang ◽

Mark Baranowski ◽

Ian Briggs ◽

Alexey Solovyev ◽

Ganesh Gopalakrishnan ◽

...

Keyword(s):

Floating Point ◽

Mixed Precision

GPU-FPtuner: Mixed-precision Auto-tuning for Floating-point Applications on GPU

2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC) ◽

10.1109/hipc50609.2020.00043 ◽

2020 ◽

Author(s):

Ruidong Gu ◽

Michela Becchi

Keyword(s):

Floating Point ◽

Mixed Precision ◽

Auto Tuning

POP: A Tuning Assistant for Mixed-Precision Floating-Point Computations

Communications in Computer and Information Science - Formal Techniques for Safety-Critical Systems ◽

10.1007/978-3-030-46902-3_5 ◽

2020 ◽

pp. 77-94

Author(s):

Dorra Ben Khalifa ◽

Matthieu Martel ◽

Assalé Adjé

Keyword(s):

Floating Point ◽

Mixed Precision

Multiple-Precision BLAS Library for Graphics Processing Units

10.36227/techrxiv.12580301.v1 ◽

2020 ◽

Author(s):

Konstantin Isupov ◽

Vladimir Knyazkov

Keyword(s):

Graphics Processing Units ◽

Arithmetic Operation ◽

Number System ◽

Residue Number System ◽

Floating Point ◽

Data Types ◽

Rounding Errors ◽

Multiple Precision ◽

Graphics Processing ◽

Point Arithmetic

The binary32 and binary64 floating-point formats provide good performance on current hardware, but also introduce a rounding error in almost every arithmetic operation. Consequently, the accumulation of rounding errors in large computations can cause accuracy issues. One way to prevent these issues is to use multiple-precision floating-point arithmetic. This preprint, submitted to Russian Supercomputing Days 2020, presents a new library of basic linear algebra operations with multiple precision for graphics processing units. The library is written in CUDA C/C++ and uses the residue number system to represent multiple-precision significands of floating-point numbers. The supported data types, memory layout, and main features of the library are considered. Experimental results are presented showing the performance of the library.
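The accumulation of rounding errors mentioned in this abstract can be seen in a few lines. This is only an illustration of the problem, not of the library: it uses Python's built-in binary64 floats and exact rational arithmetic from `fractions` as a stand-in for multiple precision, and the function names are hypothetical.

```python
from fractions import Fraction


def binary64_sum(n):
    """Add the binary64 value nearest to 0.1 to itself n times."""
    total = 0.0
    for _ in range(n):
        total += 0.1
    return total


def exact_sum(n):
    """Add the exact rational 1/10 to itself n times."""
    return sum(Fraction(1, 10) for _ in range(n))


approx = binary64_sum(10)
exact = exact_sum(10)

# Fraction(approx) converts the binary64 result exactly, so the difference
# includes both the representation error of 0.1 and the rounding of each addition.
error = Fraction(approx) - exact

print(approx == 1.0)
print(exact == 1)
print(float(error))
```

The binary64 sum differs from 1 by a small nonzero amount, while the rational sum is exactly 1.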

Poster: Automatically Adapting Programs for Mixed-Precision Floating-Point Computation

2012 SC Companion: High Performance Computing, Networking Storage and Analysis ◽

10.1109/sc.companion.2012.232 ◽

2012 ◽

Author(s):

Michael O. Lam ◽

Bronis R. de Supinski ◽

Matthew P. LeGendre ◽

Jeffrey K. Hollingsworth

Keyword(s):

Floating Point ◽

Mixed Precision ◽

Floating Point Computation

Efficient Multiple-Precision Floating-Point Fused Multiply-Add with Mixed-Precision Support

IEEE Transactions on Computers ◽

10.1109/tc.2019.2895031 ◽

2019 ◽

Vol 68 (7) ◽

pp. 1035-1048 ◽

Cited By ~ 2

Author(s):

Hao Zhang ◽

Dongdong Chen ◽

Seok-Bum Ko

Keyword(s):

Floating Point ◽

Multiple Precision ◽

Mixed Precision

A floating point conversion algorithm for mixed precision computations

Journal of Zhejiang University SCIENCE C ◽

10.1631/jzus.c1200043 ◽

2012 ◽

Vol 13 (9) ◽

pp. 711-718

Author(s):

Choon Lih Hoo ◽

Sallehuddin Mohamed Haris ◽

Nik Abdullah Nik Mohamed

Keyword(s):

Floating Point ◽

Conversion Algorithm ◽

Mixed Precision

Scientific processing in ISO-Pascal: a proposal to get the benefits of mixed precision floating-point

ACM SIGPLAN Notices ◽

10.1145/71052.71054 ◽

1989 ◽

Vol 24 (6) ◽

pp. 20-22

Author(s):

B. A. Wichmann

Keyword(s):

Floating Point ◽

Mixed Precision

Clock Math — a System for Solving SLEs Exactly

Acta Polytechnica ◽

10.14311/1795 ◽

2013 ◽

Vol 53 (2) ◽

Author(s):

Jakub Hladík ◽

Róbert Lórencz ◽

Ivan Šimeček

Keyword(s):

Hybrid System ◽

Chinese Remainder Theorem ◽

Linear Equations ◽

Floating Point ◽

Rounding Errors ◽

Systems Of Linear Equations ◽

Floating Point Numbers

In this paper, we present a GPU-accelerated hybrid system that solves ill-conditioned systems of linear equations (SLEs) exactly. Here, exactly means without rounding errors, which is achieved by using integer arithmetic. First, we scale the floating-point numbers up to integers; then we solve dozens of SLEs in different modular arithmetics; finally, we assemble the sub-solutions using the Chinese remainder theorem. This approach effectively bypasses current CPU floating-point limitations. The system is capable of solving systems with the Hilbert matrix without losing a single bit of precision, and with a significant speedup compared to existing CPU solvers.
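The modular pipeline described in this abstract can be sketched on the CPU as follows. This is a minimal illustration, not the authors' GPU system. It assumes the input has already been scaled to integers, uses Cramer's rule with determinants computed modulo three well-known primes, and recombines them with the Chinese remainder theorem. The function names and the choice of primes are hypothetical, and the result is only valid while every determinant is smaller in magnitude than half the product of the primes.

```python
from fractions import Fraction
from math import prod

PRIMES = (998244353, 1000000007, 2147483647)


def det_mod(matrix, p):
    """Determinant of a square integer matrix modulo a prime p."""
    m = [[x % p for x in row] for row in matrix]
    n = len(m)
    det = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c]), None)
        if pivot is None:
            return 0  # singular modulo p
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det  # a row swap flips the sign
        det = det * m[c][c] % p
        inv = pow(m[c][c], -1, p)  # modular inverse (Python 3.8+)
        for r in range(c + 1, n):
            factor = m[r][c] * inv % p
            for k in range(c, n):
                m[r][k] = (m[r][k] - factor * m[c][k]) % p
    return det % p


def crt_symmetric(residues, primes):
    """Unique integer in (-M/2, M/2] with the given residues, M = prod(primes)."""
    M = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        Mi = M // p
        x += r * Mi * pow(Mi, -1, p)
    x %= M
    return x - M if x > M // 2 else x


def det_exact(matrix, primes=PRIMES):
    """Exact determinant assembled from its residues."""
    return crt_symmetric([det_mod(matrix, p) for p in primes], primes)


def solve_exact(a, b, primes=PRIMES):
    """Solve the integer system a x = b exactly via Cramer's rule."""
    d = det_exact(a, primes)
    if d == 0:
        raise ValueError("matrix is singular (or determinant exceeds the CRT range)")
    solution = []
    for i in range(len(a)):
        # Replace column i of a with the right-hand side b.
        ai = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det_exact(ai, primes), d))
    return solution


# 3x3 Hilbert system H x = (1, 1, 1), scaled by 60 so every entry is an integer.
hilbert_scaled = [[60, 30, 20], [30, 20, 15], [20, 15, 12]]
print(solve_exact(hilbert_scaled, [60, 60, 60]))
```

Cramer's rule keeps the sketch short but scales poorly with matrix size. Larger systems need more primes so that the determinants stay inside the reconstruction range.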

Automatically adapting programs for mixed-precision floating-point computation

Proceedings of the 27th international ACM conference on International conference on supercomputing - ICS '13 ◽

10.1145/2464996.2465018 ◽

2013 ◽

Cited By ~ 45

Author(s):

Michael O. Lam ◽

Jeffrey K. Hollingsworth ◽

Bronis R. de Supinski ◽

Matthew P. Legendre

Keyword(s):

Floating Point ◽

Mixed Precision ◽

Floating Point Computation
