Regime Inference for Sound Floating-Point Optimizations

2021 ◽  
Vol 20 (5s) ◽  
pp. 1-23
Author(s):  
Robert Rabe ◽  
Anastasiia Izycheva ◽  
Eva Darulova

Efficient numerical programs are required for proper functioning of many systems. Today’s tools offer a variety of optimizations to generate efficient floating-point implementations that are specific to a program’s input domain. However, sound optimizations operate in an “all or nothing” fashion with respect to this input domain: if an optimizer cannot improve a program on the specified input domain, it concludes that no optimization is possible. In general, though, different parts of the input domain exhibit different rounding errors and thus have different optimization potential. We present the first regime inference technique for sound optimizations that automatically infers an effective subdivision of a program’s input domain such that individual sub-domains can be optimized more aggressively. Our algorithm is general; we have instantiated it with mixed-precision tuning and rewriting optimizations to improve performance and accuracy, respectively. Our evaluation on a standard benchmark set shows that with our inferred regimes, we can, on average, improve performance by 65% and accuracy by 54% with respect to whole-domain optimizations.
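To make the idea of per-sub-domain optimization concrete, the following is a toy sketch and not the authors' algorithm. It assumes a single rounded operation `x*x` on a non-negative interval, inputs that are exactly representable, no underflow, and a hypothetical absolute error target. The names `infer_regimes` and `square_error_bound` are illustrative. The sketch bisects the domain and assigns binary32 wherever the standard worst-case bound `hi*hi*2^-24` meets the target.

```python
from typing import List, Tuple

U32 = 2.0 ** -24  # unit roundoff of binary32 under round-to-nearest

Regime = Tuple[float, float, str]  # (lower bound, upper bound, chosen precision)


def square_error_bound(hi: float, unit_roundoff: float) -> float:
    """Bound on |fl(x*x) - x*x| for 0 <= x <= hi (underflow ignored)."""
    return hi * hi * unit_roundoff


def infer_regimes(lo: float, hi: float, target: float, depth: int = 3) -> List[Regime]:
    """Bisect [lo, hi] and use binary32 wherever its error bound meets `target`.

    Assumption: binary64 is accurate enough everywhere, so it serves as the
    fallback once the (hypothetical) depth budget is exhausted.
    """
    if square_error_bound(hi, U32) <= target:
        return [(lo, hi, "binary32")]
    if depth == 0:
        return [(lo, hi, "binary64")]
    mid = (lo + hi) / 2.0
    return (infer_regimes(lo, mid, target, depth - 1)
            + infer_regimes(mid, hi, target, depth - 1))


if __name__ == "__main__":
    for regime in infer_regimes(0.0, 16.0, target=1e-6):
        print(regime)
```

On the whole domain [0, 16] the binary32 bound exceeds the target, so a whole-domain tuner would have to keep binary64 everywhere. After subdivision, the sub-domain [0, 4] can use binary32 while the rest stays in binary64.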

Author(s):  
Wei-Fan Chiang ◽  
Mark Baranowski ◽  
Ian Briggs ◽  
Alexey Solovyev ◽  
Ganesh Gopalakrishnan ◽  
...  

2020 ◽  
Author(s):  
Konstantin Isupov ◽  
Vladimir Knyazkov

The binary32 and binary64 floating-point formats provide good performance on current hardware, but also introduce a rounding error in almost every arithmetic operation. Consequently, the accumulation of rounding errors in large computations can cause accuracy issues. One way to prevent these issues is to use multiple-precision floating-point arithmetic. This preprint, submitted to Russian Supercomputing Days 2020, presents a new library of basic linear algebra operations with multiple precision for graphics processing units. The library is written in CUDA C/C++ and uses the residue number system to represent multiple-precision significands of floating-point numbers. The supported data types, memory layout, and main features of the library are considered. Experimental results are presented showing the performance of the library.
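As a rough illustration of how a residue number system (RNS) can hold a large integer significand, here is a minimal Python sketch. It is not the library's CUDA implementation: the moduli set `MODULI` and the helper names are assumptions chosen for readability. The point it shows is that addition and multiplication act independently on each residue, which is what makes the representation attractive for parallel hardware.

```python
from math import prod

# Hypothetical pairwise-coprime moduli (251 is prime, 253 = 11*23,
# 255 = 3*5*17, 256 = 2**8); a real library would choose its own set.
MODULI = (251, 253, 255, 256)


def to_rns(x: int) -> tuple:
    """Represent a non-negative integer by its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a: tuple, b: tuple) -> tuple:
    """Component-wise addition; no carries propagate between residues."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a: tuple, b: tuple) -> tuple:
    """Component-wise multiplication; each residue is computed independently."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Recover the integer with the Chinese remainder theorem.

    The result is only correct if the true value is below prod(MODULI).
    """
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)  # modular inverse, Python 3.8+
    return total % big_m


if __name__ == "__main__":
    a, b = 12345, 6789
    print(from_rns(rns_mul(to_rns(a), to_rns(b))) == a * b)
```

The conversion back from residues is the expensive step, so a design of this kind typically stays in residue form for as long as possible.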


2012 ◽  
Vol 13 (9) ◽  
pp. 711-718
Author(s):  
Choon Lih Hoo ◽  
Sallehuddin Mohamed Haris ◽  
Nik Abdullah Nik Mohamed

10.14311/1795 ◽  
2013 ◽  
Vol 53 (2) ◽  
Author(s):  
Jakub Hladík ◽  
Róbert Lórencz ◽  
Ivan Šimeček

In this paper, we present a GPU-accelerated hybrid system that solves ill-conditioned systems of linear equations (SLEs) exactly. Here, “exactly” means without rounding errors, which is achieved by using integer arithmetic. First, we scale the floating-point numbers up to integers; then we solve dozens of SLEs, each in a different modular arithmetic; finally, we assemble the sub-solutions using the Chinese remainder theorem. This approach effectively bypasses current CPU floating-point limitations. The system is capable of solving a Hilbert-matrix system without losing a single bit of precision, and with a significant speedup compared to existing CPU solvers.
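The scale, solve-modulo-primes, and recombine pipeline can be sketched on the CPU in a few lines of Python. This is an illustration under stated assumptions, not the paper's GPU system. It swaps in Cramer's rule with determinants computed modulo each prime, uses two Mersenne primes as a hypothetical prime set, and assumes every determinant is smaller in magnitude than half the product of the primes. The function names are illustrative.

```python
from fractions import Fraction
from math import prod

# Hypothetical prime set: two Mersenne primes. Results are exact only while
# every determinant satisfies |det| < prod(PRIMES) / 2.
PRIMES = (2**31 - 1, 2**61 - 1)


def det_mod(matrix, p):
    """Determinant modulo a prime p via Gaussian elimination over Z/pZ."""
    a = [[v % p for v in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0  # singular modulo p
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse, Python 3.8+
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def crt_signed(residues, moduli):
    """Chinese remainder theorem, mapped to the symmetric range around zero."""
    big_m = prod(moduli)
    x = sum(r * (big_m // m) * pow(big_m // m, -1, m)
            for r, m in zip(residues, moduli)) % big_m
    return x - big_m if x > big_m // 2 else x


def exact_det(matrix):
    """Integer determinant assembled from its residues modulo each prime."""
    return crt_signed([det_mod(matrix, p) for p in PRIMES], PRIMES)


def solve_exact(a, b):
    """Solve the integer system a x = b exactly using Cramer's rule."""
    d = exact_det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    solution = []
    for i in range(len(a)):
        # Replace column i of the matrix with the right-hand side.
        a_i = [list(row[:i]) + [b[r]] + list(row[i + 1:]) for r, row in enumerate(a)]
        solution.append(Fraction(exact_det(a_i), d))
    return solution


if __name__ == "__main__":
    # 3x3 Hilbert matrix and right-hand side (1, 1, 1), both scaled by 60.
    hilbert_scaled = [[60, 30, 20], [30, 20, 15], [20, 15, 12]]
    print(solve_exact(hilbert_scaled, [60, 60, 60]))
```

Each call to `det_mod` is independent of the others, which is the property that lets the modular solves run in parallel. Cramer's rule is used here only to keep the sketch short; it does not scale to large systems.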

