Bit Grooming: Statistically accurate precision-preserving quantization with compression, evaluated in the netCDF Operators (NCO, v4.4.8+)

2016 ◽  
Author(s):  
Charles S. Zender

Abstract. Lossy compression schemes can help reduce the space required to store the false precision (i.e., scientifically meaningless data bits) that geoscientific models and measurements generate. We introduce, implement, and characterize a new lossy compression scheme suitable for IEEE floating-point data. Our new Bit Grooming algorithm alternately shaves (to zero) and sets (to one) the least significant bits of consecutive values to preserve a desired precision. This is a symmetric, two-sided variant of an algorithm sometimes called Bit Shaving that quantizes values solely by zeroing bits. Our variation eliminates the artificial low bias produced by always zeroing bits, and makes Bit Grooming more suitable for arrays and multi-dimensional fields whose mean statistics are important. Bit Grooming relies on standard lossless compression schemes to achieve the actual reduction in storage space, so we tested Bit Grooming by applying the DEFLATE compression algorithm to bit-groomed and full-precision climate data stored in netCDF3, netCDF4, HDF4, and HDF5 formats. Bit Grooming reduces the storage space required by uncompressed and compressed climate data by up to 50 % and 20 %, respectively, for single-precision data (the most common case for climate data). When used aggressively (i.e., preserving only 1–3 decimal digits of precision), Bit Grooming produces storage reductions comparable to other quantization techniques such as linear packing. Unlike linear packing, Bit Grooming works on the full representable range of floating-point data. Bit Grooming reduces the volume of single-precision compressed data by roughly 10 % per decimal digit quantized (or "groomed") after the third such digit, up to a maximum reduction of about 50 %. The potential reduction is greater for double-precision datasets. Data quantization by Bit Grooming is irreversible (i.e., lossy) yet transparent, meaning that no extra processing is required by data users/readers. Hence Bit Grooming can easily reduce data storage volume without sacrificing scientific precision or imposing extra burdens on users.
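
The alternating shave/set operation can be sketched in a few lines of NumPy. The snippet below illustrates the idea only and is not NCO's implementation: it assumes a 1-D float32 array and a caller-supplied count of explicit mantissa bits to retain (NCO derives that count from the requested number of significant decimal digits, roughly ceil(digits × log2 10)).

```python
import numpy as np

def bit_groom(values, keep_bits):
    """Alternately shave (zero) and set (one) the trailing mantissa bits
    of consecutive float32 values. A sketch of the Bit Grooming idea;
    keep_bits is the number of explicit mantissa bits retained (float32
    has 23), assumed here to come from the desired decimal precision."""
    bits = np.asarray(values, dtype=np.float32).view(np.uint32).copy()
    drop = 23 - keep_bits                          # trailing bits to quantize
    shave_mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    set_mask = np.uint32((1 << drop) - 1)
    bits[0::2] &= shave_mask                       # even elements: round toward zero
    bits[1::2] |= set_mask                         # odd elements: round away from zero
    return bits.view(np.float32)
```

Because shaved values err slightly toward zero and set values err slightly away from it, the quantization errors of consecutive elements tend to cancel in means and other bulk statistics. A real implementation would also leave special values such as 0.0 and NaN untouched, which this sketch omits.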

2016 ◽  
Vol 9 (9) ◽  
pp. 3199-3211 ◽  
Author(s):  
Charles S. Zender

Abstract. Geoscientific models and measurements generate false precision (scientifically meaningless data bits) that wastes storage space. False precision can mislead (by implying noise is signal) and be scientifically pointless, especially for measurements. By contrast, lossy compression can be both economical (save space) and heuristic (clarify data limitations) without compromising the scientific integrity of data. Data quantization can thus be appropriate regardless of whether space limitations are a concern. We introduce, implement, and characterize a new lossy compression scheme suitable for IEEE floating-point data. Our new Bit Grooming algorithm alternately shaves (to zero) and sets (to one) the least significant bits of consecutive values to preserve a desired precision. This is a symmetric, two-sided variant of an algorithm sometimes called Bit Shaving that quantizes values solely by zeroing bits. Our variation eliminates the artificial low bias produced by always zeroing bits, and makes Bit Grooming more suitable for arrays and multi-dimensional fields whose mean statistics are important. Bit Grooming relies on standard lossless compression to achieve the actual reduction in storage space, so we tested Bit Grooming by applying the DEFLATE compression algorithm to bit-groomed and full-precision climate data stored in netCDF3, netCDF4, HDF4, and HDF5 formats. Bit Grooming reduces the storage space required by initially uncompressed and compressed climate data by 25–80 and 5–65 %, respectively, for single-precision values (the most common case for climate data) quantized to retain 1–5 decimal digits of precision. The potential reduction is greater for double-precision datasets. When used aggressively (i.e., preserving only 1–2 digits), Bit Grooming produces storage reductions comparable to other quantization techniques such as Linear Packing. Unlike Linear Packing, whose guaranteed precision rapidly degrades within the relatively narrow dynamic range of values that it can compress, Bit Grooming guarantees the specified precision throughout the full floating-point range. Data quantization by Bit Grooming is irreversible (i.e., lossy) yet transparent, meaning that no extra processing is required by data users/readers. Hence Bit Grooming can easily reduce data storage volume without sacrificing scientific precision or imposing extra burdens on users.
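
Bit Grooming itself saves no space; the savings come when a lossless coder such as DEFLATE encounters the long runs of identical trailing 0s and 1s it leaves behind. A toy comparison along these lines, using zlib as the DEFLATE implementation and the bit_groom sketch above (the synthetic field and keep_bits value are illustrative; actual savings depend entirely on the data):

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a single-precision climate field (e.g., temperature in K).
field = (280.0 + 5.0 * rng.standard_normal(1_000_000)).astype(np.float32)

raw = zlib.compress(field.tobytes(), level=9)
# keep_bits=12 retains roughly 3 significant decimal digits (assumed mapping).
groomed = zlib.compress(bit_groom(field, keep_bits=12).tobytes(), level=9)

print(f"full precision: {len(raw)} bytes, groomed: {len(groomed)} bytes")
# The groomed stream is typically much smaller than the full-precision one.
```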


2019 ◽  
Vol 8 (2S11) ◽  
pp. 2990-2993

Multiplication of floating-point numbers is a major operation in digital signal processing, so the performance of floating-point multipliers plays a central role in any digital design. Floating-point numbers are represented using the IEEE 754 standard in single-precision (32-bit), double-precision (64-bit), and quadruple-precision (128-bit) formats. Multiplication of these floating-point numbers can be performed using Vedic mathematics, which comprises sixteen distinct algorithms, or Sutras; the Urdhva Tiryagbhyam Sutra is the one most commonly applied to binary multiplication. This paper surveys the work done by different researchers toward the design of IEEE 754 single-precision floating-point multipliers using Vedic techniques.
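
For readers unfamiliar with the Sutra: Urdhva Tiryagbhyam ("vertically and crosswise") forms each output column as the sum of crosswise partial products of the operand bits, then propagates carries. A minimal software sketch of that column scheme follows; hardware designs compute the columns in parallel and pipeline the carry logic, so this is not any particular paper's RTL:

```python
def urdhva_multiply(a, b, n=8):
    """Urdhva Tiryagbhyam multiplication of two n-bit integers:
    each output column sums the crosswise partial products
    a[i] * b[col - i], then carries propagate to the next column.
    A sketch of the Sutra, not a hardware description."""
    abits = [(a >> i) & 1 for i in range(n)]
    bbits = [(b >> i) & 1 for i in range(n)]
    result, carry = 0, 0
    for col in range(2 * n - 1):
        s = carry
        for i in range(n):
            j = col - i
            if 0 <= j < n:
                s += abits[i] * bbits[j]
        result |= (s & 1) << col      # this column's output bit
        carry = s >> 1                # remaining bits carry onward
    result |= carry << (2 * n - 1)    # final carry-out
    return result

assert urdhva_multiply(0xB7, 0x5C) == 0xB7 * 0x5C
```

In an IEEE 754 single-precision multiplier, this column scheme would be applied to the two 24-bit significands, with the exponents summed and the product renormalized separately.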


2021 ◽  
Author(s):  
Sam Hatfield ◽  
Kristian Mogensen ◽  
Peter Dueben ◽  
Nils Wedi ◽  
Michail Diamantakis

Earth-System models traditionally use double-precision, 64 bit floating-point numbers to perform arithmetic. According to orthodoxy, we must use such a relatively high level of precision in order to minimise the potential impact of rounding errors on the physical fidelity of the model. However, given the inherently imperfect formulation of our models, and the computational benefits of lower precision arithmetic, we must question this orthodoxy. At ECMWF, a single-precision, 32 bit variant of the atmospheric model IFS has been undergoing rigorous testing in preparation for operations for around 5 years. The single-precision simulations have been found to have effectively the same forecast skill as the double-precision simulations while finishing in 40% less time, thanks to the memory and cache benefits of single-precision numbers. Following these positive results, other modelling groups are now also considering single-precision as a way to accelerate their simulations.

In this presentation I will present the rationale behind the move to lower-precision floating-point arithmetic and up-to-date results from the single-precision atmospheric model at ECMWF, which will be operational imminently. I will then provide an update on the development of the single-precision ocean component at ECMWF, based on the NEMO ocean model, including a verification of quarter-degree simulations. I will also present new results from running ECMWF's coupled atmosphere-ocean-sea-ice-wave forecasting system entirely with single-precision. Finally I will discuss the feasibility of even lower levels of precision, like half-precision, which are now becoming available through GPU- and ARM-based systems such as Summit and Fugaku, respectively. The use of reduced-precision floating-point arithmetic will be an essential consideration for developing high-resolution, storm-resolving Earth-System models.
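
To see concretely why the choice of precision matters (and where each format breaks down), consider a standalone toy in NumPy, unrelated to IFS or NEMO code: accumulating many small increments at three precisions. float32 stays close to the float64 reference here, while float16 stagnates once the spacing between representable numbers near the running total exceeds the increment size.

```python
import numpy as np

# Toy illustration only: sum ~100k small tendencies at three precisions.
rng = np.random.default_rng(42)
increments = rng.uniform(0.0, 1e-4, size=100_000)
reference = increments.sum(dtype=np.float64)

for dtype in (np.float64, np.float32, np.float16):
    total = dtype(0.0)
    for inc in increments.astype(dtype):
        total = dtype(total + inc)   # every addition rounds to `dtype`
    print(f"{np.dtype(dtype).name}: sum={float(total):.6f} "
          f"abs error={abs(float(total) - reference):.2e}")
```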


Author(s):  
John P. Wilson

Single-precision floating-point data from a simulation of barotropic turbulence are compressed with a wavelet-based method. The quantity being compressed is vorticity. The compression error is evaluated both in the vorticity itself and in various quantities derived from it. Numerical error is assessed for all quantities, along with visualizations of the vorticity and the correlation of the error with the uncompressed data. It is found that, depending on the quantities of interest and the evaluation criteria, compression ratios of 4:1 to 256:1 are achievable. Under a conservative definition of acceptable error, it is possible to recover quantities of interest from data compressed 4:1 (8 bits per value), the data rate that existing practice uses for visualization.
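
The generic mechanism behind such wavelet compressors is coefficient truncation: transform the field, keep only the largest-magnitude coefficients, and reconstruct. A sketch of that idea using PyWavelets follows; the paper's specific codec, wavelet choice, and error criteria are not reproduced here, and "db4", the level, and the keep fraction are placeholders:

```python
import numpy as np
import pywt  # PyWavelets

def truncate_wavelet(field, keep_fraction, wavelet="db4", level=3):
    """Zero all but the largest-magnitude wavelet coefficients and
    reconstruct; keep_fraction is roughly the inverse of the
    coefficient compression ratio. A generic sketch, not the
    paper's method."""
    coeffs = pywt.wavedec2(field, wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)
    cutoff = np.quantile(np.abs(arr), 1.0 - keep_fraction)
    arr[np.abs(arr) < cutoff] = 0.0
    kept = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(kept, wavelet)

# Example on a smooth toy field (real turbulence data compress less well
# than this, and noise-like data far worse).
x, y = np.meshgrid(np.linspace(0, 4 * np.pi, 256), np.linspace(0, 4 * np.pi, 256))
vorticity = (np.sin(x) * np.cos(y)).astype(np.float32)
recon = truncate_wavelet(vorticity, keep_fraction=1 / 16)  # ~16:1 in coefficients
rel_err = np.linalg.norm(recon - vorticity) / np.linalg.norm(vorticity)
print(f"relative L2 error at ~16:1: {rel_err:.2e}")
```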


2009 ◽  
Vol 58 (1) ◽  
pp. 18-31 ◽  
Author(s):  
Martin Burtscher ◽  
Paruj Ratanaworabhan

2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Nikolaos Alachiotis ◽  
Alexandros Stamatakis

The use of reconfigurable computing for accelerating floating-point-intensive codes is becoming common due to the availability of DSPs in new-generation FPGAs. We present the design of an efficient, pipelined floating-point datapath for calculating the logarithm function on reconfigurable devices. We integrate the datapath into a stand-alone lookup-table-based (LUT-based) component, the LAU (Logarithm Approximation Unit). We extended the LAU by integrating two architecturally independent, LAU-based datapaths into a larger component, the VLAU (vector-like LAU). The VLAU produces 2 results/cycle, while occupying the same amount of memory as the LAU. Under single precision, one LAU is 12 and 1.7 times faster than the GNU and Intel Math Kernel Library (MKL) implementations, respectively. The LAU is also 1.6 times faster than the FloPoCo reconfigurable logarithm architecture. Under double precision, one LAU is 20 and 2.6 times faster than the respective GNU and MKL functions and 1.4 times faster than the FloPoCo logarithm. The VLAU is approximately twice as fast as the LAU under both single and double precision.
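
The core trick behind LUT-based logarithm units is the identity log2(x) = e + log2(1 + m) for x = 2^e · (1 + m): the exponent e comes free from the float encoding, and log2(1 + m) is read from a table indexed by the top mantissa bits. A NumPy sketch of that decomposition follows, with a 10-bit table and no interpolation (so roughly 3 decimal digits of accuracy); real datapaths such as the LAU refine this, and none of the constants here are taken from the paper:

```python
import numpy as np

LUT_BITS = 10
# log2(1 + m) sampled at 2**LUT_BITS points across the mantissa range [1, 2).
LUT = np.log2(1.0 + np.arange(2**LUT_BITS) / 2.0**LUT_BITS)

def lut_log2(x):
    """Approximate log2 of positive normal float32 inputs via exponent
    extraction plus a mantissa table lookup (no interpolation)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    exponent = (bits >> np.uint32(23)).astype(np.int32) - 127          # unbiased exponent
    index = (bits >> np.uint32(23 - LUT_BITS)) & np.uint32(2**LUT_BITS - 1)
    return exponent + LUT[index]

vals = np.array([0.1, 1.0, 3.14159, 1e6], dtype=np.float32)
print(lut_log2(vals) - np.log2(vals.astype(np.float64)))  # errors ~1e-3 or smaller
```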

