The Effects of Digital Quantization Error on Speech Intelligibility and Perceived Speech Quality

1991 ◽  
Vol 34 (1) ◽  
pp. 189-196 ◽  
Author(s):  
Richard W. Harris ◽  
Robert H. Brey ◽  
Yuan-Shu Chang ◽  
B. Diann Soria ◽  
Laurence M. Hilton

The effects of digital quantization error upon speech intelligibility and perceived speech quality, for normally hearing subjects, were investigated for digitized speech processed to simulate 6-, 8-, 10-, 12-, 14-, and 16-bit integer conversion and 2-, 3-, 4-, 5-, 6-, and 7-bit floating-point conversion. For the integer data, there were no significant differences in speech intelligibility for 8- to 16-bit conversion. Only 6-bit integer conversion at 55 dB SPL resulted in a significant degradation in speech intelligibility. For the floating-point data, there were no significant differences in speech intelligibility for 2- to 7-bit floating-point conversion. However, results of the perceived quality experiment appeared to be more sensitive to differences among the various conditions. Speech processed using 12-, 14-, and 16-bit integer conversion was judged to be superior to speech processed using the 6-, 8-, and 10-bit integer conditions. Speech processed using 5-, 6-, and 7-bit floating-point conversion was judged to be superior to speech processed using 2-, 3-, and 4-bit floating-point conversion.
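As a rough illustration of what the integer conditions involve, the sketch below rounds a signal to a signed N-bit grid and reports the resulting signal-to-quantization-noise ratio. It is a minimal sketch and not the authors' processing chain: the function names, the 440 Hz test tone, and the 16 kHz sampling rate are all assumptions made for the example, and the floating-point conditions are not modeled because the abstract does not specify their format.

```python
import math


def quantize_uniform(samples, bits):
    """Round samples in [-1.0, 1.0) to a signed `bits`-bit integer grid.

    Hypothetical helper for illustration; returns floats on the coarser grid.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2")
    levels = 2 ** (bits - 1)  # e.g. 128 steps per unit amplitude for 8 bits
    quantized = []
    for x in samples:
        code = round(x * levels)
        code = max(-levels, min(levels - 1, code))  # clip to the signed range
        quantized.append(code / levels)
    return quantized


def sqnr_db(original, quantized):
    """Signal-to-quantization-noise ratio in dB."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - q) ** 2 for x, q in zip(original, quantized))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


# Assumed test signal: one second of a 440 Hz tone sampled at 16 kHz.
tone = [0.9 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]

if __name__ == "__main__":
    for bits in (6, 8, 10, 12, 14, 16):
        print(bits, "bits:", round(sqnr_db(tone, quantize_uniform(tone, bits)), 1), "dB")
```

Each additional bit halves the quantization step, so the noise floor drops by roughly 6 dB per bit. That is consistent with quality ratings separating the 12- to 16-bit conditions from the 6- to 10-bit conditions while intelligibility stays largely unaffected.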

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1878
Author(s):  
Yi Zhou ◽  
Haiping Wang ◽  
Yijing Chu ◽  
Hongqing Liu

Using multiple spatially distributed microphones allows spatial filtering to be performed alongside conventional temporal filtering, which rejects interference signals more effectively and improves overall speech quality. In this paper, we propose a novel dual-microphone generalized sidelobe canceller (GSC) algorithm assisted by a bone-conduction (BC) sensor for speech enhancement, named the BC-assisted GSC (BCA-GSC) algorithm. The BC sensor is relatively insensitive to ambient noise compared with the conventional air-conduction (AC) microphone. Hence, BC speech can be analyzed to produce highly accurate voice activity detection (VAD), even in a high-noise environment. The proposed algorithm incorporates the VAD information obtained from the BC speech into the adaptive blocking matrix (ABM) and adaptive noise canceller (ANC) of the GSC. By using VAD to control the ABM and combining VAD with the signal-to-interference ratio (SIR) to control the ANC, the proposed method can suppress interference and significantly improve the overall performance of the GSC. Experiments verify that the proposed GSC system not only improves speech quality remarkably but also boosts speech intelligibility.
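The idea of letting a VAD decision control adaptation can be sketched with a single-reference adaptive noise canceller. This is a simplified illustration and not the BCA-GSC algorithm itself: it uses a plain normalized LMS (NLMS) update, omits the blocking matrix and the SIR-based control, and takes the VAD flags as given instead of deriving them from a bone-conduction signal. The function name, step size, filter length, and synthetic signals are all assumptions.

```python
import math
import random


def vad_gated_nlms(primary, reference, vad, taps=8, mu=0.5, eps=1e-8):
    """Cancel noise in `primary` using `reference`, adapting only when VAD is False.

    Hypothetical helper: freezing the weights during speech keeps the filter
    from treating the target speech as noise to be cancelled.
    """
    weights = [0.0] * taps
    buffer = [0.0] * taps  # most recent reference sample first
    enhanced = []
    for d, x, speech_active in zip(primary, reference, vad):
        buffer = [x] + buffer[:-1]
        noise_estimate = sum(w * b for w, b in zip(weights, buffer))
        error = d - noise_estimate  # enhanced output sample
        if not speech_active:
            norm = eps + sum(b * b for b in buffer)
            weights = [w + mu * error * b / norm for w, b in zip(weights, buffer)]
        enhanced.append(error)
    return enhanced, weights


# Synthetic scene (all values assumed): noise leaks into the primary channel
# through a short FIR path, and speech is present only in the second half.
random.seed(0)
n = 4000
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
path = [0.6, -0.3, 0.1]
leak = [sum(path[k] * noise[i - k] for k in range(len(path)) if i - k >= 0)
        for i in range(n)]
vad = [i >= n // 2 for i in range(n)]
speech = [math.sin(2 * math.pi * 0.01 * i) if vad[i] else 0.0 for i in range(n)]
primary = [s + l for s, l in zip(speech, leak)]

enhanced, weights = vad_gated_nlms(primary, noise, vad)
```

In this toy setup the filter learns the leakage path during the noise-only first half and then holds its weights while speech is present, so the output in the second half is close to the clean speech. The sketch depends entirely on the VAD flags being correct, which is the role the abstract assigns to the bone-conduction sensor.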


1995 ◽  
Vol 38 (3) ◽  
pp. 714-725 ◽  
Author(s):  
Jill E. Preminger ◽  
Dianne J. Van Tasell

The purpose of the present research was to examine the relation between speech quality and speech intelligibility. Speech quality measurements were made using continuous discourse and a category rating procedure for the following dimensions: intelligibility, pleasantness, loudness, effort, and total impression. Measurements were made using a group of listeners with normal hearing for a set of stimulus conditions in which intelligibility varied, and for a set of stimulus conditions in which intelligibility was held constant near 100%. When ratings were made for a set of stimulus conditions in which intelligibility was allowed to vary, (a) intersubject reliability was high (i.e., different listeners interpreted the dimensions in a similar manner); and (b) the speech quality dimensions of intelligibility, effort, and loudness were indistinguishable. When ratings were made for a set of stimulus conditions in which intelligibility was held constant, (a) intersubject reliability was reduced, indicating that different listeners interpreted the dimensions in different ways; (b) most listeners rated each dimension differently, indicating that the dimensions were unique; and (c) across listeners, no single dimension was highly correlated with total impression. These results can be used to examine the relation between speech quality and speech intelligibility.


2005 ◽  
Vol 48 (3) ◽  
pp. 702-714 ◽  
Author(s):  
Peninah S. Rosengard ◽  
Karen L. Payton ◽  
Louis D. Braida

The purpose of this study was twofold: (a) to determine the extent to which 4-channel, slow-acting wide dynamic range amplitude compression (WDRC) can counteract the perceptual effects of reduced auditory dynamic range and (b) to examine the relation between objective measures of speech intelligibility and categorical ratings of speech quality for sentences processed with slow-acting WDRC. Multiband expansion was used to simulate the effects of elevated thresholds and loudness recruitment in normal hearing listeners. While some previous studies have shown that WDRC can improve both speech intelligibility and quality, others have found no benefit. The current experiment shows that moderate amounts of compression can provide a small but significant improvement in speech intelligibility, relative to linear amplification, for simulated-loss listeners with small dynamic ranges (i.e., flat, moderate hearing loss). This benefit was found for speech at conversational levels, both in quiet and in a background of babble. Simulated-loss listeners with large dynamic ranges (i.e., sloping, mild-to-moderate hearing loss) did not show any improvement. Comparison of speech intelligibility scores and subjective ratings of intelligibility showed that listeners with simulated hearing loss could accurately judge the overall intelligibility of speech. However, in all listeners, ratings of pleasantness decreased as the compression ratio increased. These findings suggest that subjective measures of speech quality should be used in conjunction with either objective or subjective measures of speech intelligibility to ensure that participant-selected hearing aid parameters optimize both comfort and intelligibility.
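The role of the compression ratio can be illustrated with the static input/output rule of a single compression channel. This is a minimal sketch under assumed parameter values (a 45 dB threshold and 20 dB of linear gain are hypothetical, not taken from the study), and it ignores the attack and release time constants that make a real compressor slow-acting or fast-acting, as well as the 4-channel band splitting.

```python
def wdrc_output_level_db(input_db, threshold_db=45.0, ratio=2.0, linear_gain_db=20.0):
    """Static input/output level mapping for one compression channel (levels in dB).

    Hypothetical helper: below the threshold the channel applies linear gain;
    above it, each 1 dB of input growth yields only 1/ratio dB of output growth.
    """
    if ratio < 1.0:
        raise ValueError("compression ratio must be >= 1")
    if input_db <= threshold_db:
        return input_db + linear_gain_db
    return threshold_db + linear_gain_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    for level in (35, 45, 55, 65, 75, 85):
        print(level, "dB in ->", wdrc_output_level_db(level), "dB out")
```

With these assumed settings, a 40 dB span of input levels above the threshold (45 to 85 dB) is mapped into a 20 dB span of output levels, which is how compression fits speech into a reduced dynamic range. Raising `ratio` narrows the output span further, the kind of increase the abstract associates with lower pleasantness ratings.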


1983 ◽  
Vol 27 (1) ◽  
pp. 104-107 ◽  
Author(s):  
Thomas R. Edman ◽  
Stephen V. Metz

Real-time speech digitizing technologies underlie such modern communications products as voice store-and-forward systems and digital PBXs. Among the human factors design issues associated with this technology, three of particular importance can be identified: i) speaker identifiability, ii) acceptability of speech quality, and iii) speech intelligibility. An experimental method for addressing issues of identifiability and intelligibility was developed and used to compare a commercial speech digitizing device with a standard toll quality telephone channel. The identifiability and acceptability of the telephone channel were found to be slightly superior to those of the digitized speech. Additionally, results on a Modified Rhyme Test (MRT) showed intelligibility scores somewhat below optimal.

