An Analysis of the Impact of Playout Delay Adjustments introduced by VoIP Jitter Buffers on Listening Speech Quality

Peter Počta; Hugh Melvin; Andrew Hines

doi:10.3813/aaa.918857

Impact of Background Traffic on Speech Quality in VoWLAN

Advances in Multimedia ◽

10.1155/2007/57423 ◽

2007 ◽

Vol 2007 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Peter Počta ◽

Peter Kortiš ◽

Martin Vaculík

Keyword(s):

Data Transfer ◽

Speech Quality ◽

Point Of View ◽

Telecommunication Networks ◽

Traffic Load ◽

Critical Conditions ◽

Test Sequence ◽

Perceptual Evaluation ◽

Background Traffic ◽

The Impact

This paper describes measurements of the impact of background traffic on speech quality in IEEE 802.11 WLANs. The simulated background traffic consists of three types of traffic common in current telecommunication networks: data transfer, multimedia streaming, and Web services. The background traffic was generated with the Distributed Internet Traffic Generator (D-ITG). The aim of the paper is to determine the impact of these traffic types and of the traffic load on speech quality, using a test sequence and speech sequences. Speech quality is assessed with the Perceptual Evaluation of Speech Quality (PESQ) algorithm. The paper also proposes a new method for improved detection of critical conditions in wireless telecommunication networks from the speech quality point of view. The conclusion outlines a further application of this method in link adaptation algorithms for WLANs. The primary goal of these algorithms is to improve speech quality in VoWLAN connections established over the link concerned.

On the impact of speech intelligibility on speech quality in the context of voice over IP telephony

2014 Sixth International Workshop on Quality of Multimedia Experience (QoMEX) ◽

10.1109/qomex.2014.6982292 ◽

2014 ◽

Cited By ~ 1

Author(s):

Falk Schiffner ◽

Janto Skowronek ◽

Alexander Raake

Keyword(s):

Speech Intelligibility ◽

Voice Over IP ◽

Speech Quality ◽

IP Telephony ◽

The Impact

Assessment of Conversational Speech Quality Inside Vehicles, Concerning Influences of Room Acoustics and Driving Noises

Acta Acustica united with Acustica ◽

10.3813/aaa.918530 ◽

2012 ◽

Vol 98 (3) ◽

pp. 461-474 ◽

Cited By ~ 1

Author(s):

Oliver Jung

Keyword(s):

Transfer Functions ◽

Strong Relationship ◽

Speech Quality ◽

Room Acoustics ◽

Conversational Speech ◽

Acoustical Parameters ◽

Vehicle Interior ◽

Lombard Effect ◽

Speech Spectrum ◽

The Impact

This study considers the influence of room acoustics and driving noise in vehicle interiors on the subjectively perceived acoustical quality of conversations between passengers. A listening test with 25 participants was performed in a laboratory to assess the impact of different vehicle interior transfer functions on speech quality in four predetermined dimensions. Idealized driving noises at three different vehicle speeds were presented simultaneously with speech samples to quantify the interference of these noise conditions at varied signal-to-noise ratios. To minimize the influence of different human speakers, four talkers (two male and two female) were selected from commercially available audio books. The respective speech samples were adjusted in level and long-term average speech spectrum to the common values of conversational speech. The automatic reflex of raising one's voice in noisy environments, called “Lombard Effect” [1], was taken into account through an additional adjustment of speech levels while driving noises were present. A strong relationship between the speech-to-noise ratio and the test participants' evaluations was found. Thus, one can assume that the attenuation or amplification of the speech signals caused by the different room acoustics of the tested vehicles plays a more important role for sufficient speech quality than the varied speech timbre or other parameters. Only at very high speech-to-noise ratios (≥ 20 dB with A-weighting) do room-acoustical parameters such as IACC or the reverberation time determine the perceived speech quality more strongly than the sound pressure level of the speech.
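
As a rough illustration of the speech-to-noise ratio mentioned in this abstract, the following minimal sketch computes the level difference in dB between a speech signal and a noise signal from their RMS amplitudes. The function names and the toy signals are hypothetical, and no A-weighting is applied here, so this is not the study's measurement procedure.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def speech_to_noise_ratio_db(speech, noise):
    """Level difference between speech and noise in dB (unweighted)."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


# Hypothetical toy signals: the speech amplitude is ten times the noise amplitude.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
snr_db = speech_to_noise_ratio_db(speech, noise)
print(f"speech-to-noise ratio: {snr_db:.1f} dB")
```

An amplitude ratio of ten corresponds to about 20 dB, the threshold quoted in the abstract. A weighted measurement would filter both signals before the RMS step.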

Factors Affecting the Accessibility of Voice Telephony for People with Hearing Loss: Audio Encoding, Network Impairments, Video and Environmental Noise

ACM Transactions on Accessible Computing ◽

10.1145/3479160 ◽

2021 ◽

Vol 14 (4) ◽

pp. 1-35

Author(s):

Linda Kozma-Spytek ◽

Christian Vogler

Keyword(s):

Hearing Loss ◽

Speech Recognition ◽

Packet Loss ◽

Mental Effort ◽

Quality Parameters ◽

Speech Quality ◽

Audio Quality ◽

Presentation Modes ◽

Quality Ratings ◽

The Impact

This paper describes four studies with a total of 114 individuals with hearing loss and 12 hearing controls that investigate the impact of audio quality parameters on voice telecommunications. These studies were first informed by a survey of 439 individuals with hearing loss on their voice telecommunications experiences. While voice telephony was very important, with high usage of wireless mobile phones, respondents reported relatively low satisfaction with their hearing devices' performance for telephone listening, noting that improved telephone audio quality was a significant need. The studies cover three categories of audio quality parameters: (1) narrowband (NB) versus wideband (WB) audio; (2) encoding audio at varying bit rates, from typical rates used in today's mobile networks to the highest quality supported by these audio codecs; and (3) packet loss, from none to the worst case, in both mobile and VoIP networks. Additionally, NB versus WB audio was tested in auditory-only and audiovisual presentation modes and in quiet and noisy environments. With WB audio in a quiet environment, individuals with hearing loss exhibited better speech recognition, expended less perceived mental effort, and rated speech quality higher than with NB audio. WB audio provided a greater benefit when listening alone than when the visual channel was also available. The noisy environment significantly degraded performance for both presentation modes, but particularly for listening alone. Bit rate affected speech recognition for NB audio, and speech quality ratings for both NB and WB audio. Packet loss affected speech recognition, mental effort, and speech quality ratings alike. WB versus NB audio also affected hearing individuals, especially under packet loss. These results are discussed in terms of the practical steps they suggest for the implementation of telecommunications systems and related technical standards and policy considerations to improve the accessibility of voice telephony for people with hearing loss.

Experimental results on the impact of cell delay variation on speech quality in ATM networks

ICC '98. 1998 IEEE International Conference on Communications. Conference Record. Affiliated with SUPERCOMM'98 (Cat. No.98CH36220) ◽

10.1109/icc.1998.682902 ◽

2002 ◽

Author(s):

Bin Li ◽

Xi-Ren Cao

Keyword(s):

Speech Quality ◽

Experimental Results ◽

ATM Networks ◽

The Impact ◽

Cell Delay ◽

Delay Variation

It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality

10.21437/interspeech.2010-412 ◽

2010 ◽

Author(s):

Sebastian Egger ◽

Raimund Schatz ◽

Stefan Scherer

Keyword(s):

Speech Quality ◽

The Impact

Evaluation of digital watermarking on subjective speech quality

Scientific Reports ◽

10.1038/s41598-021-99811-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yann Kowalczuk ◽

Jan Holub

Keyword(s):

Social Background ◽

Speech Quality ◽

Speech Compression ◽

Audio Quality ◽

Speech Watermarking ◽

Subjective Testing ◽

Listening Tests ◽

Subjective Assessments ◽

Audio Files ◽

The Impact

New methods of securing the distribution of audio content have been widely deployed in the last twenty years. Their impact on perceived quality has, however, seldom been the subject of recent extensive research. We review the state of the art in digital speech watermarking and provide subjective testing of watermarked speech samples. The latest speech watermarking techniques are listed, with their specifics and potential for further development. Their current and possible applications are evaluated. Open-source software designed to embed watermarking patterns in audio files is used to produce a set of samples that satisfies the requirements of modern subjective speech-quality assessments. The patchwork algorithm implemented in the application is the main focus of this analysis. Different watermark robustness levels are used, which allows the threshold of detection for human listeners to be determined. The subjective listening tests are conducted following ITU-T Recommendation P.800, which precisely defines the conditions and requirements for subjective testing. Further analysis tries to determine the effects of noise and various disturbances on the perceived quality of watermarked speech. A threshold of intelligibility is estimated to open the way for further work on speech compression techniques with watermarking. The impact of language or social background is evaluated through an additional experiment involving two groups of listeners. Results show significant robustness of the watermarking implementation, which retains both reasonable subjective audio quality and its security attributes despite mild levels of distortion and noise. Extended experiments with Chinese listeners make it possible to formulate a hypothesis on perception variations with geographical and social backgrounds.

Predicting the Quality of Synthesized and Natural Speech Impaired by Packet Loss and Coding Using PESQ and P.563 Models

Acta Acustica united with Acustica ◽

10.3813/aaa.918465 ◽

2011 ◽

Vol 97 (5) ◽

pp. 852-868 ◽

Cited By ~ 7

Author(s):

Peter Počta ◽

Jan Holub

Keyword(s):

Packet Loss ◽

Speech Quality ◽

The Other ◽

Natural Speech ◽

Text To Speech ◽

Synthesized Speech ◽

Subjective Assessments ◽

Almost All ◽

The Impact

This paper investigates the impact of independent and dependent losses and of coding on speech quality predictions provided by the PESQ (also known as ITU-T P.862) and P.563 models, when both naturally-produced and synthesized speech are used. Two synthesized speech samples generated with two different Text-to-Speech systems and one naturally-produced sample are investigated. In addition, we assess the variability of PESQ's and P.563's predictions with respect to the type of speech used (naturally-produced or synthesized) and the loss conditions, as well as their accuracy, by comparing the predictions with subjective assessments. The results show that there is no difference between the impact of packet loss on naturally-produced speech and on synthesized speech. On the other hand, the impact of coding is different for the two types of stimuli. In addition, synthesized speech seems to be insensitive to degradations introduced by most of the codecs investigated here. The reasons for those findings are discussed. Finally, it is concluded that both models are capable of predicting the quality of transmitted synthesized speech under the investigated conditions to a certain degree. As expected, PESQ achieves the best performance over almost all of the investigated conditions.
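
The abstract does not say which figures of merit were used to compare model predictions with subjective assessments. As an assumption, the sketch below uses two common choices, the Pearson correlation coefficient and the root-mean-square error (RMSE), on hypothetical MOS values that are not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical per-condition scores (illustrative only, not from the paper).
subjective_mos = [4.2, 3.6, 2.9, 2.1, 1.5]
predicted_mos = [4.0, 3.7, 3.0, 2.4, 1.6]

r = pearson(subjective_mos, predicted_mos)
error = rmse(subjective_mos, predicted_mos)
print(f"Pearson r = {r:.3f}, RMSE = {error:.3f} MOS")
```

A high correlation with a low RMSE would indicate that a model tracks the subjective scores both in ranking and in absolute value. The two measures can disagree, which is why both are often reported.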
