Evaluating Low-Level Speech Features Against Human Perceptual Data

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00071 ◽

2017 ◽

Vol 5 ◽

pp. 425-440 ◽

Cited By ~ 2

Author(s):

Caitlin Richter ◽

Naomi H. Feldman ◽

Harini Salgado ◽

Aren Jansen

Keyword(s):

Speech Perception ◽

Cognitive Model ◽

Human Perception ◽

Speaker Normalization ◽

Low Level ◽

Human Data ◽

Normalization Methods ◽

Speech Features ◽

New Framework ◽

Better Than

We introduce a method for measuring the correspondence between low-level speech features and human perception, using a cognitive model of speech perception implemented directly on speech recordings. We evaluate two speaker normalization techniques using this method and find that in both cases, speech features that are normalized across speakers predict human data better than unnormalized speech features, consistent with previous research. Results further reveal differences across normalization methods in how well each predicts human data. This work provides a new framework for evaluating low-level representations of speech on their match to human perception, and lays the groundwork for creating more ecologically valid models of speech perception.

Download Full-text

Electrophysiological signatures of hierarchical learning

10.1101/2021.03.09.434666 ◽

2021 ◽

Author(s):

Meng Liu ◽

Wenshan Dong ◽

Shaozheng Qin ◽

Tom Verguts ◽

Qi Chen

Keyword(s):

Cognitive Process ◽

Human Perception ◽

Learning Task ◽

Prediction Errors ◽

Neural Basis ◽

Hierarchical Learning ◽

Low Level ◽

High Level ◽

The Brain ◽

Better Than

AbstractHuman perception and learning is thought to rely on a hierarchical generative model that is continuously updated via precision-weighted prediction errors (pwPEs). However, the neural basis of such cognitive process and how it unfolds during decision making, remain poorly understood. To investigate this question, we combined a hierarchical Bayesian model (i.e., Hierarchical Gaussian Filter, HGF) with electrophysiological (EEG) recording, while participants performed a probabilistic reversal learning task in alternatingly stable and volatile environments. Behaviorally, the HGF fitted significantly better than two control, non-hierarchical, models. Neurally, low-level and high-level pwPEs were independently encoded by the P300 component. Low-level pwPEs were reflected in the theta (4-8 Hz) frequency band, but high-level pwPEs were not. Furthermore, the expressions of high-level pwPEs were stronger for participants with better HGF fit. These results indicate that the brain employs hierarchical learning, and encodes both low- and high-level learning signals separately and adaptively.

Download Full-text

Exploring The Effect of Visual Information Degradation on Human Perception and Performance In A Human-Telerobot System

Proceedings of the Human Factors and Ergonomics Society Annual Meeting ◽

10.1177/1071181320641310 ◽

2020 ◽

Vol 64 (1) ◽

pp. 1302-1307

Author(s):

Richard Stone ◽

Minglu Wang ◽

Thomas Schnieders ◽

Esraa Abdelall

Keyword(s):

Visual Information ◽

Human Perception ◽

Low Level ◽

Hit Rate ◽

Word Level ◽

Level Of Automation ◽

Significant Difference ◽

Human Operators ◽

And Performance ◽

High Level

Human-robotic interaction system are increasingly becoming integrated into industrial, commercial and emergency service agencies. It is critical that human operators understand and trust automation when these systems support and even make important decisions. The following study focused on human-in-loop telerobotic system performing a reconnaissance operation. Twenty-four subjects were divided into groups based on level of automation (Low-Level Automation (LLA), and High-Level Automation (HLA)). Results indicated a significant difference between low and high word level of control in hit rate when permanent error occurred. In the LLA group, the type of error had a significant effect on the hit rate. In general, the high level of automation was better than the low level of automation, especially if it was more reliable, suggesting that subjects in the HLA group could rely on the automatic implementation to perform the task more effectively and more accurately.

Download Full-text

Acoustic-Prosodic and Lexical Cues to Deception and Trust: Deciphering How People Detect Lies

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00311 ◽

2020 ◽

Vol 8 ◽

pp. 199-214

Author(s):

Xi (Leslie) Chen ◽

Sarah Ita Levitan ◽

Michelle Levine ◽

Marko Mandic ◽

Julia Hirschberg

Keyword(s):

Deception Detection ◽

Human Perception ◽

Lie Detection ◽

Challenging Problem ◽

Cues To Deception ◽

Lexical Cues ◽

Insight Into ◽

Large Corpus ◽

Better Than

Humans rarely perform better than chance at lie detection. To better understand human perception of deception, we created a game framework, LieCatcher, to collect ratings of perceived deception using a large corpus of deceptive and truthful interviews. We analyzed the acoustic-prosodic and linguistic characteristics of language trusted and mistrusted by raters and compared these to characteristics of actual truthful and deceptive language to understand how perception aligns with reality. With this data we built classifiers to automatically distinguish trusted from mistrusted speech, achieving an F1 of 66.1%. We next evaluated whether the strategies raters said they used to discriminate between truthful and deceptive responses were in fact useful. Our results show that, although several prosodic and lexical features were consistently perceived as trustworthy, they were not reliable cues. Also, the strategies that judges reported using in deception detection were not helpful for the task. Our work sheds light on the nature of trusted language and provides insight into the challenging problem of human deception detection.

Download Full-text

Deviations of acoustic low-level descriptors in speech features of a set of triplets, one with autism

2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2018.8513289 ◽

2018 ◽

Cited By ~ 1

Author(s):

Hasini Yatawatte ◽

Christian Poellabauer ◽

Sandra Schneider ◽

Susan Latham

Keyword(s):

Low Level ◽

Speech Features

Download Full-text

Saliency Detection by Multilevel Deep Pyramid Model

Journal of Sensors ◽

10.1155/2018/8249180 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Hai Wang ◽

Lei Dai ◽

Yingfeng Cai ◽

Long Chen ◽

Yong Zhang

Keyword(s):

Background Noise ◽

State Of The Art ◽

Saliency Detection ◽

Saliency Map ◽

Multiple Features ◽

Low Level ◽

Pyramid Model ◽

High Level ◽

Different Levels ◽

Better Than

Traditional salient object detection models are divided into several classes based on low-level features and contrast between pixels. In this paper, we propose a model based on a multilevel deep pyramid (MLDP), which involves fusing multiple features on different levels. Firstly, the MLDP uses the original image as the input for a VGG16 model to extract high-level features and form an initial saliency map. Next, the MLDP further extracts high-level features to form a saliency map based on a deep pyramid. Then, the MLDP obtains the salient map fused with superpixels by extracting low-level features. After that, the MLDP applies background noise filtering to the saliency map fused with superpixels in order to filter out the interference of background noise and form a saliency map based on the foreground. Lastly, the MLDP combines the saliency map fused with the superpixels with the saliency map based on the foreground, which results in the final saliency map. The MLDP is not limited to low-level features while it fuses multiple features and achieves good results when extracting salient targets. As can be seen in our experiment section, the MLDP is better than the other 7 state-of-the-art models across three different public saliency datasets. Therefore, the MLDP has superiority and wide applicability in extraction of salient targets.

Download Full-text

Evaluation of Distributed Collaborative Adaptive Sensing for Detection of Low-Level Circulations and Implications for Severe Weather Warning Operations

Weather and Forecasting ◽

10.1175/2009waf2222233.1 ◽

2010 ◽

Vol 25 (1) ◽

pp. 173-189 ◽

Cited By ~ 21

Author(s):

J. Brotzge ◽

K. Hondl ◽

B. Philips ◽

L. Lemon ◽

E. J. Bass ◽

...

Keyword(s):

Severe Weather ◽

Special Focus ◽

Test Bed ◽

Scanning Strategy ◽

Engineering Research ◽

Low Level ◽

Adaptive Sensing ◽

Low Levels ◽

Casa System ◽

Better Than

Abstract The Center for Collaborative Adaptive Sensing of the Atmosphere (CASA) is a multiyear engineering research center established by the National Science Foundation for the development of small, inexpensive, low-power radars designed to improve the scanning of the lowest levels (<3 km AGL) of the atmosphere. Instead of sensing autonomously, CASA radars are designed to operate as a network, collectively adapting to the changing needs of end users and the environment; this network approach to scanning is known as distributed collaborative adaptive sensing (DCAS). DCAS optimizes the low-level volume coverage scanning and maximizes the utility of each scanning cycle. A test bed of four prototype CASA radars was deployed in southwestern Oklahoma in 2006 and operated continuously while in DCAS mode from March through June of 2007. This paper analyzes three convective events observed during April–May 2007, during CASA’s intense operation period (IOP), with a special focus on evaluating the benefits and weaknesses of CASA radar system deployment and DCAS scanning strategy of detecting and tracking low-level circulations. Data collected from nearby Weather Surveillance Radar-1988 Doppler (WSR-88D) and CASA radars are compared for mesoscyclones, misocyclones, and low-level vortices. Initial results indicate that the dense, overlapping coverage at low levels provided by the CASA radars and the high temporal (60 s) resolution provided by DCAS give forecasters more detailed feature continuity and tracking. Moreover, the CASA system is able to resolve a whole class of circulations—misocyclones—far better than the WSR-88Ds. In fact, many of these are probably missed completely by the WSR-88D. The impacts of this increased detail on severe weather warnings are under investigation. Ongoing efforts include enhancing the DCAS data quality and scanning strategy, improving the DCAS data visualization, and developing a robust infrastructure to better support forecast and warning operations.

Download Full-text

Speaker Normalization in Speech Perception

The Handbook of Speech Perception ◽

10.1002/9780470757024.ch15 ◽

2008 ◽

pp. 363-389 ◽

Cited By ~ 74

Author(s):

Keith Johnson

Keyword(s):

Speech Perception ◽

Speaker Normalization

Download Full-text

Early Warning from Car Warranty Data using a Fuzzy Logic Technique

Scalable Fuzzy Algorithms for Data Management and Analysis ◽

10.4018/978-1-60566-858-1.ch014 ◽

2010 ◽

pp. 347-364

Author(s):

Mark Last ◽

Yael Mendelson ◽

Sugato Chakrabarty ◽

Karishma Batra

Keyword(s):

Early Warning ◽

Cognitive Model ◽

Human Perception ◽

General Motors ◽

Time To Failure ◽

Frequency Distributions ◽

Empirical Distributions ◽

Preventive Actions ◽

Warranty Claims ◽

Fuzzy Logic Technique

Car manufacturers are interested to detect evolving problems in a car fleet as early as possible so they can take preventive actions and deal with the problems before they become widespread. The vast amount of warranty claims recorded by the car dealers makes the manual process of analyzing this data hardly feasible. This chapter describes a fuzzy-based methodology for automated detection of evolving maintenance problems in massive streams of car warranty data. The empirical distributions of time-to-failure and mileage-to-failure are monitored over time using the advanced, fuzzy approach to comparison of frequency distributions. The authors’ fuzzy-based early warning tool builds upon an automated interpretation of the differences between consecutive histogram plots using a cognitive model of human perception rather than “crisp” statistical models. They demonstrate the effectiveness and the efficiency of the proposed tool on warranty data that is very similar to the actual data gathered from a database within General Motors.

Download Full-text

TH 7832CDZ (TH X31513) Bilinear CCD Array for Astronomy Applications

Symposium - International Astronomical Union ◽

10.1017/s0074180900034756 ◽

1988 ◽

Vol 132 ◽

pp. 45-48

Author(s):

J. L. Coutures ◽

G. Boucharlat

Keyword(s):

Spectral Analysis ◽

Dynamic Range ◽

Linear Arrays ◽

New Device ◽

Low Level ◽

Ccd Array ◽

Reading Efficiency ◽

Photodiode Signal ◽

Better Than

Photodiode linear arrays are perfectly adapted for spectral analysis. The TH 7832CDZ bilinear array is a new device specially adapted to low level detection (exposure < 7 nJ/cm2) with a reading efficiency of the photodiode signal better than 97% on all the dynamic range (> 70 dB).

Download Full-text

Representations and representational specificity in speech perception and spoken word recognition

The Oxford Handbook of Psycholinguistics ◽

10.1093/oxfordhb/9780198568971.013.0001 ◽

2007 ◽

pp. 2-18 ◽

Cited By ~ 2

Author(s):

David B. Pisoni ◽

Susannah V. Levi

Keyword(s):

Speech Perception ◽

Word Recognition ◽

Spoken Word Recognition ◽

Spoken Word ◽

Lexical Knowledge ◽

New Approaches ◽

Abstract Representations ◽

New Directions ◽

Learning And Adaptation ◽

New Framework

This article examines how new approaches—coupled with previous insights—provide a new framework for questions that deal with the nature of phonological and lexical knowledge and representation, processing of stimulus variability, and perceptual learning and adaptation. First, it outlines the traditional view of speech perception and identifies some problems with assuming such a view, in which only abstract representations exist. The article then discusses some new approaches to speech perception that retain detailed information in the representations. It also considers a view which rejects abstraction altogether, but shows that such a view has difficulty dealing with a range of linguistic phenomena. After providing a brief discussion of some new directions in linguistics that encode both detailed information and abstraction, the article concludes by discussing the coupling of speech perception and spoken word recognition.

Download Full-text