The Relation between Vowel Recognition and Measures of Frequency Resolution

1989 ◽  
Vol 32 (1) ◽  
pp. 49-58 ◽  
Author(s):  
Christopher W. Turner ◽  
Carol C. Henn

The purpose of this study was to employ measures of frequency resolution obtained from individual subjects to predict each subject's vowel recognition performance. Input filter patterns at six test frequencies were obtained from normal-hearing and hearing-impaired subjects. These patterns were used to correlate frequency resolution with vowel recognition in those same subjects. Vowels were presented at levels at which the entire spectrum was fully audible to each subject. Using each subject's measured filter characteristics (and interpolated values for intermediate frequencies), an "internal spectrum" of each vowel was calculated by determining the outputs of all filter channels for the vowel as the input signal. It was speculated that the more similar two internal spectra for a subject were, the more often they would be confused in the vowel recognition task. This expectation received some support when the measure of similarity was a point-by-point Euclidean distance between the two internal spectra. Stronger support was obtained when the measure of similarity was based upon Klatt's (1982) "weighted slope metric" that emphasizes similarities of spectral peak locations. The present study demonstrates a relation between impairments of frequency resolution and vowel recognition. The described filter-bank model of vowel recognition suggests that measures of frequency resolution along with the acoustic spectra of vowel stimuli may be useful in predicting the recognition of vowels by individuals.
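The filter-bank idea in this abstract can be illustrated with a small sketch. It is not the authors' model: the Gaussian filter shape, the bandwidth values, the toy vowel spectra, and all function names are assumptions chosen only to show how broader filters make two "internal spectra" more alike under a point-by-point Euclidean distance. Klatt's weighted slope metric is not implemented here.

```python
import math

def gaussian_filter_weight(freq_hz, center_hz, bandwidth_hz):
    # Hypothetical filter shape (assumption): Gaussian weighting around the centre.
    return math.exp(-0.5 * ((freq_hz - center_hz) / bandwidth_hz) ** 2)

def internal_spectrum(spectrum, centers_hz, bandwidths_hz):
    """Output level in dB of each filter channel for a (frequency, power) spectrum."""
    outputs = []
    for center, bandwidth in zip(centers_hz, bandwidths_hz):
        power = sum(p * gaussian_filter_weight(f, center, bandwidth)
                    for f, p in spectrum)
        outputs.append(10.0 * math.log10(power + 1e-12))
    return outputs

def euclidean_distance(a, b):
    """Point-by-point Euclidean distance between two internal spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def toy_vowel(peak_frequencies_hz):
    # Toy spectrum (assumption): flat components with two strong "formant" peaks.
    return [(f, 100.0 if f in peak_frequencies_hz else 1.0)
            for f in range(250, 4001, 250)]

centers = [500, 1000, 1500, 2000, 2500, 3000]
vowel_a = toy_vowel({500, 1500})
vowel_b = toy_vowel({500, 2500})

# Sharp filters stand in for good frequency resolution, broad filters for poor.
sharp = [100.0] * len(centers)
broad = [1000.0] * len(centers)

d_sharp = euclidean_distance(internal_spectrum(vowel_a, centers, sharp),
                             internal_spectrum(vowel_b, centers, sharp))
d_broad = euclidean_distance(internal_spectrum(vowel_a, centers, broad),
                             internal_spectrum(vowel_b, centers, broad))
print(d_sharp, d_broad)
```

In this toy setting the distance shrinks when the filters are broadened, which matches the abstract's reasoning that more similar internal spectra should be confused more often.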

Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 1007
Author(s):  
Chi Xu ◽  
Yunkai Jiang ◽  
Jun Zhou ◽  
Yi Liu

Hand gesture recognition and hand pose estimation are two closely correlated tasks. In this paper, we propose a deep-learning-based approach that jointly learns an intermediate-level shared feature for these two tasks, so that the hand gesture recognition task can benefit from the hand pose estimation task. In the training process, a semi-supervised training scheme is designed to address the lack of proper annotation. Our approach detects the foreground hand, recognizes the hand gesture, and estimates the corresponding 3D hand pose simultaneously. To evaluate the hand gesture recognition performance of state-of-the-art methods, we propose a challenging hand gesture recognition dataset collected in unconstrained environments. Experimental results show that the gesture recognition accuracy of our approach is significantly boosted by leveraging the knowledge learned from the hand pose estimation task.


2013 ◽  
Vol 333-335 ◽  
pp. 1106-1109
Author(s):  
Wei Wu

Palm vein pattern recognition is one of the newest biometric techniques under research today. This paper proposes to project the palm vein image matrix directly using independent component analysis, calculate the Euclidean distance between the projection matrices, and classify each sample by the nearest distance. The experiment was carried out on a self-built palm vein database. Experimental results show that independent component analysis is suitable for palm vein recognition and that the recognition performance is practical.
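The matching step described here, projecting an image matrix and classifying by the nearest Euclidean distance, can be sketched as follows. This is only an illustration: the basis matrix is a hand-written placeholder standing in for one learned by independent component analysis, and the function names, labels, and toy 2x2 "images" are hypothetical.

```python
import math

def project(image, basis):
    """Multiply an image matrix (rows x cols) by a basis matrix (cols x k)."""
    return [[sum(row[j] * basis[j][c] for j in range(len(basis)))
             for c in range(len(basis[0]))]
            for row in image]

def matrix_distance(a, b):
    """Euclidean (Frobenius) distance between two projection matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def nearest_label(probe_projection, gallery):
    """Return the label of the gallery projection closest to the probe."""
    return min(gallery,
               key=lambda item: matrix_distance(probe_projection, item[1]))[0]

# Placeholder basis (assumption): in practice this would come from an ICA routine.
basis = [[1.0], [-1.0]]

gallery = [
    ("palm_A", project([[1.0, 0.0], [1.0, 0.0]], basis)),
    ("palm_B", project([[0.0, 1.0], [0.0, 1.0]], basis)),
]

probe = project([[0.9, 0.1], [0.8, 0.2]], basis)
predicted = nearest_label(probe, gallery)
print(predicted)
```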


2005 ◽  
Vol 36 (3) ◽  
pp. 219-229 ◽  
Author(s):  
Peggy Nelson ◽  
Kathryn Kohnert ◽  
Sabina Sabur ◽  
Daniel Shaw

Purpose: Two studies were conducted to investigate the effects of classroom noise on attention and speech perception in native Spanish-speaking second graders learning English as their second language (L2) as compared to English-only-speaking (EO) peers. Method: Study 1 measured children’s on-task behavior during instructional activities with and without soundfield amplification. Study 2 measured the effects of noise (+10 dB signal-to-noise ratio) using an experimental English word recognition task. Results: Findings from Study 1 revealed no significant condition (pre/postamplification) or group differences in observations in on-task performance. Main findings from Study 2 were that word recognition performance declined significantly for both L2 and EO groups in the noise condition; however, the impact was disproportionately greater for the L2 group. Clinical Implications: Children learning in their L2 appear to be at a distinct disadvantage when listening in rooms with typical noise and reverberation. Speech-language pathologists and audiologists should collaborate to inform teachers, help reduce classroom noise, increase signal levels, and improve access to spoken language for L2 learners.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Muhammad Sajid ◽  
Nouman Ali ◽  
Saadat Hanif Dar ◽  
Naeem Iqbal Ratyal ◽  
Asif Raza Butt ◽  
...  

Recently, face datasets containing celebrity photos with facial makeup have been growing at exponential rates, making their recognition very challenging. Existing face recognition methods rely on feature extraction and reference reranking to improve performance. However, face images with facial makeup carry inherent ambiguity due to artificial colors, shading, contouring, and varying skin tones, making the recognition task more difficult. The problem becomes more confounded as makeup alters the bilateral size and symmetry of certain face components, such as the eyes and lips, affecting the distinctiveness of faces. The ambiguity becomes even worse when different days bring different facial makeup for celebrities, owing to the context of interpersonal situations and current societal makeup trends. To cope with these artificial effects, we propose to use a deep convolutional neural network (dCNN) with an augmented face dataset to extract discriminative features from face images containing synthetic makeup variations. The augmented dataset, containing original face images and those with synthetic makeup variations, allows the dCNN to learn face features under a variety of facial makeup. We also evaluate the role of partial and full makeup in face images to improve the recognition performance. The experimental results on two challenging face datasets show that the proposed approach can compete with the state of the art.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jorge Oliveira ◽  
Marta Fernandes ◽  
Pedro J. Rosa ◽  
Pedro Gamito

Research on pupillometry provides increasing evidence for associations between pupil activity and memory processing. The most consistent finding is an increase in pupil size for old items compared with novel items, suggesting that pupil activity is associated with the strength of the memory signal. However, the time course of these changes is not completely known, particularly when items are presented in a running recognition task that maximizes interference by requiring the recognition of the most recent items from a sequence of old/new items. The sample comprised 42 healthy participants who performed a visual word recognition task under varying conditions of retention interval. Recognition responses were evaluated using behavioral variables for discrimination accuracy, reaction time, and confidence in recognition decisions. Pupil activity was recorded continuously during the entire experiment. The results suggest a decrease in recognition performance with increasing study-test retention interval. Pupil size decreased across retention intervals, while pupil old/new effects were found only for words recognized at the shortest retention interval. Pupillary responses consisted of a pronounced early pupil constriction at retrieval under longer study-test lags corresponding to weaker memory signals. However, pupil size was also sensitive to the subjective feeling of familiarity, as shown by pupil dilation to false alarms (new items judged as old). These results suggest that pupil size is related not only to the strength of the memory signal but also to subjective familiarity decisions in a continuous recognition memory paradigm.


2020 ◽  
Author(s):  
Volkan Nurdal ◽  
Graeme Fairchild ◽  
George Stothart

Introduction: The development of rapid and reliable neural measures of memory is an important goal of cognitive neuroscience research and clinical practice. Fast Periodic Visual Stimulation (FPVS) is a recently developed electroencephalography (EEG) method that involves presenting a mix of novel and previously learnt stimuli at a fast rate. Recent work has shown that implicit recognition memory can be measured using FPVS; however, the role of repetition priming remains unclear. Here, we attempted to separate out the effects of recognition memory and repetition priming by manipulating the degree of repetition of the stimuli to be remembered. Method: Twenty-two participants with a mean age of 20.8 (±4.3) yrs completed an FPVS-oddball paradigm with a varying number of repetitions of the oddball stimuli, ranging from repetition only (pure repetition) to no repetition (pure recognition). In addition to the EEG task, participants completed a behavioural recognition task and visual memory subtests from the Wechsler Memory Scale – 4th edition (WMS-IV). Results: An oddball memory response was observed in all four experimental conditions (pure repetition to pure recognition) compared to the control condition (no oddball stimuli). The oddball memory response was largest in the pure repetition condition and smaller, but still significant, in conditions with less/no oddball repetition (e.g. pure recognition). Behavioural recognition performance was at ceiling, suggesting that all images were encoded successfully. There was no correlation with either behavioural memory performance or WMS-IV scores, suggesting the FPVS-oddball paradigm captures different memory processes than behavioural measures. Conclusion: Repetition priming significantly modulates the FPVS recognition memory response; however, recognition is still detectable even in the total absence of repetition priming. The FPVS-oddball paradigm could potentially be developed into an objective and easy-to-administer memory assessment tool.


2016 ◽  
Vol 2016 ◽  
pp. 1-13 ◽  
Author(s):  
Shibli Nisar ◽  
Omar Usman Khan ◽  
Muhammad Tariq

The Short-Time Fourier Transform (STFT) is an important technique for the time-frequency analysis of a time-varying signal. The basic approach behind it involves applying a Fast Fourier Transform (FFT) to a signal multiplied by an appropriate window function with fixed resolution. Selecting an appropriate window size is difficult when no background information about the input signal is known. In this paper, a novel empirical model is proposed that adaptively adjusts the window size for a narrow-band signal using a spectrum sensing technique. For wide-band signals, where a fixed time-frequency resolution is undesirable, the approach adapts the constant Q transform (CQT). Unlike the STFT, the CQT provides a varying time-frequency resolution. This results in high spectral resolution at low frequencies and high temporal resolution at high frequencies. In this paper, a simple but effective switching framework between the STFT and the CQT is provided. The proposed method also allows for the dynamic construction of a filter bank according to user-defined parameters. This helps in reducing redundant entries in the filter bank. Results obtained from the proposed method not only improve the spectrogram visualization but also reduce the computation cost, and the method selects the appropriate window length in 87.71% of cases.
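The contrast between the fixed resolution of the STFT and the varying resolution of the CQT can be made concrete with the standard textbook relations: an STFT with window length N has bin spacing fs/N everywhere, while a constant Q transform with b bins per octave uses Q = 1/(2^(1/b) - 1) and a window of about Q·fs/f_k samples at centre frequency f_k. The sketch below only tabulates these quantities. It does not reproduce the paper's adaptive window selection or switching framework, and the function names and parameter values are illustrative assumptions.

```python
import math

def stft_resolution_hz(sample_rate, window_length):
    """Frequency spacing of STFT bins; identical for every bin."""
    return sample_rate / window_length

def cqt_bins(f_min, f_max, bins_per_octave, sample_rate):
    """Return Q and a list of (centre_hz, window_samples, bandwidth_hz) per bin."""
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    k = 0
    while True:
        f_k = f_min * 2.0 ** (k / bins_per_octave)  # geometrically spaced centres
        if f_k > f_max:
            break
        window_samples = math.ceil(q * sample_rate / f_k)  # longer at low f_k
        bins.append((f_k, window_samples, f_k / q))        # bandwidth grows with f_k
        k += 1
    return q, bins

# Illustrative values (assumptions): 8 kHz sampling, 12 bins per octave.
sample_rate = 8000
q, bins = cqt_bins(f_min=55.0, f_max=1760.0, bins_per_octave=12,
                   sample_rate=sample_rate)

fixed = stft_resolution_hz(sample_rate, 1024)
lowest, highest = bins[0], bins[-1]
print(fixed, lowest, highest)
```

The lowest CQT bin gets a long window and a narrow bandwidth, while the highest bin gets a short window and a wide bandwidth, which is the trade-off the abstract describes.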


Author(s):  
Mohammad Farhad Bulbul ◽  
Yunsheng Jiang ◽  
Jinwen Ma

Emerging cost-effective depth sensors have facilitated the action recognition task significantly. In this paper, the authors address the action recognition problem using depth video sequences by combining three discriminative features. More specifically, the authors generate three Depth Motion Maps (DMMs) over the entire video sequence corresponding to the front, side, and top projection views. Contourlet-based Histogram of Oriented Gradients (CT-HOG), Local Binary Patterns (LBP), and Edge Oriented Histograms (EOH) are then computed from the DMMs. To merge these features, the authors consider decision-level fusion, where a soft decision-fusion rule, Logarithmic Opinion Pool (LOGP), is used to combine the classification outcomes from multiple classifiers, each with an individual set of features. Experimental results on two datasets reveal that the fusion scheme achieves superior action recognition performance compared with using each feature individually.
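A logarithmic opinion pool combines per-classifier posterior probabilities as a weighted product, which is a weighted sum in the log domain, followed by normalization. The sketch below assumes uniform weights and made-up posteriors for three classifiers, standing in for ones trained on CT-HOG, LBP, and EOH features. The function name and all numbers are illustrative and are not taken from the paper.

```python
import math

def logp_fusion(posteriors, weights=None):
    """Fuse per-classifier class posteriors with a logarithmic opinion pool.

    posteriors: one list of class probabilities per classifier.
    weights: one weight per classifier; uniform 1/Q if omitted (assumption).
    """
    n_classifiers = len(posteriors)
    if weights is None:
        weights = [1.0 / n_classifiers] * n_classifiers
    n_classes = len(posteriors[0])
    eps = 1e-12  # guards against log(0)
    log_scores = [sum(w * math.log(p[c] + eps)
                      for w, p in zip(weights, posteriors))
                  for c in range(n_classes)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Made-up posteriors over three action classes from three classifiers.
posteriors = [
    [0.50, 0.30, 0.20],
    [0.20, 0.50, 0.30],
    [0.40, 0.45, 0.15],
]
fused = logp_fusion(posteriors)
predicted_class = fused.index(max(fused))
print(fused, predicted_class)
```

Because the pool multiplies probabilities, a class that any single classifier rates as very unlikely is penalized heavily, unlike simple averaging.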


2017 ◽  
Vol 23 (1) ◽  
pp. 69-86 ◽  
Author(s):  
Steffen A. Herff ◽  
Daniela Czernochowski

When attention is divided during memory encoding, performance tends to suffer. The nature of this performance decrement, however, is domain-dependent and often governed by domain-specific expertise. In this study, 111 participants with differing levels of musical expertise (professional musicians, amateur musicians, and non-musicians) were presented with novel melodies under full- or divided-attention conditions in a continuous melody-recognition task. As hypothesized, melody recognition was modulated by musical expertise, as greater expertise was associated with better performance. Recognition performance increased with every additional presentation of a target melody. The divided-attention condition required performing a non-music-related digit-monitoring task while listening to the melodies. Memory performance decreased universally in all groups in the divided-attention condition; however, intriguingly, musicians also performed significantly better in the concurrent digit-monitoring task than non-musicians. Results provide insight into the role of expertise, attention, and memory in the musical domain, and are discussed in terms of attentional resource models. In light of resource models, an asymmetrical non-linear trade-off between two simultaneous tasks is proposed to explain the present findings.


Perception ◽  
10.1068/p5637 ◽  
2007 ◽  
Vol 36 (9) ◽  
pp. 1334-1352 ◽  
Author(s):  
Simone K Favelle ◽  
Stephen Palmisano ◽  
Ryan T Maloney

Previous research into the effects of viewpoint change on face recognition has typically dealt with rotations around the head's vertical axis (yaw). Another common, although less studied, source of viewpoint variation in faces is rotation around the head's horizontal axis (pitch). In the current study, we used both a sequential matching task and an old/new recognition task to examine the effect of viewpoint change following rotation about both pitch and yaw axes on human face recognition. The results of both tasks showed that recognition performance was better for faces rotated about yaw compared to pitch. Further, recognition performance for faces rotated upwards on the pitch axis was better than for faces rotated downwards. Thus, equivalent angular rotations about pitch and yaw do not produce equivalent viewpoint-dependent declines in recognition performance.

