Children’s referent selection and word learning

2016 ◽  
Vol 17 (1) ◽  
pp. 101-127 ◽  
Author(s):  
Katherine E. Twomey ◽  
Anthony F. Morse ◽  
Angelo Cangelosi ◽  
Jessica S. Horst

Abstract It is well-established that toddlers can correctly select a novel referent from an ambiguous array in response to a novel label. There is also a growing consensus that robust word learning requires repeated label-object encounters. However, the effect of the context in which a novel object is encountered is less well-understood. We present two embodied neural network replications of recent empirical tasks, which demonstrated that the context in which a target object is encountered is fundamental to referent selection and word learning. Our model offers an explicit account of the bottom-up associative and embodied mechanisms which could support children’s early word learning and emphasises the importance of viewing behaviour as the interaction of learning at multiple timescales.
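The bottom-up associative mechanism the model appeals to can be illustrated with a minimal sketch (purely illustrative, not the authors’ embodied neural network): labels and objects are linked by co-occurrence counts, and a novel label is mapped to the least label-associated object. The labels, objects, and update rule below are assumptions made for the sketch.

```python
import numpy as np

# Toy associative account of referent selection.
# Rows = labels, columns = objects.
labels = ["ball", "cup", "dax"]        # "dax" is the novel label
objects = ["ball", "cup", "novel"]     # "novel" is the unfamiliar object

assoc = np.zeros((3, 3))

def expose(label_i, object_j, strength=1.0):
    """Hebbian-style update: co-occurrence strengthens the label-object link."""
    assoc[label_i, object_j] += strength

# Repeated label-object encounters for the two known words.
for _ in range(5):
    expose(0, 0)   # "ball" heard while attending to the ball
    expose(1, 1)   # "cup" heard while attending to the cup

def select_referent(label_i):
    """Pick the object most associated with the label; for an unknown label,
    fall back on the least label-associated (most novel) object."""
    if assoc[label_i].sum() > 0:
        return int(np.argmax(assoc[label_i]))
    return int(np.argmin(assoc.sum(axis=0)))   # mutual-exclusivity-like bias

print(objects[select_referent(2)])  # prints "novel"
```

The fallback branch is what produces correct novel referent selection before any label-object link for "dax" exists; repeated encounters would then consolidate that link.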

2020 ◽  
Vol 63 (1) ◽  
pp. 345-356
Author(s):  
Meital Avivi-Reich ◽  
Megan Y. Roberts ◽  
Tina M. Grieco-Calub

Purpose This study tested the effects of background speech babble on novel word learning in preschool children with a multisession paradigm. Method Eight 3-year-old children were exposed to a total of 8 novel word–object pairs across 2 story books presented digitally. Each story contained 4 novel consonant–vowel–consonant nonwords. Children were exposed to both stories, one in quiet and one in the presence of 4-talker babble presented at 0-dB signal-to-noise ratio. After each story, children's learning was tested with a referent selection task and a verbal recall (naming) task. Children were exposed to and tested on the novel word–object pairs on 5 separate days within a 2-week span. Results A significant main effect of session was found for both referent selection and verbal recall. There was also a significant main effect of exposure condition on referent selection performance, with more referents correctly selected for word–object pairs that were presented in quiet compared to pairs presented in speech babble. Finally, children's verbal recall of novel words was statistically better than baseline performance (i.e., 0%) on Sessions 3–5 for words exposed in quiet, but only on Session 5 for words exposed in speech babble. Conclusions These findings suggest that background speech babble at 0-dB signal-to-noise ratio disrupts novel word learning in preschool-age children. As a result, children may need more time and more exposures of a novel word before they can recognize or verbally recall it.
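The 0-dB signal-to-noise ratio used in the babble condition means the story and the babble are presented at equal power. A minimal sketch of mixing a signal with noise at a target SNR (the signals here are random stand-ins, not the study’s stimuli):

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.standard_normal(16000)   # stand-in for the story narration
babble = rng.standard_normal(16000)   # stand-in for 4-talker babble

def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR in dB."""
    p_sig = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
    return signal + scale * noise

mixture = mix_at_snr(target, babble, snr_db=0.0)
# At 0 dB, the target and the scaled babble carry equal power.
```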


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2280
Author(s):  
Ching-Chang Wong ◽  
Li-Yu Yeh ◽  
Chih-Cheng Liu ◽  
Chi-Yi Tsai ◽  
Hisasuki Aoyama

In this paper, a manipulation planning method for object re-orientation based on semantic segmentation and keypoint detection is proposed for a robot manipulator, which is able to detect randomly placed objects and re-orientate them to a specified position and pose. There are two main parts: (1) a 3D keypoint detection system; and (2) a manipulation planning system for object re-orientation. In the 3D keypoint detection system, an RGB-D camera is used to obtain information about the environment and generate 3D keypoints of the target object as inputs representing its position and pose. This process simplifies the 3D model representation so that manipulation planning for object re-orientation can be executed in a category-level manner by adding varied training data for the object in the training phase. In addition, 3D suction points in both the object’s current and expected poses are generated as inputs to the next operation stage. During that stage, the Mask Region-Convolutional Neural Network (Mask R-CNN) algorithm is used for preliminary object detection and object image extraction. The image with the highest confidence index is selected as the input to the semantic segmentation system, which classifies each pixel in the picture into the corresponding pack unit of the object. After the convolutional neural network performs semantic segmentation, the Conditional Random Fields (CRFs) method runs several iterations to obtain a more accurate object recognition result. Once the target object is segmented into pack units, the center position of each pack unit can be obtained. Then, a normal vector at each pack unit’s center point is generated from the depth image information and the pose of the object, which can be obtained by connecting the center points of the pack units.
In the manipulation planning system for object re-orientation, the pose of the object and the normal vector of each pack unit are first converted into the working coordinate system of the robot manipulator. Then, according to the current and expected poses of the object, the spherical linear interpolation (Slerp) algorithm is used to generate a series of movements in the workspace that re-orientate the object with the robot manipulator. In addition, the pose of the object is adjusted about the z-axis of the object’s geodetic coordinate system based on image features on the object’s surface, so that the pose of the placed object approaches the desired pose. Finally, a robot manipulator and a laboratory-made vacuum suction cup are used to verify that the proposed system can complete the planned object re-orientation task.
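The Slerp step can be sketched as quaternion interpolation between the current and expected orientations (an illustrative implementation, not the authors’ code; the w, x, y, z quaternion convention and the example rotation are assumptions):

```python
import numpy as np

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions."""
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    dot = np.dot(q0, q1)
    if dot < 0.0:            # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:         # nearly parallel: fall back to lerp
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

# Interpolate from identity to a 90-degree rotation about z in 5 steps,
# yielding intermediate orientation waypoints for the manipulator.
q_start = np.array([1.0, 0.0, 0.0, 0.0])                 # w, x, y, z
q_goal = np.array([np.cos(np.pi / 4), 0, 0, np.sin(np.pi / 4)])
waypoints = [slerp(q_start, q_goal, t) for t in np.linspace(0, 1, 5)]
```

Slerp keeps the angular velocity constant along the arc, which is why it is a natural choice for generating a smooth series of re-orientation movements.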


2021 ◽  
Author(s):  
Ibrahim Mohammad Hussain Rahman

<p>The human visual attention (HVA) system encompasses a set of interconnected neurological modules that analyze visual stimuli by attending to the regions that are salient. Two contrasting biological mechanisms exist in the HVA system: bottom-up, data-driven attention and top-down, task-driven attention. The former is mostly responsible for low-level instinctive behaviors, while the latter is responsible for performing complex visual tasks such as target object detection. Very few computational models of top-down attention have been proposed, mainly for three reasons. First, the functionality of the top-down process involves many influential factors. Second, top-down responses vary from task to task. Finally, many biological aspects of the top-down process are not yet well understood. For these reasons, it is difficult to build a generalized top-down model that applies to all high-level visual tasks. Instead, this thesis addresses some outstanding issues in modelling top-down attention for one particular task: target object detection. Target object detection is an essential step in analyzing images before performing more complex visual tasks, yet it has not been investigated thoroughly in top-down saliency modelling and hence constitutes the main application domain for this thesis. The thesis investigates methods to model top-down attention through various kinds of high-level data acquired from images, as well as different strategies to dynamically combine the bottom-up and top-down processes to improve detection accuracy and the computational efficiency of existing and new visual attention models. The following techniques and approaches are proposed to address the outstanding issues in modelling top-down saliency:
1. A top-down saliency model that weights low-level attentional features through contextual knowledge of a scene. The proposed model assigns weights to the features of a novel image by extracting a contextual descriptor of the image, which tunes the weighting of low-level features to maximize detection accuracy. Incorporating context into the feature weighting mechanism improves the quality of the weights assigned to these features.
2. Two modules of target features combined with contextual weighting to improve detection accuracy for the target object. In this model, two sets of attentional feature weights are learned, one through context and the other through target features. When both sources of knowledge are used to model top-down attention, a drastic increase in detection accuracy is achieved in images with complex backgrounds and a variety of target objects.
3. A combination model for top-down and bottom-up attention based on feature interaction. This model combines both processes dynamically by formulating the problem as feature selection. The feature selection exploits the interaction between features, yielding a robust set of features that maximizes both the detection accuracy and the overall efficiency of the system.
4. A feature map quality score estimation model that accurately predicts the detection accuracy score of any previously unseen feature map without the need for ground-truth data. The model extracts various local, global, geometrical, and statistical characteristics from a feature map; these characteristics guide a regression model that estimates the quality of a novel map.
5. A dynamic feature integration framework for combining bottom-up and top-down saliencies at runtime. If the estimation model can accurately predict the quality score of any novel feature map, then dynamic feature map integration can be performed based on the estimated value. We propose two frameworks for feature map integration using the estimation model. The proposed integration framework achieves higher human fixation prediction accuracy with a minimum number of feature maps than is achieved by combining all feature maps.
The work proposed in this thesis provides new directions in modelling top-down saliency for target object detection. In addition, the dynamic approaches to top-down and bottom-up combination show considerable improvements over existing approaches in both efficiency and accuracy.</p>
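The context-weighted combination of low-level feature maps described in the first contribution can be sketched as a normalized weighted sum; the maps, weights, and shapes below are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Three low-level feature maps (e.g., colour, intensity, orientation).
feature_maps = rng.random((3, 32, 32))

def combine(maps, weights):
    """Top-down saliency as a context-weighted sum of feature maps."""
    w = np.asarray(weights, float)
    w = w / w.sum()                       # normalise the weights
    s = np.tensordot(w, maps, axes=1)     # weighted linear combination
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

# A contextual descriptor would supply these weights; here they are fixed.
saliency = combine(feature_maps, weights=[0.6, 0.3, 0.1])
```

In the thesis’s framing, the interesting part is how the weights are obtained (from context, target features, or estimated map quality); the combination itself stays a simple weighted sum like this one.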


Author(s):  
Caifeng Liu ◽  
Lin Feng ◽  
Guochao Liu ◽  
Huibing Wang ◽  
Shenglan Liu

Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 210 ◽  
Author(s):  
Yi-Chun Du ◽  
Muslikhin Muslikhin ◽  
Tsung-Han Hsieh ◽  
Ming-Shyan Wang

This paper develops a hybrid algorithm combining an adaptive network-based fuzzy inference system (ANFIS) and regions with convolutional neural network features (R-CNN) for stereo vision-based object recognition and manipulation. The stereo camera, in an eye-to-hand configuration, first captures an image of the target object. Then, the shape, features, and centroid of the object are estimated: similar pixels are segmented by the image segmentation method, and similar regions are merged through selective search. The eye-to-hand calibration is based on ANFIS to reduce the computing burden. A six-degree-of-freedom (6-DOF) robot arm with a gripper conducts experiments to demonstrate the effectiveness of the proposed system.
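The centroid estimation step can be illustrated with first-order image moments on a binary segmentation mask (a toy sketch, not the paper’s pipeline; the mask is made up):

```python
import numpy as np

# Minimal sketch: centroid of a segmented object region via image moments.
mask = np.zeros((10, 10), dtype=bool)
mask[2:6, 3:8] = True     # segmented object region (rows 2-5, cols 3-7)

def centroid(binary_mask):
    """Centroid of a binary mask from the mean of foreground coordinates."""
    ys, xs = np.nonzero(binary_mask)
    return xs.mean(), ys.mean()

cx, cy = centroid(mask)   # (5.0, 3.5) for this toy mask
```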


2020 ◽  
Author(s):  
Fuyin Yang ◽  
Hao Zhu ◽  
Lingfang Yu ◽  
Weihong Lu ◽  
Chen Zhang ◽  
...  

Abstract Auditory verbal hallucinations (AVHs) are one of the most pronounced symptoms that manifest the underlying mechanisms of deficits in schizophrenia. Cognitive models postulate that malfunctioning source monitoring incorrectly weights top-down prediction against bottom-up sensory processing and thereby causes hallucinations. Here, we investigate the featural-temporal characteristics of source monitoring in AVHs. Schizophrenia patients with and without AVHs, and healthy controls, identified target tones in noise at the end of tone sequences. Predictions at different timescales were manipulated by either an alternating pattern in the preceding tone sequence or a repetition between the target tone and the tone immediately before it. The sensitivity index, d′, was obtained to assess how predictions modulated tone identification. We found that patients with AVHs showed higher d′ when the target tones conformed to the long-term regularity of the alternating pattern in the preceding tone sequence than when the targets were inconsistent with the pattern, whereas the short-term regularity of repetitions modulated tone identification in patients without AVHs. Predictions did not influence tone identification in healthy controls. These findings suggest that malfunctioning source monitoring in AVHs weights predictions heavily in forming incorrect percepts, that the weighting function in source monitoring extends to the processing of basic tonal features, and that predictions at multiple timescales differentially modulate perception in different clinical populations.
Together, these results reveal the featural and temporal characteristics of the weighting function in source monitoring of AVHs and suggest that a malfunctioning interaction between top-down and bottom-up processes may underlie the development of auditory hallucinations.
Highlights
Malfunctioning source monitoring that incorrectly weights top-down prediction and bottom-up sensory processing may underlie the pathogenesis of auditory verbal hallucinations in schizophrenia.
The weighting function between top-down predictions and bottom-up sensory processing extends to tonal features.
Predictions at multiple timescales differentially modulate perception in different clinical schizophrenia populations.
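The sensitivity index d′ used in the study is the difference between the z-transformed hit and false-alarm rates. A minimal sketch (the correction for extreme rates is a common convention, not taken from the study):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, n=None):
    """Sensitivity index d' = z(H) - z(FA).
    Rates of exactly 0 or 1 are nudged toward 1/(2n) to keep z finite."""
    eps = 1.0 / (2 * n) if n else 1e-4
    h = min(max(hit_rate, eps), 1 - eps)
    f = min(max(fa_rate, eps), 1 - eps)
    z = NormalDist().inv_cdf
    return z(h) - z(f)

# Example: a listener with 84% hits and 16% false alarms.
print(round(d_prime(0.84, 0.16), 2))  # ≈ 1.99
```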


Energies ◽  
2019 ◽  
Vol 13 (1) ◽  
pp. 116
Author(s):  
Ya-Wen Hsu ◽  
Yi-Horng Lai ◽  
Kai-Quan Zhong ◽  
Tang-Kai Yin ◽  
Jau-Woei Perng

In this study, a millimeter-wave (MMW) radar and an onboard camera are used to develop a sensor fusion algorithm for a forward collision warning system. The study proposes integrating the MMW radar and camera to compensate for the deficiencies of relying on a single sensor and to improve frontal object detection rates. Density-based spatial clustering of applications with noise (DBSCAN) and particle filter algorithms are used in the radar-based object detection system to remove non-object noise and track the target object. Meanwhile, a two-stage vision recognition system detects and recognizes the objects in front of the vehicle: pedestrians, motorcycles, and cars. For spatial alignment, a radial basis function neural network learns the conversion between the distance information of the MMW radar and the coordinate information in the image, and a neural network is then utilized for object matching. The sensor with the higher confidence index is selected as the system output. Finally, three scenario conditions (daytime, nighttime, and rainy day) were designed to test the performance of the proposed method. The detection rate and false alarm rate of the proposed system were approximately 90.5% and 0.6%, respectively.
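The DBSCAN step that separates radar object returns from non-object noise can be sketched as follows; this is a minimal pure-NumPy implementation with illustrative eps/min_pts values, not the paper’s code:

```python
import numpy as np

def dbscan(points, eps=1.0, min_pts=3):
    """Minimal DBSCAN: returns one cluster label per point (-1 = noise)."""
    pts = np.asarray(points, float)
    n = len(pts)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0

    def neighbors(i):
        return np.flatnonzero(np.linalg.norm(pts - pts[i], axis=1) <= eps)

    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        seeds = list(neighbors(i))
        if len(seeds) < min_pts:
            continue                      # not a core point: stays noise
        labels[i] = cluster
        while seeds:                      # grow the cluster from core points
            j = seeds.pop()
            if not visited[j]:
                visited[j] = True
                nb = neighbors(j)
                if len(nb) >= min_pts:
                    seeds.extend(nb)
            if labels[j] == -1:
                labels[j] = cluster
        cluster += 1
    return labels

# Two tight clusters of radar returns plus one stray (non-object) point.
pts = [(0, 0), (0.3, 0), (0, 0.3), (5, 5), (5.2, 5), (5, 5.2), (20, 20)]
labels = dbscan(pts, eps=1.0, min_pts=3)
# The stray point at (20, 20) ends up labelled -1 (noise).
```

Points labelled -1 would be discarded as non-object noise before the particle filter tracks the remaining clusters.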


2020 ◽  
Vol 32 (6) ◽  
pp. 1193-1199
Author(s):  
Shunya Tanaka ◽  
Yuki Inoue

An omnidirectional camera can simultaneously capture all-round (360°) environmental information as well as the azimuth angle of a target object or person. By configuring a stereo camera set with two omnidirectional cameras, the azimuth angle of a target object or person can easily be determined per camera from the image information captured by the left and right cameras. A target person in an image can then be localized using a region-based convolutional neural network, and the distance can be measured from the parallax given by the combined azimuth angles.
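Given the two per-camera azimuth angles and the baseline between the cameras, the target position follows from intersecting the two bearing rays. A minimal sketch (the coordinate convention and example numbers are assumptions):

```python
import math

def triangulate(baseline, az_left, az_right):
    """Intersect the bearing rays from two cameras to locate the target.
    Left camera at (0, 0), right camera at (baseline, 0); azimuths are
    measured from the forward (y) axis, positive toward +x, in radians."""
    t_l, t_r = math.tan(az_left), math.tan(az_right)
    y = baseline / (t_l - t_r)       # forward distance to the target
    x = y * t_l                      # lateral offset from the left camera
    return x, y

# Target centred between the cameras, 2 m ahead, with a 0.5 m baseline.
x, y = triangulate(0.5, math.atan(0.125), math.atan(-0.125))
# x ≈ 0.25, y ≈ 2.0
```

The parallax is the difference between the two azimuths; as the target moves farther away, the angles converge and the distance estimate becomes more sensitive to angular error.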


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zhaojun Ye ◽  
Yi Guo ◽  
Chengguang Wang ◽  
Haohui Huang ◽  
Genke Yang

Distinguishing a target object under occlusion has become a central problem in research on robotic grasping. In this paper, a novel framework for a parallel robotic gripper is proposed. The method has two key steps for grasping an occluded object: generating template information, and grasp detection using a matching algorithm. A neural network, trained on RGB-D data from the Cornell Grasp Dataset, predicts multiple grasp rectangles on template images. A proposed matching algorithm eliminates the influence of occluded parts in scene images and generates multiple grasp rectangles for occluded objects using the grasp information of matched template images. To improve the quality of the matching result, the matching algorithm improves the SIFT algorithm and combines it with an improved RANSAC algorithm. In this way, the paper obtains suitable grasp rectangles on scene images and offers a new approach to grasp detection under occlusion. The validation results show the effectiveness and efficiency of this approach.
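The role of RANSAC in rejecting bad keypoint matches (for example, matches corrupted by occlusion) can be illustrated on synthetic data. For brevity this sketch estimates only a 2D translation rather than the full transform the paper uses, and all data are made up:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic keypoint matches: template points shifted by (12, -7) in the
# scene, with four gross outliers standing in for occlusion mismatches.
template = rng.random((20, 2)) * 100
scene = template + np.array([12.0, -7.0])
scene[:4] = rng.random((4, 2)) * 100      # corrupt four matches

def ransac_translation(src, dst, iters=200, tol=2.0):
    """RANSAC sketch: estimate a 2D translation from noisy matches."""
    best_inliers, best_t = np.zeros(len(src), bool), np.zeros(2)
    for _ in range(iters):
        i = rng.integers(len(src))        # minimal sample: one match
        t = dst[i] - src[i]
        inliers = np.linalg.norm(dst - (src + t), axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_t = inliers, t
    # Refit on all inliers for the final estimate.
    return (dst[best_inliers] - src[best_inliers]).mean(axis=0), best_inliers

t, inliers = ransac_translation(template, scene)
# t ≈ (12, -7); the corrupted matches are flagged as outliers.
```

The same consensus idea carries over to the full SIFT-plus-RANSAC pipeline: candidate matches that disagree with the dominant transform are discarded, so occluded regions cannot drag the estimated template-to-scene mapping.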

