End-to-End Training for Compound Expression Recognition

Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4727
Author(s):  
Hongfei Li ◽  
Qing Li

Facial expressions have long been regarded as a distinctively human trait and an essential difference between humans and machines. As computing advances, there is growing interest in communication between humans and machines, especially communication that conveys emotion. A computer's emotional development resembles a person's: it starts from natural, intimate, and vivid interaction built on observing and discerning emotions. Since the basic emotions (angry, disgusted, fearful, happy, neutral, sad, and surprised) were proposed, many studies have addressed them, but few have addressed compound emotions. In real life, however, people’s emotions are complex, and a single expression cannot fully and accurately convey inner emotional changes, so compound expression recognition matters for everyday applications. In this paper, we propose a scheme that combines spatial and frequency domain transforms and performs end-to-end joint training on an ensemble of models that learn appearance and geometric representations, for recognizing compound expressions in the wild. We focus on extracting appearance and geometric information with deep learning models. For appearance features, we adopt transfer learning and fine-tune a ResNet50 model pretrained on VGGFace2 for face recognition. We try and compare two approaches: in one, we fine-tune on two static expression databases for basic emotion recognition, FER2013 and RAF Basic; in the other, we fine-tune the model on three-channel inputs composed of images generated by the DWT2 and WAVEDEC2 wavelet transforms, based on the rbio3.1 and sym1 wavelet bases respectively. For geometric features, we first apply a dense SIFT operator to extract facial key points and their histogram descriptions. We then introduce three models whose structures we define ourselves: a deep SAE with a softmax function, a stacked LSTM, and a Sequence-to-Sequence model with stacked LSTM. We feed the salient key points and their descriptions into each of the three models, train them separately, and compare their performance. Once the appearance and geometric models are trained, we combine the two models with category labels for further end-to-end joint training, on the premise that ensembling models that describe different information can further improve recognition results. Finally, we validate the proposed framework on the RAF Compound database and achieve a recognition rate of 66.97%. Experiments show that integrating models that express different information and training them end to end can quickly and effectively improve recognition performance.
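As a rough illustration of the frequency-domain input described in the abstract, the following Python sketch computes a one-level 2D discrete wavelet transform and selects three sub-bands as input channels. It uses the Haar wavelet as a stand-in rather than the rbio3.1 or sym1 bases, and the choice of which sub-bands form the three channels is an assumption, since the abstract does not specify it. The function names `haar_dwt2` and `wavelet_channels` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]


def haar_dwt2(img: Image) -> Tuple[Image, Image, Image, Image]:
    """One-level 2D Haar DWT of a grayscale image with even height and width.

    Returns four half-size sub-bands: the approximation plus the
    row-difference, column-difference, and diagonal detail bands.
    """
    height, width = len(img), len(img[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")

    approx: Image = []
    row_diff: Image = []
    col_diff: Image = []
    diag: Image = []
    for r in range(0, height, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, width, 2):
            # 2x2 block: p q on the upper row, s t on the lower row
            p, q = img[r][c], img[r][c + 1]
            s, t = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)  # scaled block sum
            r_row.append((p + q - s - t) / 2.0)  # upper row minus lower row
            c_row.append((p - q + s - t) / 2.0)  # left column minus right column
            d_row.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def wavelet_channels(img: Image) -> List[Image]:
    """Assumed channel layout: approximation plus two detail bands."""
    approx, row_diff, col_diff, _ = haar_dwt2(img)
    return [approx, row_diff, col_diff]
```

For the 2x2 input [[1, 2], [3, 4]] the four bands are [[5.0]], [[-2.0]], [[-1.0]], and [[0.0]]. Their squared coefficients sum to 30, the same as the input, as expected for an orthonormal transform.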

2018 ◽  
Author(s):  
Kent O. Kirlikovali ◽  
Jonathan C. Axtell ◽  
Kierstyn Anderson ◽  
Peter I. Djurovich ◽  
Arnold L. Rheingold ◽  
...  

We report the synthesis of two isomeric Pt(II) complexes ligated by doubly deprotonated 1,1′-bis(o-carborane) (bc). This work provides a potential route to fine-tune the electronic properties of luminescent metal complexes by virtue of vertex-differentiated coordination chemistry of carborane-based ligands.


Author(s):  
Cunhang Fan ◽  
Jiangyan Yi ◽  
Jianhua Tao ◽  
Zhengkun Tian ◽  
Bin Liu ◽  
...  

Author(s):  
Thomas Blaschke ◽  
Jürgen Bajorath

Exploring the origin of multi-target activity of small molecules and designing new multi-target compounds are highly topical issues in pharmaceutical research. We have investigated the ability of a generative neural network to create multi-target compounds. Data sets of experimentally confirmed multi-target, single-target, and consistently inactive compounds were extracted from public screening data considering positive and negative assay results. These data sets were used to fine-tune the REINVENT generative model via transfer learning to systematically recognize multi-target compounds, distinguish them from single-target or inactive compounds, and construct new multi-target compounds. During fine-tuning, the model showed a clear tendency to increasingly generate multi-target compounds and structural analogs. Our findings indicate that generative models can be adopted for de novo multi-target compound design.


2018 ◽  
Vol 69 (1) ◽  
pp. 24-31
Author(s):  
Khaled S. Hatamleh ◽  
Qais A. Khasawneh ◽  
Adnan Al-Ghasem ◽  
Mohammad A. Jaradat ◽  
Laith Sawaqed ◽  
...  

Scanning electron microscopes are used extensively for accurate micro/nano-scale imaging. Several strategies have been proposed in the past few years to fine-tune these microscopes. This work presents a new strategy for fine-tuning a scanning electron microscope sample table using four-bar piezoelectric-actuated mechanisms. The paper presents an algorithm to find all possible inverse kinematics solutions of the proposed mechanism, and a second algorithm to search for the optimal inverse kinematics solution. Both algorithms are used together in a simulation study to fine-tune a scanning electron microscope sample table along a pre-specified circular or linear path of motion. The results show that the proposed algorithms reduced the power required to drive the piezoelectric-actuated mechanism by 97.5% for all simulated paths of motion, compared with a general non-optimized solution.


1998 ◽  
Vol 120 (1) ◽  
pp. 46-51
Author(s):  
L. N. Srinivasan ◽  
Q. Jeffrey Ge

This paper presents two algorithms for fine-tuning rational B-spline motions suitable for Computer Aided Design. The problem of fine-tuning of rational motions is studied as that of fine-tuning rational curves in a projective dual three-space, called the image curves. The path-smoothing algorithm automatically detects and smoothes out the third order geometric discontinuities in the path of a cubic rational B-spline image curve. The speed-smoothing algorithm uses a quintic rational spline image curve to obtain a second-order geometric approximation of the path of a cubic rational B-spline image curve while allowing specification of the speed and the rate of change of speed at the key points to obtain a near constant kinetic energy parameterization. The results have applications in Cartesian trajectory planning in robotics, spatial navigation in visualization and virtual reality systems, as well as mechanical system simulation.


2021 ◽  
Vol 18 (2) ◽  
pp. 56-65
Author(s):  
Marcelo Romero ◽  
Matheus Gutoski ◽  
Leandro Takeshi Hattori ◽  
Manassés Ribeiro ◽  
...  

Transfer learning is a paradigm in which classifiers are trained and tested on datasets drawn from distinct distributions. This technique makes it possible to solve a particular problem using a model that was trained for another purpose. In recent years, the practice has become very popular owing to the growing number of publicly available pre-trained models that can be fine-tuned for different scenarios. However, the relationship between the datasets used for training the model and the test data is usually not addressed, especially when fine-tuning is applied only to the fully connected layers of a Convolutional Neural Network with pre-trained weights. This work studies the relationship between the datasets used in a transfer learning process in terms of the performance achieved by the models and of complexity and similarity measures. For this purpose, we fine-tune the final layer of Convolutional Neural Networks with pre-trained weights using diverse soft biometrics datasets. We evaluate the performance of the models when they are tested on datasets different from the one used for training. Complexity and similarity metrics are also used in the evaluation.


2019 ◽  
Vol 5 (1) ◽  
pp. 239-244
Author(s):  
Jingrui Yu ◽  
Roman Seidel ◽  
Gangolf Hirtz

We propose a one-step person detector for top-view omnidirectional indoor scenes based on convolutional neural networks (CNNs). While state-of-the-art person detectors reach competitive results on perspective images, the lack of CNN architectures and of training data that follow the distortion of omnidirectional images makes current approaches inapplicable to our data. The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation, which reduces the overhead of pre- and post-processing and enables real-time performance. The basic idea is to use transfer learning to fine-tune CNNs trained on perspective images, with data augmentation techniques, for detection in omnidirectional images. We fine-tune two variants of Single Shot MultiBox detectors (SSDs). The first uses Mobilenet v1 FPN as the feature extractor (moSSD); the second uses ResNet50 v1 FPN (resSSD). Both models are pre-trained on the Microsoft Common Objects in Context (COCO) dataset. We fine-tune both models on the PASCAL VOC07 and VOC12 datasets, specifically on the class person. Random 90-degree rotation and random vertical flipping are used for data augmentation in addition to the methods proposed by the original SSD. We reach an average precision (AP) of 67.3% with moSSD and 74.9% with resSSD on the evaluation dataset. To enhance the fine-tuning process, we add a subset of the HDA Person dataset and a subset of the PIROPO database and reduce the perspective images to those of PASCAL VOC07. The AP rises to 83.2% for moSSD and 86.3% for resSSD. The average inference time is 28 ms per image for moSSD and 38 ms per image for resSSD on an Nvidia Quadro P6000. Our method is applicable to other CNN-based object detectors and can potentially generalize to detecting other objects in omnidirectional images.
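When an image is rotated by 90 degrees or flipped vertically for augmentation, its ground-truth boxes have to be transformed the same way. The Python sketch below shows one way to do that. It assumes boxes are given as (xmin, ymin, xmax, ymax) in continuous pixel coordinates with the origin at the top-left corner, and it interprets "random 90-degree rotation" as a random number of quarter turns. Both points are assumptions, the function names are hypothetical, and this is not the authors' pipeline.

```python
import random
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def rotate_box_90_ccw(box: Box, width: float) -> Box:
    """Rotate a box with its image by 90 degrees counter-clockwise.

    A point (x, y) moves to (y, width - x), where `width` is the image
    width before the rotation.
    """
    xmin, ymin, xmax, ymax = box
    return (ymin, width - xmax, ymax, width - xmin)


def flip_box_vertical(box: Box, height: float) -> Box:
    """Flip a box top-to-bottom: a point (x, y) moves to (x, height - y)."""
    xmin, ymin, xmax, ymax = box
    return (xmin, height - ymax, xmax, height - ymin)


def augment_box(box: Box, width: float, height: float,
                rng: random.Random) -> Tuple[Box, float, float]:
    """Apply a random number of quarter turns, then a random vertical flip.

    Returns the new box together with the new image width and height.
    """
    for _ in range(rng.randrange(4)):
        box = rotate_box_90_ccw(box, width)
        width, height = height, width  # a quarter turn swaps the image sides
    if rng.random() < 0.5:
        box = flip_box_vertical(box, height)
    return box, width, height
```

For a 100 x 50 image, the box (10, 5, 30, 20) becomes (5, 70, 20, 90) after one counter-clockwise quarter turn and (10, 30, 30, 45) after a vertical flip. Four quarter turns return the original box.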


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2639
Author(s):  
Quan T. Ngo ◽  
Seokhoon Yoon

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained in the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on an AffectNet real-world facial dataset demonstrate that our method outperforms the baseline CNN models that use either weighted-softmax loss or center loss.
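The abstract does not give the formula for the weighted-cluster loss, so the following Python sketch only illustrates two ingredients it mentions: per-class weights derived from class proportions, and a compactness term that pulls features toward a per-class center. The inverse-frequency weighting, the squared Euclidean distance, and both function names are assumptions. The inter-class separability part is omitted, and the centers are passed in rather than learned.

```python
from collections import Counter
from typing import Dict, Sequence

Vector = Sequence[float]


def class_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Inverse-frequency weights: N / (K * n_c) for a class with n_c of N samples."""
    counts = Counter(labels)
    total, num_classes = len(labels), len(counts)
    return {c: total / (num_classes * n) for c, n in counts.items()}


def weighted_compactness(features: Sequence[Vector], labels: Sequence[int],
                         centers: Dict[int, Vector],
                         weights: Dict[int, float]) -> float:
    """Mean over all samples of w_y * ||x - c_y||^2."""
    total = 0.0
    for x, y in zip(features, labels):
        # squared distance between the feature and the center of its class
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
        total += weights[y] * sq_dist
    return total / len(features)
```

With three samples of class 0 and one of class 1, the weights are 2/3 and 2.0, so an error on the rare class counts three times as much as one on the common class.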


2014 ◽  
Vol 1016 ◽  
pp. 336-341
Author(s):  
Kamolchanok Thipayarat ◽  
Ekasit Nisaratanaporn ◽  
Boonrat Lohwongwatana

In recent years, the Au-Ge-Sb system has been studied as a possible alternative alloy for soldering applications [1-4]. The alloy has various benefits such as (i) low melting temperature which allows the alloy system to be used as a drop-in solution for high performance lead-free solders, (ii) three distinct phases of different hardness values (100, 150 and 500 HV) which offer the ability to fine-tune the composition and microstructure to a wide range of properties, and (iii) limited solute solubility which offers ease of control and fine-tuning of microstructure, mechanical properties and colors. Gold compositions centered around 75 wt% gold were modeled and selected using the CALPHAD (CALculation of PHAse Diagram) method. Predictions were later confirmed by experimental results. The alloy solidifies in the range of 242.5-261.7 °C. The overall hardness values were measured and confirmed to be within the volume average value of all the phases combined.
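One common way to read "volume average" is the rule of mixtures, stated here as an assumption rather than as the authors' exact model. The symbols are introduced only for illustration: f_i is the volume fraction of phase i and H_i is its hardness.

```latex
H_{\text{avg}} = \sum_{i=1}^{3} f_i H_i, \qquad \sum_{i=1}^{3} f_i = 1, \qquad 0 \le f_i \le 1
```

Because the fractions are non-negative and sum to one, this average always lies between the softest and hardest phases, that is, between 100 and 500 HV for the values quoted above.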

