Unsupervised online multitask learning of behavioral sentence embeddings

PeerJ Computer Science ◽

10.7717/peerj-cs.200 ◽

2019 ◽

Vol 5 ◽

pp. e200

Author(s):

Shao-Yen Tseng ◽

Brian Baucom ◽

Panayiotis Georgiou

Keyword(s):

Emotion Recognition ◽

Behavior Analysis ◽

Couples Therapy ◽

Large Scale ◽

State Of The Art ◽

General Purpose ◽

Multitask Learning ◽

Multiple Tasks ◽

And Behavior ◽

Better Than

Appropriate embedding transformation of sentences can aid in downstream tasks such as NLP and emotion and behavior analysis. Such efforts evolved from word vectors, which were trained in an unsupervised manner using large-scale corpora. Recent research, however, has shown that sentence embeddings trained using in-domain data or supervised techniques, often through multitask learning, perform better than unsupervised ones. Representations have also been shown to be applicable in multiple tasks, especially when training incorporates multiple information sources. In this work we aspire to combine the simplicity of using abundant unsupervised data with transfer learning by introducing an online multitask objective. We present a multitask paradigm for unsupervised learning of sentence embeddings which simultaneously addresses domain adaptation. We show that embeddings generated through this process increase performance in subsequent domain-relevant tasks. We evaluate on the affective tasks of emotion recognition and behavior analysis and compare our results with state-of-the-art general-purpose supervised sentence embeddings. Our unsupervised sentence embeddings outperform the alternative universal embeddings both in identifying behaviors within couples therapy and in emotion recognition.

Investigating the Relationship Between Emotion Recognition Software and Usability Metrics

i-com ◽

10.1515/icom-2020-0009 ◽

2020 ◽

Vol 19 (2) ◽

pp. 139-151

Author(s):

Thomas Schmidt ◽

Miriam Schlindwein ◽

Katharina Lichtner ◽

Christian Wolff

Keyword(s):

Emotion Recognition ◽

Regression Models ◽

Affective Computing ◽

Large Scale ◽

General Purpose ◽

Thinking Aloud ◽

Continue Research ◽

Usability Metrics ◽

Recognition Software ◽

The Relationship

Due to progress in affective computing, various forms of general purpose sentiment/emotion recognition software have become available. However, such tools are rarely applied in usability engineering (UE) to measure the emotional state of participants. We investigate whether the application of sentiment/emotion recognition software is beneficial for gathering objective and intuitive data that can predict usability similarly to traditional usability metrics. We present the results of a UE project examining this question for the three modalities text, speech and face. We perform a large scale usability test (N = 125) with a counterbalanced within-subject design with two websites of varying usability. We have identified a weak but significant correlation between text-based sentiment analysis on the text acquired via thinking aloud and SUS scores as well as a weak positive correlation between the proportion of neutrality in users’ voice and SUS scores. However, for the majority of the output of emotion recognition software, we could not find any significant results. Emotion metrics could not be used to successfully differentiate between two websites of varying usability. Regression models, either unimodal or multimodal, could not predict usability metrics. We discuss reasons for these results and how to continue research with more sophisticated methods.

Detection of Manipulated Face Videos over Social Networks: A Large-Scale Study

Journal of Imaging ◽

10.3390/jimaging7100193 ◽

2021 ◽

Vol 7 (10) ◽

pp. 193

Author(s):

Federico Marcon ◽

Cecilia Pasquini ◽

Giulia Boato

Keyword(s):

Large Scale ◽

State Of The Art ◽

Forensic Analysis ◽

General Purpose ◽

Fine Tuning ◽

Specific Technique ◽

Multimedia Forensics ◽

Shared Data ◽

Social Media Platforms ◽

The Web

The detection of manipulated videos represents a highly relevant problem in multimedia forensics, which has been widely investigated in recent years. However, a common trait of published studies is that the forensic analysis is typically applied to data prior to their potential dissemination over the web. This work addresses the challenging scenario where manipulated videos are first shared through social media platforms and then subjected to forensic analysis. In this context, a large scale performance evaluation has been carried out involving general purpose deep networks and state-of-the-art manipulated data, and studying different effects. Results confirm that a performance drop is observed in every case when unseen shared data are tested by networks trained on non-shared data; however, fine-tuning operations can mitigate this problem. Also, we show that the output of differently trained networks can carry useful forensic information for the identification of the specific technique used for visual manipulation, both for shared and non-shared data.

Distant Supervision for Relation Extraction with Sentence Selection and Interaction Representation

Wireless Communications and Mobile Computing ◽

10.1155/2021/8889075 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Tiantian Chen ◽

Nianbin Wang ◽

Hongbin Wang ◽

Haomin Zhan

Keyword(s):

Large Scale ◽

Semantic Information ◽

State Of The Art ◽

Relation Extraction ◽

Semantic Features ◽

Distant Supervision ◽

Word Level ◽

Proposed Model ◽

Relation Prediction ◽

Better Than

Distant supervision (DS), which automatically generates large-scale labeled data, has been widely used for relation extraction (RE). However, it suffers from a wrong labeling problem, which affects the performance of RE. Besides, the existing method suffers from the lack of useful semantic features for some positive training instances. To address the above problems, we propose a novel RE model with sentence selection and interaction representation for distantly supervised RE. First, we propose a pattern method based on the relation trigger words as a sentence selector to filter out noisy sentences and alleviate the wrong labeling problem. After clean instances are obtained, we propose the interaction representation using the word-level attention mechanism-based entity pairs to dynamically increase the weights of the words related to entity pairs, which can provide more useful semantic information for relation prediction. The proposed model outperforms the strongest baseline by 2.61 in F1-score on a widely used dataset, which proves that our model performs significantly better than the state-of-the-art RE systems.

Rethinking the Fourier-Mellin Transform: Multiple Depths in the Camera’s View

Remote Sensing ◽

10.3390/rs13051000 ◽

2021 ◽

Vol 13 (5) ◽

pp. 1000

Author(s):

Qingwen Xu ◽

Haofei Kuang ◽

Laurent Kneip ◽

Sören Schwertfeger

Keyword(s):

Remote Sensing ◽

Large Scale ◽

Feature Detection ◽

Mellin Transform ◽

State Of The Art ◽

Visual Odometry ◽

Superior Performance ◽

Fourier Mellin Transform ◽

And Robotics ◽

Better Than

Remote sensing and robotics often rely on visual odometry (VO) for localization. Many standard approaches for VO use feature detection. However, these methods face challenges if the environments are feature-deprived or highly repetitive. Fourier-Mellin Transform (FMT) is an alternative VO approach that has been shown to perform better in these scenarios and is often used in remote sensing. One limitation of FMT is that it requires an environment that is equidistant to the camera, i.e., single-depth. To extend the applications of FMT to multi-depth environments, this paper presents the extended Fourier-Mellin Transform (eFMT), which maintains the advantages of FMT with respect to feature-deprived scenarios. To show the robustness and accuracy of eFMT, we implement an eFMT-based visual odometry framework and test it in toy examples and a large-scale drone dataset. All these experiments are performed on data collected in challenging scenarios, such as trees, wooden boards and featureless roofs. The results show that eFMT performs better than FMT in the multi-depth settings. Moreover, eFMT also outperforms state-of-the-art VO algorithms, such as ORB-SLAM3, SVO and DSO, in our experiments.

FQSqueezer: k-mer-based compression of sequencing data

10.1101/559807 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sebastian Deorowicz

Keyword(s):

Data Compression ◽

State Of The Art ◽

Genomic Data ◽

General Purpose ◽

Supplementary Information ◽

Supplementary Data ◽

Sequencing Data ◽

Partial Matching ◽

Supplementary Material ◽

Better Than

Motivation: The amount of genomic data that needs to be stored is huge. Therefore it is not surprising that a lot of work has been done in the field of specialized data compression of FASTQ files. The existing algorithms are, however, still imperfect and the best tools produce quite large archives. Results: We present FQSqueezer, a novel compression algorithm for sequencing data able to process single- and paired-end reads of variable lengths. It is based on the ideas from the famous prediction by partial matching and dynamic Markov coder algorithms known from the general-purpose-compressors world. The compression ratios are often tens of percent better than offered by the state-of-the-art tools. Availability and Implementation: https://github.com/refresh-bio/[email protected] Supplementary information: Supplementary data are available at publisher’s Web site.

Synaptic Scaling Improves the Stability of Neural Mass Models Capable of Simulating Brain Plasticity

Neural Computation ◽

10.1162/neco_a_01257 ◽

2020 ◽

Vol 32 (2) ◽

pp. 424-446

Author(s):

Jure Demšar ◽

Rob Forsyth

Keyword(s):

Large Scale ◽

State Of The Art ◽

Brain Plasticity ◽

Original Model ◽

Neural Mass Model ◽

Synaptic Scaling ◽

Mass Model ◽

Neural Mass ◽

The Stability ◽

And Behavior

Neural mass models offer a way of studying the development and behavior of large-scale brain networks through computer simulations. Such simulations are currently mainly research tools, but as they improve, they could soon play a role in understanding, predicting, and optimizing patient treatments, particularly in relation to effects and outcomes of brain injury. To bring us closer to this goal, we took an existing state-of-the-art neural mass model capable of simulating connection growth through simulated plasticity processes. We identified and addressed some of the model's limitations by implementing biologically plausible mechanisms. The main limitation of the original model was its instability, which we addressed by incorporating a representation of the mechanism of synaptic scaling and examining the effects of optimizing parameters in the model. We show that the updated model retains all the merits of the original model, while being more stable and capable of generating networks that are in several aspects similar to those found in real brains.

ResMem-Net: memory based deep CNN for image memorability estimation

PeerJ Computer Science ◽

10.7717/peerj-cs.767 ◽

2021 ◽

Vol 7 ◽

pp. e767

Author(s):

Arockia Praveen ◽

Abdulfattah Noorwali ◽

Duraimurugan Samiayya ◽

Mohammad Zubair Khan ◽

Durai Raj Vincent P M ◽

...

Keyword(s):

Deep Learning ◽

Large Scale ◽

Mean Squared Error ◽

State Of The Art ◽

Rank Correlation ◽

Current State ◽

Intermediate Layers ◽

Better Than ◽

Made In ◽

Memory Efficient

Image memorability is a very hard problem in image processing due to its subjective nature. However, owing to the introduction of Deep Learning and the wide availability of data and GPUs, great strides have been made in predicting the memorability of an image. In this paper, we propose a novel deep learning architecture called ResMem-Net, a hybrid of LSTM and CNN that uses information from the hidden layers of the CNN to compute the memorability score of an image. The intermediate layers are important for predicting the output because they contain information about the intrinsic properties of the image. The proposed architecture automatically learns visual emotions and saliency, as shown by the heatmaps generated using the GradRAM technique. We have also used the heatmaps and results to analyze and answer one of the most important questions in image memorability: “What makes an image memorable?”. The model is trained and evaluated using the publicly available Large-scale Image Memorability dataset (LaMem) from MIT. The results show that the model achieves a rank correlation of 0.679 and a mean squared error of 0.011, which is better than the current state-of-the-art models and is close to human consistency (p = 0.68). The proposed architecture also has significantly fewer parameters than the state-of-the-art architecture, making it memory efficient and suitable for production.
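
The two evaluation metrics named in this abstract, rank correlation and mean squared error, can be illustrated with a minimal Python sketch. It assumes the rank correlation is Spearman's, computed as the Pearson correlation of average ranks; the function names and the toy scores are hypothetical and are not taken from the paper or its code.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_squared_error(pred, target):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Hypothetical memorability scores, for illustration only.
predicted = [0.62, 0.81, 0.45, 0.90]
annotated = [0.60, 0.85, 0.50, 0.88]
print(spearman(predicted, annotated), mean_squared_error(predicted, annotated))
```

Rank correlation rewards getting the ordering of images right, while mean squared error penalizes the size of each score error, so the two figures capture different aspects of prediction quality.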

Distributed Pareto Optimization for Subset Selection

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/207 ◽

2018 ◽

Cited By ~ 2

Author(s):

Chao Qian ◽

Guiying Li ◽

Chao Feng ◽

Ke Tang

Keyword(s):

Real World ◽

Large Scale ◽

State Of The Art ◽

Subset Selection ◽

Data Sets ◽

Mapreduce Framework ◽

Real World Data ◽

Real World Applications ◽

Approximation Guarantee ◽

Better Than

The subset selection problem that selects a few items from a ground set arises in many applications such as maximum coverage, influence maximization, sparse regression, etc. The recently proposed POSS algorithm is a powerful approximation solver for this problem. However, POSS requires centralized access to the full ground set, and thus is impractical for large-scale real-world applications, where the ground set is too large to be stored on one single machine. In this paper, we propose a distributed version of POSS (DPOSS) with a bounded approximation guarantee. DPOSS can be easily implemented in the MapReduce framework. Our extensive experiments using Spark, on various real-world data sets with size ranging from thousands to millions, show that DPOSS can achieve competitive performance compared with the centralized POSS, and is almost always better than the state-of-the-art distributed greedy algorithm RandGreeDi.
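
To make the subset selection setting concrete, here is a minimal Python sketch of the classic centralized greedy heuristic for maximum coverage, one of the applications named in this abstract. It is neither POSS, DPOSS, nor RandGreeDi; the function name `greedy_max_coverage` and the toy ground set are hypothetical.

```python
def greedy_max_coverage(sets, k):
    """Pick at most k sets, each time taking the one that covers the most
    elements not yet covered. Returns (chosen indices, covered elements)."""
    covered = set()
    chosen = []
    for _ in range(k):
        candidates = [i for i in range(len(sets)) if i not in chosen]
        if not candidates:
            break
        # Marginal gain of a set = number of elements it adds to the cover.
        best = max(candidates, key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical ground set of four candidate sets.
ground = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
print(greedy_max_coverage(ground, 2))
```

This sketch scans the whole ground set at every step, which is the kind of centralized access the abstract describes as impractical when the ground set does not fit on a single machine.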

Simulating SST Teleconnections to Africa: What is the State of the Art?

Journal of Climate ◽

10.1175/jcli-d-12-00761.1 ◽

2013 ◽

Vol 26 (15) ◽

pp. 5397-5418 ◽

Cited By ~ 69

Author(s):

David P. Rowell

Keyword(s):

Large Scale ◽

State Of The Art ◽

Coupled Model ◽

The State ◽

Sub Saharan Africa ◽

Intercomparison Project ◽

Rainfall Anomalies ◽

Sub Saharan ◽

Better Than

This study provides an overview of the state of the art of modeling SST teleconnections to Africa and begins to investigate the sources of error. Data are obtained from the Coupled Model Intercomparison Project (CMIP) archives, phases 3 and 5 (CMIP3 and CMIP5), using the “20C3M” and “historical” coupled model experiments. A systematic approach is adopted, with the scope narrowed to six large-scale regions of sub-Saharan Africa within which seasonal rainfall anomalies are reasonably coherent, along with six SST modes known to affect these regions. No significant nonstationarity of the strength of these 6 × 6 teleconnections is found in observations. The capability of models to represent each teleconnection is then assessed (whereby half the teleconnections have observed SST–rainfall correlations that differ significantly from zero). A few of these teleconnections are found to be relatively easy to model, while a few more pose substantial challenges to models and many others exhibit a wide variety of model skill. Furthermore, some models perform consistently better than others, with the best able to at least adequately simulate 80%–85% of the 36 teleconnections. No improvement is found between CMIP3 and CMIP5. Analysis of atmosphere-only simulations suggests that the coupled model teleconnection errors may arise primarily from errors in their SST climatology and variability, although errors in the atmospheric component of teleconnections also play a role. Last, no straightforward relationship is found between the quality of a model's teleconnection to Africa and its SST or rainfall biases or its resolution. Perhaps not surprisingly, the causes of these errors are complex, and will require considerable further investigation.

Learning spatiotemporal features with 3D DenseNet and attention for gesture recognition

International Journal of Electrical Engineering Education ◽

10.1177/0020720919894196 ◽

2019 ◽

pp. 002072091989419

Author(s):

Honegzhe Liu ◽

Zhifang Deng ◽

Cheng Xu

Keyword(s):

Gesture Recognition ◽

Transition Layer ◽

Large Scale ◽

State Of The Art ◽

Attention Mechanism ◽

Spatiotemporal Features ◽

Current State ◽

Feature Extractor ◽

Dynamic Gestures ◽

Better Than

Gesture recognition aims at understanding dynamic gestures of the human body and is one of the most important ways of human–computer interaction. To extract more effective spatiotemporal features from gesture videos for more accurate gesture classification, this study proposes a novel feature extractor network, spatiotemporal attention 3D DenseNet. We extend DenseNet with 3D kernels and a Refined Temporal Transition Layer based on the Temporal Transition Layer, and we also explore attention mechanisms in 3D ConvNets. We embed the Refined Temporal Transition Layer and the attention mechanism in DenseNet3D and name the resulting network “spatiotemporal attention 3D DenseNet.” Our experiments show that the Refined Temporal Transition Layer performs better than the Temporal Transition Layer and that the proposed spatiotemporal attention 3D DenseNet outperforms the current state-of-the-art methods in each modality on the ChaLearn LAP Large-Scale Isolated gesture dataset. The code and pretrained model are released at https://github.com/dzf19927/STA3D .
