P-Norm Attention Deep CORAL: Extending Correlation Alignment Using Attention and the P-Norm Loss Function

Applied Sciences ◽

10.3390/app11115267 ◽

2021 ◽

Vol 11 (11) ◽

pp. 5267

Author(s):

Zhi-Yong Wang ◽

Dae-Ki Kang

Keyword(s):

Loss Function ◽

Domain Adaptation ◽

Original Data ◽

Good Representation ◽

Feature Maps ◽

Target Domain ◽

Unsupervised Domain Adaptation ◽

Adaptation Method ◽

Deep Coral

CORrelation ALignment (CORAL) is an unsupervised domain adaptation method that uses a linear transformation to align the covariances of the source and target domains. Deep CORAL extends CORAL with a nonlinear transformation learned by a deep neural network and adds the CORAL loss to the total loss to align the covariances of the source and target domains. However, two problems remain in Deep CORAL: features extracted from AlexNet are not always a good representation of the original data, and joint training with both the classification loss and the CORAL loss may not be efficient enough to align the distributions of the source and target domains. In this paper, we propose two strategies: attention to improve the quality of the feature maps, and the p-norm loss function to align the distributions of the source and target features, further reducing the offset caused by the classification loss function. Experiments on the Office-31 dataset indicate that the proposed methods improve the performance of Deep CORAL.
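As a rough illustration of what "aligning the covariances" means, the sketch below computes a CORAL-style loss in plain Python: the squared Frobenius norm of the difference between the source and target feature covariances, divided by 4d^2 as is commonly done for Deep CORAL. The function names and the list-of-lists feature format are assumptions made for this example; the attention module and the p-norm variant from the abstract are not reproduced here, since the abstract does not define them.

```python
def covariance(features):
    """Unbiased covariance matrix of a list of equal-length feature vectors."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def coral_loss(source, target):
    """Squared Frobenius norm of the covariance gap, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cov_s, cov_t = covariance(source), covariance(target)
    gap = sum((cov_s[i][j] - cov_t[i][j]) ** 2
              for i in range(d) for j in range(d))
    return gap / (4 * d * d)


# Toy batches (made-up numbers): variance lies along different axes.
source_batch = [[0.0, 0.0], [2.0, 0.0]]
target_batch = [[0.0, 0.0], [0.0, 2.0]]
loss = coral_loss(source_batch, target_batch)
```

In a joint-training setup this value would be added, with some weight, to the classification loss; the weight is a tuning choice that the abstract does not specify.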

Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6288 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7830-7838 ◽

Cited By ~ 1

Author(s):

Han Guo ◽

Ramakanth Pasunuru ◽

Mohit Bansal

Keyword(s):

Loss Function ◽

Optimal Trajectory ◽

Domain Adaptation ◽

Learning Algorithm ◽

Data Distribution ◽

Distance Measures ◽

Target Domain ◽

Source Domain ◽

Unsupervised Domain Adaptation ◽

Additional Loss

Domain adaptation performance of a learning algorithm on a target domain is a function of its source domain error and a divergence measure between the data distributions of the two domains. We present a study of various distance-based measures, in the context of NLP tasks, that characterize the dissimilarity between domains based on sample estimates. We first conduct analysis experiments to show which of these distance measures best differentiate samples from the same versus different domains and are correlated with empirical results. Next, we develop a DistanceNet model that uses these distance measures, or a mixture of them, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation. Finally, we extend this model to a novel DistanceNet-Bandit model, which employs a multi-armed bandit controller to dynamically switch between multiple source domains and allows the model to learn an optimal trajectory and mixture of domains for transfer to the low-resource target domain. We conduct experiments on popular sentiment analysis datasets with several diverse domains and show that our DistanceNet model, as well as its dynamic bandit variant, can outperform competitive baselines in the context of unsupervised domain adaptation.
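The idea of adding a domain distance to the task loss can be sketched as follows. This is only an illustration under stated assumptions: the distance used here (squared Euclidean distance between the sample means of the two domains) is one simple sample-based measure chosen for the example, and the names `mean_distance`, `joint_loss` and the `weight` parameter are hypothetical, not taken from the paper.

```python
def mean_vector(samples):
    """Per-dimension mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def mean_distance(source, target):
    """Squared Euclidean distance between the two domain sample means."""
    mean_s, mean_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))


def joint_loss(task_loss, source_feats, target_feats, weight=0.1):
    """Task loss plus a weighted domain-distance penalty (weight is assumed)."""
    return task_loss + weight * mean_distance(source_feats, target_feats)


# Toy features (made-up numbers) for one source and one target batch.
source_feats = [[0.0, 0.0], [2.0, 2.0]]
target_feats = [[1.0, 1.0], [3.0, 3.0]]
total = joint_loss(0.5, source_feats, target_feats, weight=0.1)
```

Minimizing such a sum pushes the encoder toward features on which the two domains look alike while still solving the task; any other sample-based distance could be substituted for `mean_distance`.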

Unsupervised Domain Adaptation Method Based on Discriminant Sample Selection

Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University ◽

10.1051/jnwpu/20203840828 ◽

2020 ◽

Vol 38 (4) ◽

pp. 828-837

Author(s):

Linlin Wu ◽

Guohua Peng ◽

Weidong Yan

Keyword(s):

Density Estimation ◽

Classification Accuracy ◽

Domain Adaptation ◽

Sample Selection ◽

Previous Method ◽

Probability Density Estimation ◽

Target Domain ◽

Source Domain ◽

Unsupervised Domain Adaptation ◽

Adaptation Method

To address the low classification accuracy caused by differing distributions of the training and test sets, an unsupervised domain adaptation method based on discriminant sample selection (DSS) is proposed. DSS projects the samples of different domains onto the same subspace to reduce the distribution discrepancy between the source domain and the target domain, and weights the source domain instances to make the samples more discriminant. Unlike previous methods based on probability density estimation of the samples, DSS obtains the sample weights by solving a quadratic programming problem, which avoids estimating the sample distribution and can be applied to any field without suffering from the difficulties of high-dimensional density estimation. Finally, DSS clusters samples of the same class by minimizing the intra-class distance. Experimental results show that the proposed method improves classification accuracy and robustness.

Correlation alignment with attention mechanism for unsupervised domain adaptation

Web Intelligence ◽

10.3233/web-210447 ◽

2021 ◽

pp. 1-7

Author(s):

Rong Chen ◽

Chongguang Ren

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Transfer Process ◽

Negative Transfer ◽

Domain Adaptation ◽

Attention Mechanism ◽

Target Domain ◽

Source Domain ◽

Second Order Statistics ◽

Unsupervised Domain Adaptation

Domain adaptation aims to address the lack of labels. Most existing work on domain adaptation focuses on aligning the feature distributions of the source and target domains. However, in Natural Language Processing, some words convey different sentiment in different domains. Thus not all features of the source domain should be transferred, and aligning untransferable features causes negative transfer. To address this issue, we propose a Correlation Alignment with Attention mechanism for unsupervised Domain Adaptation (CAADA) model. In the model, an attention mechanism is introduced into the transfer process for domain adaptation, which can capture the positively transferable features in the source and target domains. Moreover, the CORrelation ALignment (CORAL) loss is used to minimize the domain discrepancy by aligning the second-order statistics of the positively transferable features extracted by the attention mechanism. Extensive experiments on the Amazon review dataset demonstrate the effectiveness of the CAADA method.

Unsupervised Domain Adaptation by Statistics Alignment for Deep Sleep Staging Networks

10.36227/techrxiv.17212184.v1 ◽

2021 ◽

Author(s):

Jiahao Fan ◽

Hangyu Zhu ◽

Xinyu Jiang ◽

Long Meng ◽

Cong Fu ◽

...

Keyword(s):

Large Scale ◽

Domain Adaptation ◽

Source Model ◽

Deep Sleep ◽

Sleep Staging ◽

Target Domain ◽

Source Domain ◽

Unsupervised Domain Adaptation ◽

Generalization Problem ◽

Source Models

Deep sleep staging networks have reached top performance on large-scale datasets. However, these models perform worse when trained and tested on small sleep cohorts due to data inefficiency. Transferring well-trained models from large-scale datasets (source domain) to small sleep cohorts (target domain) is a promising solution but remains challenging due to the domain-shift issue. In this work, an unsupervised domain adaptation approach, domain statistics alignment (DSA), is developed to bridge the gap between the data distributions of the source and target domains. DSA adapts the source models on the target domain by modulating the domain-specific statistics of deep features stored in the Batch Normalization (BN) layers. Furthermore, we have extended DSA by introducing cross-domain statistics in each BN layer to perform DSA adaptively (AdaDSA). The proposed methods need only the well-trained source model, without access to the source data, which may be proprietary and inaccessible. DSA and AdaDSA are universally applicable to deep sleep staging networks that have BN layers. We have validated the proposed methods by extensive experiments on two state-of-the-art deep sleep staging networks, DeepSleepNet+ and U-time. Performance was evaluated by conducting various transfer tasks on six sleep databases, with two large-scale databases, MASS and SHHS, as the source domain and four small sleep databases as the target domain. Among the latter, clinical sleep records acquired in Huashan Hospital, Shanghai, were used. The results show that both DSA and AdaDSA could significantly improve the performance of source models on target domains, providing novel insights into the domain generalization problem in sleep staging tasks.
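To make "modulating the statistics stored in the BN layers" more concrete, here is a minimal sketch for a single scalar feature. It assumes the simplest reading of the abstract: the mean and variance used for normalization are re-estimated on target-domain data, and a hypothetical mixing coefficient `alpha` interpolates between source and target statistics. The interpolation rule and all names are assumptions for illustration; the abstract does not give the actual DSA or AdaDSA update.

```python
import math


def batch_stats(values):
    """Mean and (biased) variance of one feature over a batch."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def mix_stats(source_stats, target_stats, alpha):
    """Hypothetical blend: alpha=0 keeps source stats, alpha=1 uses target."""
    (mean_s, var_s), (mean_t, var_t) = source_stats, target_stats
    return ((1 - alpha) * mean_s + alpha * mean_t,
            (1 - alpha) * var_s + alpha * var_t)


def bn_normalize(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard batch-norm transform with the supplied statistics."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


# Stored source statistics and a toy target batch (made-up numbers).
source_stats = (0.0, 1.0)
target_stats = batch_stats([2.0, 4.0, 6.0])
adapted_mean, adapted_var = mix_stats(source_stats, target_stats, alpha=1.0)
centered_output = bn_normalize(4.0, adapted_mean, adapted_var)
```

Because only the normalization statistics change, an adaptation of this kind needs the trained source model and unlabeled target data, but no source data, which matches the setting described in the abstract.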

Does BERT need domain adaptation for clinical negation detection?

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa001 ◽

2020 ◽

Vol 27 (4) ◽

pp. 584-591 ◽

Cited By ~ 2

Author(s):

Chen Lin ◽

Steven Bethard ◽

Dmitriy Dligach ◽

Farig Sadeque ◽

Guergana Savova ◽

...

Keyword(s):

Transfer Learning ◽

Domain Adaptation ◽

Fine Tuning ◽

Adaptation Algorithm ◽

Learning Methods ◽

Clinical Text ◽

Unsupervised Domain Adaptation ◽

Adversarial Training ◽

Negation Detection ◽

Adaptation Method

Introduction: Classifying whether concepts in an unstructured clinical text are negated is an important unsolved task. New domain adaptation and transfer learning methods can potentially address this issue. Objective: We examine neural unsupervised domain adaptation methods, introducing a novel combination of domain adaptation with transformer-based transfer learning methods to improve negation detection. We also want to better understand the interaction between the widely used bidirectional encoder representations from transformers (BERT) system and domain adaptation methods. Materials and Methods: We use 4 clinical text datasets that are annotated with negation status. We evaluate a neural unsupervised domain adaptation algorithm and BERT, a transformer-based model that is pretrained on massive general text datasets. We develop an extension to BERT that uses domain adversarial training, a neural domain adaptation method that adds an objective to the negation task: that the classifier should not be able to distinguish between instances from 2 different domains. Results: The domain adaptation methods we describe show positive results, but, on average, the best performance is obtained by plain BERT (without the extension). We provide evidence that the gains from BERT are likely not additive with the gains from domain adaptation. Discussion: Our results suggest that, at least for the task of clinical negation detection, BERT subsumes domain adaptation, implying that BERT is already learning very general representations of negation phenomena such that fine-tuning even on a specific corpus does not lead to much overfitting. Conclusion: Despite being trained on nonclinical text, the large training sets of models like BERT lead to large gains in performance for the clinical negation detection task.

FeatureTransfer: Unsupervised Domain Adaptation for Cross-Domain Deepfake Detection

Security and Communication Networks ◽

10.1155/2021/9942754 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Baoying Chen ◽

Shunquan Tan

Keyword(s):

Large Scale ◽

Detection Method ◽

Domain Adaptation ◽

Third Party ◽

Detection Methods ◽

Target Domain ◽

Feature Vectors ◽

Unsupervised Domain Adaptation ◽

Cross Domain ◽

Overfitting Problem

Recently, various Deepfake detection methods have been proposed, and most of them are based on convolutional neural networks (CNNs). These detection methods suffer from overfitting on the source dataset and do not perform well on cross-domain datasets whose distributions differ from that of the source dataset. To address these limitations, a new method named FeatureTransfer is proposed in this paper, which is a two-stage Deepfake detection method combined with transfer learning. First, a CNN model pretrained on a third-party large-scale Deepfake dataset is used to extract more transferable feature vectors of Deepfake videos in the source and target domains. Second, these feature vectors are fed into the domain-adversarial neural network based on backpropagation (BP-DANN) for unsupervised domain adaptive training, where the videos in the source domain have real or fake labels, while the videos in the target domain are unlabelled. The experimental results indicate that the proposed FeatureTransfer method can effectively mitigate the overfitting problem in Deepfake detection and greatly improve the performance of cross-dataset evaluation.
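Domain-adversarial training by backpropagation is usually built around a gradient reversal layer, and the sketch below shows that mechanism in isolation. It is a framework-free illustration, not the FeatureTransfer implementation: the class name, the `lam` scaling factor and the list-based gradients are assumptions made for the example.

```python
class GradientReversal:
    """Identity in the forward pass; flips and scales gradients going back."""

    def __init__(self, lam=1.0):
        # lam controls how strongly the domain signal is reversed (assumed).
        self.lam = lam

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # The feature extractor receives the negated gradient, so it is
        # updated to make the two domains harder to tell apart.
        return [-self.lam * g for g in grad_from_domain_classifier]


layer = GradientReversal(lam=0.5)
passed_features = layer.forward([1.0, -2.0])
reversed_grad = layer.backward([0.2, -0.4])
```

In a full model the label classifier would be trained normally on labelled source features, while the domain classifier sits behind this layer and sees both source and target features.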

Adversarial Training Based Multi-Source Unsupervised Domain Adaptation for Sentiment Analysis

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6262 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7618-7625

Author(s):

Yong Dai ◽

Jian Liu ◽

Xiancong Ren ◽

Zenglin Xu

Keyword(s):

Sentiment Analysis ◽

Domain Adaptation ◽

State Of The Art ◽

Weak Assumption ◽

Target Domain ◽

Smoothness Assumption ◽

Unsupervised Domain Adaptation ◽

Good Target ◽

Adversarial Training ◽

Learning Frameworks

Multi-source unsupervised domain adaptation (MS-UDA) for sentiment analysis (SA) aims to leverage useful information in multiple source domains to help perform SA in an unlabeled target domain that has no supervised information. Existing MS-UDA algorithms either exploit only the shared features, i.e., the domain-invariant information, or are based on some weak assumption in NLP, e.g., the smoothness assumption. To avoid these problems, we propose two transfer learning frameworks based on the multi-source domain adaptation methodology for SA that combine the source hypotheses to derive a good target hypothesis. The first is a Weighting Scheme based Unsupervised Domain Adaptation framework (WS-UDA), which combines the source classifiers to acquire pseudo labels for target instances directly. The second is a Two-Stage Training based Unsupervised Domain Adaptation framework (2ST-UDA), which further exploits these pseudo labels to train a target private extractor. Importantly, the weights assigned to each source classifier are based on the relations between target instances and source domains, which are measured by a discriminator through adversarial training. Furthermore, through the same discriminator, we also separate shared features from private features. Experimental results on two SA datasets demonstrate the promising performance of our frameworks, which outperform unsupervised state-of-the-art competitors.
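The weighting scheme described above can be pictured as a weighted vote over the source classifiers. The following sketch assumes each source classifier outputs class probabilities for a target instance and that a per-source relevance weight is already available (in the paper these come from a discriminator; here they are simply given numbers). The function name and data layout are hypothetical.

```python
def pseudo_label(source_probs, weights):
    """Combine per-source class probabilities with normalized weights.

    source_probs: one probability vector per source classifier.
    weights: one non-negative relevance score per source domain (assumed given).
    Returns the predicted class index and the combined distribution.
    """
    total = sum(weights)
    num_classes = len(source_probs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, source_probs)) / total
        for c in range(num_classes)
    ]
    label = max(range(num_classes), key=combined.__getitem__)
    return label, combined


# Two source classifiers disagree on a target review (made-up numbers).
probs = [[0.9, 0.1], [0.2, 0.8]]
label_a, dist_a = pseudo_label(probs, weights=[3.0, 1.0])
label_b, dist_b = pseudo_label(probs, weights=[1.0, 3.0])
```

The same predictions yield different pseudo labels depending on which source domain is judged closer to the target instance, which is the point of weighting per instance rather than averaging uniformly.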

Triplet Loss Network for Unsupervised Domain Adaptation

Algorithms ◽

10.3390/a12050096 ◽

2019 ◽

Vol 12 (5) ◽

pp. 96 ◽

Cited By ~ 1

Author(s):

Imad Eddine Ibrahim Bekkouch ◽

Youssef Youssry ◽

Rustam Gafarov ◽

Adil Khan ◽

Asad Masood Khattak

Keyword(s):

Image Classification ◽

Domain Adaptation ◽

Fine Tuning ◽

Generative Adversarial Networks ◽

Target Domain ◽

Traffic Sign ◽

Linear Discriminant ◽

Unsupervised Domain Adaptation ◽

Sign Recognition ◽

Almost All

Domain adaptation is a sub-field of transfer learning that aims at bridging the dissimilarity gap between different domains by transferring and re-using the knowledge obtained in the source domain to the target domain. Many methods have been proposed to resolve this problem, using techniques such as generative adversarial networks (GAN), but the complexity of such methods makes it hard to use them in different problems, as fine-tuning such networks is usually a time-consuming task. In this paper, we propose a method for unsupervised domain adaptation that is both simple and effective. Our model (referred to as TripNet) harnesses the idea of a discriminator and Linear Discriminant Analysis (LDA) to push the encoder to generate domain-invariant features that are category-informative. At the same time, pseudo-labelling is used for the target data to train the classifier and to bring the same classes from both domains together. We evaluate TripNet against several existing, state-of-the-art methods on three image classification tasks: digit classification (MNIST, SVHN, and USPS datasets), object recognition (Office31 dataset), and traffic sign recognition (GTSRB and Synthetic Signs datasets). Our experimental results demonstrate that (i) TripNet beats almost all existing methods of similar simplicity on all of these tasks; and (ii) it even beats models that are significantly more complex (or harder to train) than TripNet in some cases. Hence, the results confirm the effectiveness of using TripNet for unsupervised domain adaptation in image classification.

Unsupervised Domain Adaptation via Structured Prediction Based Selective Pseudo-Labeling

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6091 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6243-6250 ◽

Cited By ~ 2

Author(s):

Qian Wang ◽

Toby Breckon

Keyword(s):

Domain Adaptation ◽

Feature Space ◽

Structured Prediction ◽

Target Domain ◽

Source Domain ◽

Domain Specific ◽

Unsupervised Domain Adaptation ◽

Deep Feature ◽

Significant Performance ◽

Error Accumulation

Unsupervised domain adaptation aims to address the problem of classifying unlabeled samples from the target domain when labeled samples are available only from the source domain and the data distributions differ between the two domains. As a result, classifiers trained on labeled samples in the source domain suffer a significant performance drop when directly applied to samples from the target domain. To address this issue, different approaches have been proposed to learn domain-invariant features or domain-specific classifiers. In either case, the lack of labeled samples in the target domain can be an issue, which is usually overcome by pseudo-labeling. Inaccurate pseudo-labeling, however, could result in catastrophic error accumulation during learning. In this paper, we propose a novel selective pseudo-labeling strategy based on structured prediction. The idea of structured prediction is inspired by the fact that samples in the target domain are well clustered within the deep feature space, so that unsupervised clustering analysis can be used to facilitate accurate pseudo-labeling. Experimental results on four datasets (i.e., Office-Caltech, Office31, ImageCLEF-DA and Office-Home) show that our approach outperforms contemporary state-of-the-art methods.
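A much simplified picture of selective pseudo-labeling is sketched below: each target sample is assigned to its nearest class centroid, and only the most confident fraction (smallest distance) is kept for the next training round. This is an assumption-laden stand-in for the paper's structured prediction step; the centroid source, the `keep_fraction` parameter and the function names are all hypothetical.

```python
import math


def nearest_centroid(sample, centroids):
    """Return (class index, distance) of the closest class centroid."""
    distances = [math.dist(sample, c) for c in centroids]
    label = min(range(len(centroids)), key=distances.__getitem__)
    return label, distances[label]


def select_pseudo_labels(targets, centroids, keep_fraction=0.5):
    """Pseudo-label all targets, keep only the closest `keep_fraction`."""
    scored = []
    for index, sample in enumerate(targets):
        label, distance = nearest_centroid(sample, centroids)
        scored.append((distance, index, label))
    scored.sort()  # most confident (smallest distance) first
    keep = max(1, int(len(scored) * keep_fraction))
    return {index: label for _, index, label in scored[:keep]}


# Two class centroids and four unlabeled target features (made-up numbers).
centroids = [[0.0, 0.0], [10.0, 10.0]]
targets = [[0.1, 0.0], [5.0, 5.5], [9.9, 10.0], [4.0, 4.0]]
selected = select_pseudo_labels(targets, centroids, keep_fraction=0.5)
```

Samples near the boundary between clusters are left unlabeled for now, which limits the error accumulation the abstract warns about; the kept fraction could be increased over successive rounds.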

Disjoint Label Space Transfer Learning with Common Factorised Space

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013288 ◽

2019 ◽

Vol 33 ◽

pp. 3288-3295 ◽

Cited By ~ 4

Author(s):

Xiaobin Chang ◽

Yongxin Yang ◽

Tao Xiang ◽

Timothy M. Hospedales

Keyword(s):

Transfer Learning ◽

Domain Adaptation ◽

Unified Approach ◽

Target Domain ◽

Single Model ◽

Unsupervised Domain Adaptation ◽

Common Representation ◽

Wide Range

In this paper, a unified approach to transfer learning is presented that addresses several source and target domain label-space and annotation assumptions with a single model. It is particularly effective in handling a challenging case, where source and target label-spaces are disjoint, and outperforms alternatives in both unsupervised and semi-supervised settings. The key ingredient is a common representation termed Common Factorised Space. It is shared between source and target domains, and trained with an unsupervised factorisation loss and a graph-based loss. With a wide range of experiments, we demonstrate the flexibility, relevance and efficacy of our method, both in the challenging cases with disjoint label spaces, and in the more conventional cases such as unsupervised domain adaptation, where the source and target domains share the same label-sets.
