Deep Field-Aware Interaction Machine for Click-Through Rate Prediction

Mobile Information Systems ◽

10.1155/2021/5575249 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Gaofeng Qi ◽

Ping Li

Keyword(s):

Field Experiments ◽

State Of The Art ◽

Feature Interaction ◽

Structure Form ◽

Feature Interactions ◽

Great Performance ◽

Field Information ◽

Factorization Machine ◽

Click Through Rate ◽

Modeling Feature

Modeling feature interactions is of crucial importance to predict click-through rate (CTR) in industrial recommender systems. Because of great performance and efficiency, the factorization machine (FM) has been a popular approach to learn feature interaction. Recently, several variants of FM are proposed to improve its performance, and they have proven the field information to play an important role. However, feature-length in a field is usually small; we observe that when there are multiple nonzero features within a field, the interaction between fields is not enough to represent the feature interaction between different fields due to the problem of short feature-length. In this work, we propose a novel neural CTR model named DeepFIM by introducing Field-aware Interaction Machine (FIM), which provides a layered structure form to describe intrafield and interfield feature interaction, to solve the short-expression problem caused by the short feature-length in the field. Experiments show that our model achieves comparable and even materially better results than the state-of-the-art methods.

Download Full-text

Attention-over-Attention Field-Aware Factorization Machine

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6101 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6323-6330

Author(s):

Zhibo Wang ◽

Jinxin Ma ◽

Yongquan Zhang ◽

Qian Wang ◽

Ju Ren ◽

...

Keyword(s):

State Of The Art ◽

Popular Approach ◽

Large Margin ◽

Feature Interactions ◽

Great Performance ◽

Benchmark Datasets ◽

Field Information ◽

Factorization Machine ◽

Click Through Rate ◽

Novel Algorithm

Factorization Machine (FM) has been a popular approach in supervised predictive tasks, such as click-through rate prediction and recommender systems, due to its great performance and efficiency. Recently, several variants of FM have been proposed to improve its performance. However, most of the state-of-the-art prediction algorithms neglected the field information of features, and they also failed to discriminate the importance of feature interactions due to the problem of redundant features. In this paper, we present a novel algorithm called Attention-over-Attention Field-aware Factorization Machine (AoAFFM) for better capturing the characteristics of feature interactions. Specifically, we propose the field-aware embedding layer to exploit the field information of features, and combine it with the attention-over-attention mechanism to learn both feature-level and interaction-level attention to estimate the weight of feature interactions. Experimental results show that the proposed AoAFFM improves FM and FFM with large margin, and outperforms state-of-the-art algorithms on three public benchmark datasets.

Download Full-text

Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/435 ◽

2017 ◽

Cited By ~ 141

Author(s):

Jun Xiao ◽

Hao Ye ◽

Xiangnan He ◽

Hanwang Zhang ◽

Fei Wu ◽

...

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Feature Interaction ◽

Model Parameters ◽

Learning Approach ◽

Attention Networks ◽

Feature Interactions ◽

Factorization Machine ◽

Real World Datasets ◽

Novel Model

Factorization Machines (FMs) are a supervised learning approach that enhances the linear regression model by incorporating the second-order feature interactions. Despite effectiveness, FM can be hindered by its modelling of all feature interactions with the same weight, as not all feature interactions are equally useful and predictive. For example, the interactions with useless features may even introduce noises and adversely degrade the performance. In this work, we improve FM by discriminating the importance of different feature interactions. We propose a novel model named Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from data via a neural attention network. Extensive experiments on two real-world datasets demonstrate the effectiveness of AFM. Empirically, it is shown on regression task AFM betters FM with a 8.6% relative improvement, and consistently outperforms the state-of-the-art deep learning methods Wide&Deep [Cheng et al., 2016] and DeepCross [Shan et al., 2016] with a much simpler structure and fewer model parameters. Our implementation of AFM is publicly available at: https://github.com/hexiangnan/attentional_factorization_machine

Download Full-text

XGBDeepFM for CTR Predictions in Mobile Advertising Benefits from Ad Context

Mathematical Problems in Engineering ◽

10.1155/2020/1747315 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Han An ◽

Jifan Ren

Keyword(s):

Weather Condition ◽

Feature Learning ◽

Gradient Boosting ◽

Feature Interaction ◽

Mobile Advertising ◽

Mobile Business ◽

Feature Interactions ◽

Extreme Gradient Boosting ◽

Factorization Machine ◽

Click Through Rate

The problem of click-through rate (CTR) prediction in mobile advertising is one of the most informative metrics used in mobile business activities, such as profit evaluation and resource management. In mobile advertising, CTR prediction is essential but challenging due to data sparsity. Moreover, existing methods often have difficulty in capturing the different orders of feature interactions simultaneously. In this study, a method was developed to obtain accurate CTR prediction by incorporating contextual features and feature interactions. We initially use extreme gradient boosting (XGBoost) as a feature engineering phase to select highly significant features. The selected features are mobile contextual attributes including time contextual, geography contextual, and other contextual attributes (e.g., weather condition) in actual mobile advertising situations. Our model, XGBoost deep factorization machine- (FM-) supported neutral network (XGBDeepFM), combines the power of XGBoost for feature selection, FM for two-order cross feature interaction, and the deep neural network for high-order feature learning in a united architecture. In a mobile advertising condition, our methods lead to significantly accurate CTR prediction in “wide and deep” type of model. In comparison with existing models, many experiments on commercial datasets show that the XGBDeepFM model has better value of area under curve and improves the effectiveness and efficiency of CTR prediction for mobile advertising.

Download Full-text

Interaction-Aware Factorization Machines for Recommender Systems

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013804 ◽

2019 ◽

Vol 33 ◽

pp. 3804-3811 ◽

Cited By ~ 2

Author(s):

Fuxing Hong ◽

Dongbo Huang ◽

Ge Chen

Keyword(s):

Neural Network ◽

Deep Learning ◽

Interaction Effect ◽

State Of The Art ◽

Feature Interaction ◽

Learning Approach ◽

Field Interaction ◽

Feature Interactions ◽

Factorization Machine ◽

Novel Model

Factorization Machine (FM) is a widely used supervised learning approach by effectively modeling of feature interactions. Despite the successful application of FM and its many deep learning variants, treating every feature interaction fairly may degrade the performance. For example, the interactions of a useless feature may introduce noises; the importance of a feature may also differ when interacting with different features. In this work, we propose a novel model named Interaction-aware Factorization Machine (IFM) by introducing Interaction-Aware Mechanism (IAM), which comprises the feature aspect and the field aspect, to learn flexible interactions on two levels. The feature aspect learns feature interaction importance via an attention network while the field aspect learns the feature interaction effect as a parametric similarity of the feature interaction vector and the corresponding field interaction prototype. IFM introduces more structured control and learns feature interaction importance in a stratified manner, which allows for more leverage in tweaking the interactions on both feature-wise and field-wise levels. Besides, we give a more generalized architecture and propose Interaction-aware Neural Network (INN) and DeepIFM to capture higher-order interactions. To further improve both the performance and efficiency of IFM, a sampling scheme is developed to select interactions based on the field aspect importance. The experimental results from two well-known datasets show the superiority of the proposed models over the state-of-the-art methods.

Download Full-text

Learning Feature Interactions with Lorentzian Factorization Machine

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6119 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6470-6477

Author(s):

Canran Xu ◽

Ming Wu

Keyword(s):

Deep Learning ◽

Hyperbolic Space ◽

Recommendation System ◽

Triangle Inequality ◽

State Of The Art ◽

Learning Methods ◽

New Model ◽

User Behaviors ◽

Feature Interactions ◽

Factorization Machine

Learning representations for feature interactions to model user behaviors is critical for recommendation system and click-trough rate (CTR) predictions. Recent advances in this area are empowered by deep learning methods which could learn sophisticated feature interactions and achieve the state-of-the-art result in an end-to-end manner. These approaches require large number of training parameters integrated with the low-level representations, and thus are memory and computational inefficient. In this paper, we propose a new model named “LorentzFM” that can learn feature interactions embedded in a hyperbolic space in which the violation of triangle inequality for Lorentz distances is available. To this end, the learned representation is benefited by the peculiar geometric properties of hyperbolic triangles, and result in a significant reduction in the number of parameters (20% to 80%) because all the top deep learning layers are not required. With such a lightweight architecture, LorentzFM achieves comparable and even materially better results than the deep learning methods such as DeepFM, xDeepFM and Deep & Cross in both recommendation and CTR prediction tasks.

Download Full-text

Deep context interaction network based on attention mechanism for click-through rate prediction

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210830 ◽

2021 ◽

pp. 1-16

Author(s):

Ling Yuan ◽

Zhuwen Pan ◽

Ping Sun ◽

Yinzhen Wei ◽

Haiping Yu

Keyword(s):

Online Advertising ◽

Dimensional Space ◽

Interaction Network ◽

High Order ◽

Attention Mechanism ◽

Feature Interaction ◽

Feature Interactions ◽

Low Dimensional ◽

Click Through Rate ◽

The Relationship

Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad, is a critical task in online advertising systems. The problem is very challenging since(1) an effective prediction relies on high-order combinatorial features, and(2)the relationship to auxiliary ads that may impact the CTR. In this paper, we propose Deep Context Interaction Network on Attention Mechanism(DCIN-Attention) to process feature interaction and context at the same time. The context includes other ads in the current search page, historically clicked and unclicked ads of the user. Specifically, we use the attention mechanism to learn the interactions between the target ad and each type of auxiliary ad. The residual network is used to model the feature interactions in the low-dimensional space, and with the multi-head self-attention neural network, high-order feature interactions can be modeled. Experimental results on Avito dataset show that DCIN outperform several existing methods for CTR prediction.

Download Full-text

Feature Interactions in Highly Configurable Systems: A Dynamic Analysis Approach with Varxplorer

10.5753/cbsoft_estendido.2020.14624 ◽

2020 ◽

Author(s):

Larissa Rocha Soares ◽

Eduardo Almeida ◽

Ivan Machado ◽

Christian Kästner

Keyword(s):

Dynamic Analysis ◽

State Of The Art ◽

Interaction Analysis ◽

Empirical Studies ◽

The State ◽

Feature Interaction ◽

Iterative Detection ◽

Iterative Approach ◽

Feature Interactions ◽

Configurable Systems

Highly-configurable systems provide significant reuse opportunities by tailoring system variants based on a set of features. Those features can interact in undesired ways which may result in faults. Thus, we propose VarXplorer, a dynamic and iterative approach to detect suspicious interactions. To evaluate whether VarXplorer helps improving the performance of identifying suspicious interactions, we performed two empirical studies. Our results shows that from the VarXplorer graphs, participants are able to identify suspicious interactions more than 3 times faster compared to the state-of-the-art tool. Additionally, the iterative detection process provides a more efficient feature interaction analysis, reducing the data developers needs to check to find problematic interactions.

Download Full-text

Latent Graph Predictor Factorization Machine (LGPFM) for modeling feature interactions weight

Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications ◽

10.1145/3419604.3419618 ◽

2020 ◽

Author(s):

Abdessamad Chanaa ◽

Nour-eddine El Faddouli

Keyword(s):

Feature Interactions ◽

Factorization Machine ◽

Modeling Feature

Download Full-text

InteractE: Improving Convolution-Based Knowledge Graph Embeddings by Increasing Feature Interactions

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i03.5694 ◽

2020 ◽

Vol 34 (03) ◽

pp. 3009-3016 ◽

Cited By ~ 3

Author(s):

Shikhar Vashishth ◽

Soumya Sanyal ◽

Vikram Nitin ◽

Nilesh Agrawal ◽

Partha Talukdar

Keyword(s):

Link Prediction ◽

State Of The Art ◽

Source Code ◽

Prediction Performance ◽

Reproducible Research ◽

Feature Interaction ◽

Feature Interactions ◽

Central Hypothesis ◽

Low Dimensional ◽

Better Than

Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to capture rich interactions between their components. However, the number of interactions that ConvE can capture is limited. In this paper, we analyze how increasing the number of these interactions affects link prediction performance, and utilize our observations to propose InteractE. InteractE is based on three key ideas – feature permutation, a novel feature reshaping, and circular convolution. Through extensive experiments, we find that InteractE outperforms state-of-the-art convolutional link prediction baselines on FB15k-237. Further, InteractE achieves an MRR score that is 9%, 7.5%, and 23% better than ConvE on the FB15k-237, WN18RR and YAGO3-10 datasets respectively. The results validate our central hypothesis – that increasing feature interaction is beneficial to link prediction performance. We make the source code of InteractE available to encourage reproducible research.

Download Full-text

sDeepFM: Multi-Scale Stacking Feature Interactions for Click-Through Rate Prediction

Electronics ◽

10.3390/electronics9020350 ◽

2020 ◽

Vol 9 (2) ◽

pp. 350

Author(s):

Baohua Qiang ◽

Yongquan Lu ◽

Minghao Yang ◽

Xianjun Chen ◽

Jinlong Chen ◽

...

Keyword(s):

State Of The Art ◽

Receptive Fields ◽

Area Under The Curve ◽

Sparse Data ◽

High Order ◽

Multi Scale ◽

Feature Interactions ◽

Novel Structure ◽

Real World Datasets ◽

Click Through Rate

For estimating the click-through rate of advertisements, there are some problems in that the features cannot be automatically constructed, or the features built are relatively simple, or the high-order combination features are difficult to learn under sparse data. To solve these problems, we propose a novel structure multi-scale stacking pooling (MSSP) to construct multi-scale features based on different receptive fields. The structure stacks multi-scale features bi-directionally from the angles of depth and width by constructing multiple observers with different angles and different fields of view, ensuring the diversity of extracted features. Furthermore, by learning the parameters through factorization, the structure can ensure high-order features being effectively learned in sparse data. We further combine the MSSP with the classical deep neural network (DNN) to form a unified model named sDeepFM. Experimental results on two real-world datasets show that the sDeepFM outperforms state-of-the-art models with respect to area under the curve (AUC) and log loss.

Download Full-text