Learning Ordinal Embedding from Sets

Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 964
Author(s):  
Aïssatou Diallo ◽  
Johannes Fürnkranz

Ordinal embedding is the task of computing a meaningful multidimensional representation of objects, for which only qualitative constraints on their distance functions are known. In particular, we consider comparisons of the form “Which object from the pair (j, k) is more similar to object i?”. In this paper, we generalize this framework to the case where the ordinal constraints are not given at the level of individual points, but at the level of sets, and propose a distributional triplet embedding approach in a scalable learning framework. We show that the query complexity of our approach is on par with the single-item approach. Without having access to features of the items to be embedded, we show the applicability of our model on toy datasets for the task of reconstruction and demonstrate the validity of the obtained embeddings in experiments on synthetic and real-world datasets.
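
To make the triplet query concrete, the following is a minimal sketch of the standard single-item setting only, not the set-level distributional method the paper proposes. It assumes each triplet (i, j, k) means "j is more similar to i than k is" and fits points with a simple hinge loss on squared Euclidean distances. The function names `sq_dist`, `triplet_violations`, and `fit_ordinal_embedding`, as well as the margin, learning rate, and epoch count, are hypothetical choices made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_violations(emb, triplets):
    """Count triplets (i, j, k) for which j is NOT strictly closer to i than k."""
    return sum(
        1 for i, j, k in triplets
        if sq_dist(emb[i], emb[j]) >= sq_dist(emb[i], emb[k])
    )


def fit_ordinal_embedding(n, triplets, dim=2, margin=1.0, lr=0.05,
                          epochs=200, seed=0):
    """Fit n points in `dim` dimensions by SGD on a triplet hinge loss.

    Loss per triplet (assumed form): max(0, margin + d(i,j)^2 - d(i,k)^2).
    """
    rng = random.Random(seed)
    emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    for _ in range(epochs):
        for i, j, k in triplets:
            active = margin + sq_dist(emb[i], emb[j]) - sq_dist(emb[i], emb[k]) > 0
            if not active:
                continue  # constraint already satisfied with margin
            for d in range(dim):
                xi, xj, xk = emb[i][d], emb[j][d], emb[k][d]
                # Gradients of d(i,j)^2 - d(i,k)^2 w.r.t. each coordinate.
                emb[i][d] -= lr * 2 * (xk - xj)
                emb[j][d] -= lr * -2 * (xi - xj)
                emb[k][d] -= lr * 2 * (xi - xk)
    return emb


if __name__ == "__main__":
    # Toy constraints: object 1 is closer to 0 than 2 is, and 2 is closer to 3 than 0 is.
    toy_triplets = [(0, 1, 2), (3, 2, 0)]
    points = fit_ordinal_embedding(4, toy_triplets)
    print(points, triplet_violations(points, toy_triplets))
```

Only the ordering of distances is used here, never the distances themselves, which is what makes the task "ordinal": no item features or numeric similarities are required.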

Author(s):  
Hai-Feng Guo ◽  
Lixin Han ◽  
Shoubao Su ◽  
Zhou-Bao Sun

Multi-Instance Multi-Label (MIML) learning is a popular framework for supervised classification in which an example is described by multiple instances and associated with multiple labels. Previous MIML approaches have focused on predicting labels for instances. The idea for tackling the problem is to identify its equivalent in the traditional supervised learning framework. Motivated by recent advances in deep learning, in this paper we still consider the problem of predicting labels and attempt to incorporate deep learning into the MIML framework. The proposed approach enables us to train a deep convolutional neural network with images from social networks, where images are well labeled, even when labeled with several labels or with uncorrelated labels. Experiments on real-world datasets demonstrate the effectiveness of our proposed approach.


Author(s):  
Lu Zhang ◽  
Zhu Sun ◽  
Jie Zhang ◽  
Yu Lei ◽  
Chen Li ◽  
...  

Studies on next point-of-interest (POI) recommendation mainly seek to learn users' transition patterns with certain historical check-ins. However, in reality, users' movements are typically uncertain (i.e., fuzzy and incomplete), and most existing methods then suffer from the transition pattern vanishing issue. To ease this issue, we propose a novel interactive multi-task learning (iMTL) framework to better exploit the interplay between activity and location preference. Specifically, iMTL introduces: (1) a temporal-aware activity encoder equipped with fuzzy characterization over uncertain check-ins to unveil the latent activity transition patterns; (2) a spatial-aware location preference encoder to capture the latent location transition patterns; and (3) a task-specific decoder to make use of the learned latent transition patterns and enhance both activity and location prediction tasks in an interactive manner. Extensive experiments on three real-world datasets show the superiority of iMTL.


Author(s):  
Weiming Lu ◽  
Yangfan Zhou ◽  
Jiale Yu ◽  
Chenhao Jia

Prerequisite relations among concepts are crucial for educational applications. However, it is difficult to automatically extract domain-specific concepts and learn the prerequisite relations among them without labeled data. In this paper, we first extract high-quality phrases from a set of educational data and identify the domain-specific concepts with a graph-based ranking method. Then, we propose an iterative prerequisite relation learning framework, called iPRL, which combines a learning-based model and a recovery-based model to leverage both concept pair features and dependencies among learning materials. In experiments, we evaluated our approach on two real-world datasets, Textbook Dataset and MOOC Dataset, and validated that our approach can achieve better performance than existing methods. Finally, we also illustrate some examples of our approach.


2021 ◽  
Vol 21 (3) ◽  
pp. 1-17
Author(s):  
Wu Chen ◽  
Yong Yu ◽  
Keke Gai ◽  
Jiamou Liu ◽  
Kim-Kwang Raymond Choo

In existing ensemble learning algorithms (e.g., random forest), each base learner’s model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by focusing on the balancing of access restrictions (small sub-dataset) and accuracy enhancement. Specifically, network edge nodes (learners) are utilized to model classifications and predictions in our framework. Data is then distributed to multiple base learners who exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a training model rather than conventional centralized learning. Findings from the experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., significant reduction in computation costs).


Data ◽  
2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Ahmed Elmogy ◽  
Hamada Rizk ◽  
Amany M. Sarhan

In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to effectively find outliers on request and in a timely manner, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 680
Author(s):  
Hanyang Lin ◽  
Yongzhao Zhan ◽  
Zizheng Zhao ◽  
Yuzhong Chen ◽  
Chen Dong

There is a wealth of information in real-world social networks. In addition to the topology information, the vertices or edges of a social network often have attributes, with many of the overlapping vertices belonging to several communities simultaneously. It is challenging to fully utilize the additional attribute information to detect overlapping communities. In this paper, we first propose an overlapping community detection algorithm based on an augmented attribute graph. An improved weight adjustment strategy for attributes is embedded in the algorithm to help detect overlapping communities more accurately. Second, we enhance the algorithm to automatically determine the number of communities by a node-density-based fuzzy k-medoids process. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed algorithms can effectively detect overlapping communities with fewer parameters compared to the baseline methods.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-33
Author(s):  
Wenjun Jiang ◽  
Jing Chen ◽  
Xiaofei Ding ◽  
Jie Wu ◽  
Jiawei He ◽  
...  

In online systems, including e-commerce platforms, many users rely on the reviews or comments written by previous consumers when making decisions, although they have limited time to read many reviews. Therefore, a review summary that contains all the important features in user-generated reviews is expected. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be implemented by text summarization, which mainly comprises two types of approaches: extractive and abstractive. Both types can deal with supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter can avoid redundancy but usually can only deal with short sequences. Moreover, both approaches may neglect sentiment information. To address these issues, we propose comprehensive Review Summary Generation frameworks for the supervised and unsupervised scenarios. We design two different preprocessing models, re-ranking and selecting, to identify the important sentences while preserving the users’ sentiment in the original reviews. These sentences can then be used to generate review summaries with text summarization methods. Experimental results on seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets in Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-16
Author(s):  
Jibing Wu ◽  
Lianfei Yu ◽  
Qun Zhang ◽  
Peiteng Shi ◽  
Lihua Liu ◽  
...  

Heterogeneous information networks, which consist of multiple types of objects with various semantically rich links among them, are omnipresent in real-world applications. Community discovery is an effective method for extracting the hidden structures in networks. Usually, heterogeneous information networks are time-evolving: their objects and links are dynamic and vary gradually. In such time-evolving heterogeneous information networks, community discovery is a challenging topic and considerably more difficult than in traditional static homogeneous information networks. In contrast to communities in traditional approaches, which contain only one type of object and link, communities in heterogeneous information networks contain multiple types of dynamic objects and links. Recently, some studies have focused on dynamic heterogeneous information networks and achieved some satisfactory results. However, they assume that heterogeneous information networks usually follow some simple schemas, such as the bityped network and star network schemas. In this paper, we propose a multityped community discovery method for time-evolving heterogeneous information networks with general network schemas. A tensor decomposition framework, which integrates tensor CP factorization with a temporal evolution regularization term, is designed to model the multityped communities and address their evolution. Experimental results on both synthetic and real-world datasets demonstrate the efficiency of our framework.
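
As an illustrative sketch only, and not the formulation given in the paper, a CP factorization combined with a temporal smoothness regularizer is commonly written in the form below. All notation is assumed here: $\mathcal{X}_t$ is the network tensor at time step $t$, $\mathbf{A}^{(n)}_t$ is the factor matrix for mode (object type) $n$ at time $t$, $N$ is the number of modes, $T$ is the number of time steps, and $\lambda$ weights the penalty on change between consecutive steps.

```latex
\min_{\{\mathbf{A}^{(n)}_t\}} \;
\sum_{t=1}^{T}
  \Bigl\| \mathcal{X}_t -
  [\![ \mathbf{A}^{(1)}_t, \dots, \mathbf{A}^{(N)}_t ]\!] \Bigr\|_F^2
\;+\; \lambda \sum_{t=2}^{T} \sum_{n=1}^{N}
  \bigl\| \mathbf{A}^{(n)}_t - \mathbf{A}^{(n)}_{t-1} \bigr\|_F^2
```

The first term asks each time slice to be well approximated by a low-rank CP model, whose factor columns can be read as community memberships per object type; the second term discourages abrupt changes in those memberships from one time step to the next.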

