Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry, and Fusion

Author(s):  
Yang Wang

With the development of web technology, multi-modal or multi-view data has surged as a major stream of big data, where each modality/view encodes an individual property of the data objects. Often, different modalities are complementary to each other. This fact has motivated a great deal of research on fusing multi-modal feature spaces to characterize data objects comprehensively. Most existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver performance superior to that of their single-modal counterparts. Recently, deep neural networks have proven to be a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data, and the same naturally holds for multi-modal data. Substantial empirical studies have demonstrated the advantages of deep multi-modal methods, which can essentially deepen the fusion across multi-modal deep feature spaces. In this article, we provide a substantial overview of the existing state of the art in multi-modal data analytics, from shallow to deep spaces. Throughout this survey, we further indicate that the critical components of this field are collaboration, adversarial competition, and fusion over multi-modal spaces. Finally, we share our viewpoints regarding some future directions in this field.

2021 ◽  
Author(s):  
Yiu-ming Cheung ◽  
Zhikai Hu

Unsupervised cross-modal retrieval has received increasing attention recently because of the extreme difficulty of labeling the explosively growing volume of multimedia data. The core challenge is how to measure the similarities between multi-modal data without label information. In previous works, various distance metrics are selected for measuring the similarities and predicting whether samples belong to the same class. However, these predictions are not always right, and even a few wrong predictions can undermine the final retrieval performance. To address this problem, in this paper, we categorize predictions as solid and soft ones based on their confidence. We further categorize samples as solid and soft ones based on the predictions. We propose that these two kinds of predictions and samples should be treated differently. Besides, we find that the absolute values of similarities can represent not only the similarity but also the confidence of the predictions. Thus, we first design an elegant dot product fusion strategy to obtain effective inter-modal similarities. Subsequently, utilizing these similarities, we propose a generalized and flexible weighted loss function where larger weights are assigned to solid samples to increase the retrieval performance, and smaller weights are assigned to soft samples to decrease the disturbance of wrong predictions. Although less information is used, empirical studies show that the proposed approach achieves state-of-the-art retrieval performance.
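The weighting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' loss function: the threshold, the `soft_scale` factor, the squared-error form, and all function names are assumptions chosen for illustration. It only shows how the absolute value of an estimated similarity can act as a confidence that separates solid from soft samples and scales their contribution to a loss.

```python
def sample_weight(similarity, threshold=0.5, soft_scale=0.1):
    """Return a loss weight from the confidence |similarity|.

    Assumption: a sample is 'solid' when |similarity| >= threshold and
    keeps full weight; otherwise it is 'soft' and is down-weighted.
    """
    confidence = abs(similarity)
    if confidence >= threshold:
        return 1.0
    return soft_scale * confidence


def weighted_squared_loss(predicted, estimated, threshold=0.5, soft_scale=0.1):
    """Weighted mean squared error between two lists of similarities.

    predicted: similarities computed from the learned representation.
    estimated: label-free similarity estimates used as noisy targets.
    """
    total = 0.0
    weight_sum = 0.0
    for p, t in zip(predicted, estimated):
        w = sample_weight(t, threshold, soft_scale)
        total += w * (p - t) ** 2
        weight_sum += w
    # Guard against an empty batch or all-zero weights.
    return total / weight_sum if weight_sum else 0.0


# Hypothetical usage: the second pair has a low-confidence target (0.2),
# so its error contributes far less than the first pair's error.
loss = weighted_squared_loss([0.5, 0.0], [0.9, 0.2])
```

With these assumed settings, a wrong low-confidence estimate disturbs the loss much less than a high-confidence one, which is the behavior the abstract attributes to its weighted loss.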


2021 ◽  
Vol 16 (1) ◽  
pp. 1-24
Author(s):  
Yaojin Lin ◽  
Qinghua Hu ◽  
Jinghua Liu ◽  
Xingquan Zhu ◽  
Xindong Wu

In multi-label learning, label correlations commonly exist in the data. Such correlation not only provides useful information, but also imposes significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to explore label-specific features from the training data and to use features highly customized to the multi-label set for learning. While such feature embedding methods have demonstrated good performance, the feature embedding space is created based on a single label only, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, multi-label-specific feature space ensemble (MULFE), takes into consideration label-specific features, label correlation, and the weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label’s negative and positive instances, MULFE first creates features customized to each label. After that, MULFE utilizes the label correlation to optimize the margin distribution of the base classifiers, which are induced by the related label-specific feature spaces. By combining multiple label-specific features, label correlation based weighting, and ensemble learning, MULFE achieves the goal of maximum margin multi-label classification through the underlying optimization framework. Empirical studies on 10 public data sets demonstrate the effectiveness of MULFE.
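The first step mentioned in this abstract, building features customized to one label from its positive and negative instances, can be sketched as follows. This is a deliberately reduced illustration and not MULFE itself: it assumes a single cluster per side (the centroid), whereas the abstract speaks of clustering analysis in general, and it omits label correlation, margin optimization, and the ensemble. The function names are hypothetical.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def label_specific_features(X, y):
    """Map each instance to distances from the label's two centroids.

    X: list of feature vectors; y: list of 0/1 flags for ONE label.
    Assumption: the label has at least one positive and one negative
    instance, and each side is summarized by a single centroid.
    Returns [distance to positive centroid, distance to negative centroid]
    for every instance in X.
    """
    positives = [x for x, flag in zip(X, y) if flag == 1]
    negatives = [x for x, flag in zip(X, y) if flag == 0]
    centers = [centroid(positives), centroid(negatives)]
    return [[math.dist(x, c) for c in centers] for x in X]


# Hypothetical toy data: two positives on the left, two negatives on the right.
X = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
y = [1, 1, 0, 0]
features = label_specific_features(X, y)
```

Repeating this per label yields one feature space per label; how those spaces are then weighted and combined is the part of the method this sketch does not attempt.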


Author(s):  
Xiao Wang ◽  
Ziwei Zhang ◽  
Jing Wang ◽  
Peng Cui ◽  
Shiqiang Yang

Trust prediction, which aims to predict the trust relations between users in a social network, is key to helping users discover reliable information. Many trust prediction methods have been proposed based on the low-rank assumption of a trust network. However, one typical property of a trust network is that the trust relations follow a power-law distribution, i.e., a few users are trusted by many other users, while most tail users have few trustors. Due to these tail users, the fundamental low-rank assumption made by existing methods is seriously violated and becomes unrealistic. In this paper, we propose a simple yet effective method to address the problem of the violated low-rank assumption. Instead of discovering the low-rank component of the trust network alone, we simultaneously learn a sparse component of the trust network to describe the tail users. With both the learned low-rank and sparse components, the trust relations in the whole network can be better captured. Moreover, the transitive closure structure of the trust relations is also integrated into our model. We then derive an effective iterative algorithm to infer the parameters of our model, along with a proof of correctness. Extensive experimental results on real-world trust networks demonstrate the superior performance of our proposed method over the state of the art.
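The low-rank plus sparse idea in this abstract can be sketched with a generic alternating scheme. This is not the authors' algorithm: it assumes a rank-1 low-rank part obtained by power iteration, a sparse part obtained by soft thresholding, and it leaves out the transitive closure structure entirely. The parameter `lam` and all function names are hypothetical.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def rank_one_approx(M, iters=50):
    """Approximate the best rank-1 fit of M by power iteration.

    Starts from an all-ones vector, which is an assumption of this sketch
    and can fail if that vector is orthogonal to the dominant direction.
    """
    n_rows, n_cols = len(M), len(M[0])
    v = [1.0] * n_cols
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm_u = sum(x * x for x in u) ** 0.5
        if norm_u == 0.0:
            return [[0.0] * n_cols for _ in range(n_rows)]
        u = [x / norm_u for x in u]
        v = [sum(M[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
    return [[u[i] * v[j] for j in range(n_cols)] for i in range(n_rows)]


def low_rank_plus_sparse(T, lam=0.3, outer_iters=20):
    """Alternately fit T ~ L + S with L rank-1 and S sparse."""
    n_rows, n_cols = len(T), len(T[0])
    S = [[0.0] * n_cols for _ in range(n_rows)]
    L = [[0.0] * n_cols for _ in range(n_rows)]
    for _ in range(outer_iters):
        residual = [[T[i][j] - S[i][j] for j in range(n_cols)] for i in range(n_rows)]
        L = rank_one_approx(residual)
        S = [[soft_threshold(T[i][j] - L[i][j], lam) for j in range(n_cols)]
             for i in range(n_rows)]
    return L, S


# Hypothetical toy trust matrix: everyone trusts users 0 and 1,
# plus one isolated relation (row 3, column 3) that breaks the pattern.
T = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 1.0]]
L, S = low_rank_plus_sparse(T)
```

By construction, every entry of `T - L - S` has magnitude at most `lam`, so whatever the rank-1 part cannot explain is either absorbed by the sparse part or left as a small residual.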


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Federica De Santis ◽  
Giuseppe D’Onza

Purpose: This study aims to analyze the utilization of big data and data analytics (BDA) in financial auditing, focusing on the process of producing legitimacy around these techniques, the factors fostering or hindering this process, and the actions auditors take to legitimate BDA inside and outside the audit community. Design/methodology/approach: The analysis is based on semi-structured interviews with partners and senior managers of Italian audit companies. Findings: The BDA legitimation process is more advanced in the audit professional environment than outside the audit community. The Big Four lead the BDA-driven audit innovation process, and BDA is used to complement traditional audit procedures. Outside the audit community, the digital maturity of audit clients, the lack of audit standards, and the audit oversight authority’s negative view prevent the full legitimation of BDA. Practical implications: This research highlights factors influencing the utilization of BDA to enhance audit quality. The results can thus be used to enhance the audit strategy and to innovate audit practices by using BDA as a source of adequate audit evidence. Audit regulators and standard setters can also use the results to revise the current auditing standards and guidance. Originality/value: This study adds to the literature on digital transformation in auditing by analyzing the legitimation process of a new audit technique. The paper answers the call for more empirical studies on the utilization of BDA in financial auditing by analyzing the application of such techniques in an unexplored operational setting in which auditees are mainly medium-sized enterprises and family-run businesses.


Author(s):  
Bo Yang

In recent years, multimedia applications have spread rapidly through the daily life of computer users, partly due to the exponential growth of the Internet (Yang & Hurson, 2006). The integration of wireless communication, pervasive computing, and ubiquitous data processing with multimedia database systems has enabled the connection and fusion of distributed multimedia data sources. In addition, emerging applications, such as smart classroom, digital library, habitat/environment surveillance, traffic monitoring, and battlefield sensing, have provided increasing motivation for conducting research on multimedia content representation, data delivery and dissemination, data fusion and analysis, and content-based retrieval. Consequently, research on multimedia technologies is of increasing importance in the computing community. In contrast with traditional text-based systems, multimedia applications usually incorporate much more powerful descriptions of human thought: video, audio, and images (Karpouzis, Raouzaiou, Tzouveli, Iaonnou, & Kollias, 2003; Liu, Bao, Yu, & Xu, 2005; Yang & Hurson, 2005). Moreover, the large collections of data in multimedia systems make it possible to support more complex data operations such as imprecise queries or content-based retrieval. For instance, an image database system may accept an example picture and return the images most similar to it (Cox, Miller, & Minka, 2000; Hsu, Chua, & Pung, 2000; Huang, Chang, & Huang, 2003). However, the conveniences of multimedia applications come with challenges to the existing data management schemes:
• Efficiency: Multimedia applications generally require more resources; however, the storage space and processing power are limited in many practical systems, for example, mobile devices and wireless networks (Yang & Hurson, 2005). Due to the large data volume and complicated operations of multimedia applications, new methods are needed to facilitate efficient representation, retrieval, and processing of multimedia data while considering the technical constraints.
• Semantic Gap: There is a gap between user perception of multimedia entities and the physical representation/access mechanism of multimedia data. Users often browse and desire to access multimedia data at the object level (“entities” such as human beings, animals, or buildings). However, the existing multimedia retrieval systems tend to access multimedia data based on their lower-level features (“characteristics” such as color patterns and textures), with little regard to combining these features into data objects. This representation gap often leads to higher processing cost and unexpected retrieval results. Representing multimedia data from a human perspective is one focus of recent research; however, few existing systems provide automated identification or classification of objects from general multimedia collections.
• Heterogeneity: The collections of multimedia data are often diverse and poorly indexed. In a distributed environment, because of the autonomy and heterogeneity of data sources, multimedia data objects are often represented in heterogeneous formats. The difference in data formats makes it even harder to incorporate multimedia data objects under a unique indexing framework.
• Semantic Unawareness: The present research on content-based multimedia retrieval is based on feature vectors: features are extracted from audio/video streams or image pixels, empirically or heuristically, and combined into vectors according to the application criteria. Because of the application-specific multimedia formats, the feature-based paradigm lacks scalability and accuracy.


Processes ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. 1215 ◽  
Author(s):  
Varun Gupta ◽  
Jose Maria Fernandez-Crehuet ◽  
Thomas Hanne

[Context] Freelancers could catalyze the software development process by providing their niche skills to generate high-quality outputs. They could help companies (including startups) foster innovation by suggesting creative ideas and providing their expertise in implementing them (for instance, designing solutions, coding solutions, etc.). Freelancers could work effectively and efficiently as virtual members of the software development team. The company must make informed decisions about which task to allot to the freelancer, which freelancer to select, how to price the task, and how to evaluate the submitted work. The freelancer, in turn, should make informed decisions about the monetary value to charge for the task, whether to trust the requester, the skills the task requires (finding matches between the skills required and the skills possessed), which task to select, and how to maintain the highest level of reputation. However, the literature does not provide freelancers and companies with guidelines that support their decision making. Yet if freelancers are selected carefully for the most suitable tasks, companies will benefit considerably in terms of improved software development metrics.
[Objectives] The objective of this paper is to present to the research community the research trends in freelancer-supported software development. This helps to understand which software development areas have higher concentrations of research effort, which areas have empirical evidence to support management decision making, and which areas require research attention.
[Method] The systematic study is conducted by planning the mapping protocol, executing the protocol, and reporting the findings using visualization tools such as bar charts and pie charts. The search process was planned to be executed using a set of inclusion and exclusion conditions on four bibliographic databases (IEEExplore, Springerlink, Sciencedirect, and ACM digital library). The relevant papers are selected by applying the inclusion and exclusion conditions. The Google citations of the relevant papers are then subjected to the inclusion and exclusion conditions again to include further relevant papers. Finally, the systematic schema was created and populated after analyzing the studies' abstracts.
[Results] The results indicate the following: (a) The research focus is on generic software development (78%) rather than on individual life cycle activities. (b) The number of empirical studies is limited (25%). (c) Studies that propose solutions and evaluate them on live cases in industrial settings are missing from the literature, in contrast to validation approaches (72%), i.e., solutions tested in laboratory settings. (d) At present, the literature has limited ability to provide software companies (including startups) with guidelines (in the form of opinions and experience reports) for involving freelancers in the software development process. (e) The reported challenges include Collaboration and Coordination (33%), Developer Recommendation (or selection) (19%), Team Formulation (14%), Task Recommendation (allocation) (14%), Task Decomposition (11%), Privacy and Security (Confidentiality) (11%), Budget Estimation (8%), Recognition (8%), Trust Issues (8%), Market Dynamism (6%), Intellectual Property Issues (6%), Participation of Crowd Worker (6%), and Capacity Utilization (3%). These challenges are highly interactive, and each challenge impacts all other challenges. (f) The recent focus of researchers (7 studies in total in 2019) is on generic software development, addressing collaboration and coordination (3 studies out of 7), developer recommendation (2 studies out of 7), and task recommendation (2 studies out of 7).
[Conclusion] The freelancer-driven software engineering research area has attracted the attention of researchers, but it will take a long time to mature. This is an urgent call for more empirical studies and evaluation-based solution research that could help companies (including startups) foster innovation. Further, the research focus should be well distributed among the various development phases to address the unique challenges associated with individual activities. Accurate management of freelancers in software development could help companies and startups foster innovation and remain competitive in the marketplace.


Author(s):  
Riaz Ahmed ◽  
Noor Azmi bin Mohamad

The literature reveals considerable confusion within project management regarding the use of terminology and differing interpretations of leadership competencies and leadership styles. In the project management literature, many empirical studies have examined the influence of leadership competencies or styles, yet a substantial review study has rarely been conducted to differentiate between the two. This study aims to differentiate between leadership competencies and leadership styles in the project management literature. It found that terms for the project manager's leadership, including competence, competency, competencies, and styles, have frequently been used in the project management literature. Furthermore, the literature has been synthesized to provide more familiarity with and understanding of leadership competencies and styles. Findings indicate that leadership competencies and styles are two different things, although a few characteristics are common to both. Furthermore, leadership competencies are more suitable for task-oriented activities, and leadership styles are more appropriate for relationship-oriented factors. This study has implications for future research to identify differences between project managers' average and superior performance through comparison of leadership competencies and styles.


Author(s):  
Raghuram Mandyam Annasamy ◽  
Katia Sycara

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical and empirical studies on understanding what these networks seem to learn lag far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.

