Attentional Colorization Networks with Adaptive Group-Instance Normalization

Information ◽  
2020 ◽  
Vol 11 (10) ◽  
pp. 479
Author(s):  
Yuzhen Gao ◽  
Youdong Ding ◽  
Fei Wang ◽  
Huan Liang

We propose a novel end-to-end image colorization framework that integrates an attention mechanism and a learnable adaptive normalization function. In contrast to previous colorization methods that directly generate the whole image, we believe that the color of the significant areas determines the quality of the colorized image. The attention mechanism uses an attention map, obtained from an auxiliary classifier, to guide our framework toward more subtle content and more visually pleasing color in salient visual regions. Furthermore, treating colorization as a particular style transfer task, we apply an Adaptive Group Instance Normalization (AGIN) function so that our framework can flexibly generate vivid colorized images. Experiments show that our model is superior to previous state-of-the-art models in coloring foreground objects.
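
As a rough illustration of what a group-instance blend could look like, the sketch below mixes group-normalized and instance-normalized activations with a weight `rho`, then applies a per-channel scale and shift. The abstract does not give the AGIN formula, so the blend rule, the function name `agin_sketch`, and the parameters `rho`, `gamma`, `beta`, and `num_groups` are all assumptions made for illustration, not the authors' definition.

```python
import math


def _mean_var(values):
    """Return the mean and (biased) variance of a flat list of floats."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def agin_sketch(x, gamma, beta, rho, num_groups, eps=1e-5):
    """Hypothetical group/instance normalization blend (assumed formula).

    x          : list of C channels, each a flat list of spatial activations
    gamma/beta : per-channel scale and shift (assumed to be supplied externally)
    rho        : blend weight in [0, 1]; 1 = pure group norm, 0 = pure instance norm
    """
    channels = len(x)
    assert channels % num_groups == 0, "channels must divide evenly into groups"
    per_group = channels // num_groups

    out = []
    for ch in range(channels):
        # Instance-norm statistics: computed over this channel only.
        m_in, v_in = _mean_var(x[ch])

        # Group-norm statistics: computed over every channel in the same group.
        start = (ch // per_group) * per_group
        pooled = [v for k in range(start, start + per_group) for v in x[k]]
        m_gn, v_gn = _mean_var(pooled)

        row = []
        for v in x[ch]:
            x_in = (v - m_in) / math.sqrt(v_in + eps)
            x_gn = (v - m_gn) / math.sqrt(v_gn + eps)
            mixed = rho * x_gn + (1.0 - rho) * x_in  # assumed blend rule
            row.append(gamma[ch] * mixed + beta[ch])
        out.append(row)
    return out
```

Under this assumed formulation, `rho=0.0` reduces to plain instance normalization and `rho=1.0` to plain group normalization, so a learned `rho` would let a model move between the two.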

Author(s):  
Huan Vu ◽  
Samir Aknine ◽  
Sarvapali D. Ramchurn

Traffic congestion has a significant impact on quality of life and the economy. This paper presents a decentralised traffic management mechanism for intersections using a distributed constraint optimisation (DCOP) approach. Our solution outperforms the state-of-the-art solution in both stable traffic conditions (about 60% reduced waiting time) and robustness to unpredictable events.


2017 ◽  
Vol 2 (1) ◽  
pp. 299-316 ◽  
Author(s):  
Cristina Pérez-Benito ◽  
Samuel Morillas ◽  
Cristina Jordán ◽  
J. Alberto Conejero

It is still a challenge to improve the efficiency and effectiveness of image denoising and enhancement methods. There exist denoising and enhancement methods that are able to improve the visual quality of images. This is usually achieved by removing noise while sharpening details and improving edge contrast. Smoothing refers to the case of denoising when noise follows a Gaussian distribution. The two operations, smoothing noise and sharpening, are opposite in nature. Therefore, there are few approaches that simultaneously address both goals. We review these methods and also provide a detailed study of the state-of-the-art methods that address each problem separately in colour images.


2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Author(s):  
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted in either symmetrical or asymmetrical scopes. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.


Author(s):  
Muhammad Salman Raheel ◽  
Raad Raad

This chapter discusses the state of the art in dealing with the resource optimization problem for smooth delivery of video across a peer-to-peer (P2P) network. It further discusses the properties of different video coding techniques, such as Scalable Video Coding (SVC) and Multiple Description Coding (MDC), that can be used to overcome playback latency in multimedia streaming and maintain an adequate quality of service (QoS) among users. The problem can be summarized as follows: given that a video is requested by a peer in the network, what properties of SVC and MDC can be exploited to deliver the video with the highest quality, least upload bandwidth, and least delay from all participating peers? However, solving these problems is known to be NP-hard. Hence, this chapter presents the state of the art in approximation algorithms and techniques that have been proposed to overcome these issues.


2019 ◽  
Vol 9 (18) ◽  
pp. 3908 ◽  
Author(s):  
Jintae Kim ◽  
Shinhyeok Oh ◽  
Oh-Woog Kwon ◽  
Harksoo Kim

To generate proper responses to user queries, multi-turn chatbot models should selectively consider dialogue histories. However, previous chatbot models have simply concatenated or averaged vector representations of all previous utterances without considering contextual importance. To mitigate this problem, we propose a multi-turn chatbot model in which previous utterances participate in response generation using different weights. The proposed model calculates the contextual importance of previous utterances by using an attention mechanism. In addition, we propose a training method that uses two types of Wasserstein generative adversarial networks to improve the quality of responses. In experiments with the DailyDialog dataset, the proposed model outperformed the previous state-of-the-art models based on various performance measures.
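
The contrast between averaging all previous utterances and weighting them by contextual importance can be sketched in a few lines. This is a minimal illustration, not the authors' model: the dot-product scoring function and the names `softmax` and `attend` are assumptions, since the abstract does not say how attention scores are computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, utterances):
    """Weight utterance vectors by their (assumed) dot-product score with the query.

    query      : vector for the current turn
    utterances : vectors for the previous utterances, same dimension as query
    Returns the attention weights and the weighted context vector.
    """
    scores = [sum(q * u for q, u in zip(query, utt)) for utt in utterances]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * utt[i] for w, utt in zip(weights, utterances))
        for i in range(dim)
    ]
    return weights, context


# The first utterance aligns with the query, so it receives the largest weight.
history = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = attend([1.0, 0.0], history)
```

When every utterance scores the same, the weights become uniform and the context vector reduces to the plain average, which is the baseline behaviour the abstract describes for earlier models.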


Author(s):  
Mirko Luca Lobina ◽  
Luigi Atzori ◽  
Fabrizio Boi

IP Telephony provides a way for an enterprise to extend consistent communication services to all employees, whether they are in main campus locations, at branch offices, or working remotely, including with a mobile phone. IP Telephony transmits voice communications over a network using open, standards-based Internet protocols. This is both the strength and the weakness of IP Telephony, as the basic transport protocols involved (RTP, UDP, and IP) cannot natively guarantee the required application quality of service (QoS). From the point of view of an IP Telephony Service Provider, this means a possible loss of clients and money. Specifically, the problem arises at two different levels: i) in some countries, where long-distance and particularly international call tariffs are high, perhaps due to a lack of competition or to cross-subsidies to other services, the major opportunity for IP Telephony Service Providers is price arbitrage. This means working on the diffusion of an acceptable service, although not at high quality levels; ii) in other countries, where different IP Telephony Service Providers already exist, the problem is competition to offer the best possible quality. The main idea behind this chapter is to analyze the state-of-the-art playout control strategies with the following aims: i) present the reader with the technical state of the art in playout control management and planning strategies (an overview of basic KPIs for IP Telephony); ii) compare the strategies an IP Telephony Service Provider can choose with the aim of saving money and offering a better quality of service; iii) introduce the state-of-the-art quality index for IP Telephony, that is, a set of algorithms that take into account as many factors as possible to evaluate service quality; iv) provide the reader with examples of some economic scenarios of IP Telephony.


Author(s):  
Chu-Xiong Qin ◽  
Wen-Lin Zhang ◽  
Dan Qu

A method called joint connectionist temporal classification (CTC)-attention-based speech recognition has recently received increasing focus and has achieved impressive performance. A hybrid end-to-end architecture that adds an extra CTC loss to the attention-based model can impose extra restrictions on alignments. To better explore end-to-end models, we propose improvements to the feature extraction and attention mechanism. First, we introduce a joint model trained with nonnegative matrix factorization (NMF)-based high-level features. Then, we put forward a hybrid attention mechanism by incorporating multi-head attentions and calculating attention scores over multi-level outputs. Experiments on TIMIT indicate that the new method achieves state-of-the-art performance with our best model. Experiments on WSJ show that our method exhibits a word error rate (WER) that is only 0.2% worse in absolute value than the best referenced method, which is trained on a much larger dataset, and it beats all present end-to-end methods. Further experiments on LibriSpeech show that our method is also comparable to the state-of-the-art end-to-end system in WER.
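
For readers unfamiliar with the hybrid objective, joint CTC-attention training is commonly written as an interpolation of the two losses. The form below is the usual formulation from the literature on joint CTC-attention models, not something stated in this abstract; the symbol $\lambda$ and its range are assumptions.

```latex
\mathcal{L}_{\text{joint}}
  = \lambda \, \mathcal{L}_{\text{CTC}}
  + (1 - \lambda) \, \mathcal{L}_{\text{att}},
\qquad 0 \le \lambda \le 1
```

Here $\mathcal{L}_{\text{CTC}}$ is the CTC loss, $\mathcal{L}_{\text{att}}$ is the attention decoder's loss, and setting $\lambda = 0$ recovers a purely attention-based model.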


Author(s):  
Ziming Li ◽  
Julia Kiseleva ◽  
Maarten De Rijke

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate higher-quality responses and achieve higher overall performance than the state of the art.

