Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution

2011 ◽  
Vol 40 ◽  
pp. 469-521 ◽  
Author(s):  
A. Rahman ◽  
V. Ng

Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers to competing approaches as well as the effectiveness of our two extensions.
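The cluster-ranking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the scorer `toy_score`, the fixed `null_score` threshold (standing in for a learned "start a new entity" option that folds anaphoricity into the ranking), and all names below are hypothetical.

```python
from typing import Callable, List

Mention = str
Cluster = List[Mention]


def toy_score(mention: Mention, cluster: Cluster) -> float:
    """Hypothetical scorer: fraction of cluster mentions sharing the last word."""
    head = mention.lower().split()[-1]
    matches = sum(1 for m in cluster if m.lower().split()[-1] == head)
    return matches / len(cluster)


def cluster_rank(
    mentions: List[Mention],
    score: Callable[[Mention, Cluster], float] = toy_score,
    null_score: float = 0.5,
) -> List[Cluster]:
    """Process mentions left to right, ranking all preceding clusters.

    Each mention joins the highest-scoring preceding cluster, unless no
    cluster beats `null_score`, in which case it starts a new cluster
    (i.e. it is treated as non-anaphoric).
    """
    clusters: List[Cluster] = []
    for mention in mentions:
        best, best_score = None, null_score
        for cluster in clusters:
            s = score(mention, cluster)
            if s > best_score:
                best, best_score = cluster, s
        if best is None:
            clusters.append([mention])
        else:
            best.append(mention)
    return clusters


example = cluster_rank(["Barack Obama", "the senate", "Obama", "the Senate"])
```

Ranking whole clusters, rather than classifying mention pairs one at a time, lets every candidate compete directly, and the scorer can look at all mentions already in a cluster.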

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU consumption. This issue has been overlooked in previous works. To overcome the mentioned problem, in the present work, we used Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we have used a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.
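The step of estimating the frequency needed to meet a deadline can be sketched as follows. This is a simplified illustration and not the DV-DVFS method itself: it assumes a CPU-bound job whose run time is roughly the estimated cycle count divided by the clock frequency, and the function and parameter names are hypothetical.

```python
from typing import Iterable, Optional


def pick_frequency(
    estimated_cycles: float,
    deadline_s: float,
    available_hz: Iterable[float],
) -> Optional[float]:
    """Return the lowest available frequency that still meets the deadline.

    Assumption: run time ~= estimated_cycles / frequency. Lower frequencies
    generally draw less power, so the slowest feasible setting is chosen.
    Returns None when even the highest frequency misses the deadline.
    """
    for freq in sorted(available_hz):
        if estimated_cycles / freq <= deadline_s:
            return freq
    return None


levels = [1.0e9, 1.5e9, 2.0e9, 2.5e9]
chosen = pick_frequency(3.0e9, 2.0, levels)
```

In practice the relation between frequency and run time is not perfectly linear (memory and I/O stalls do not scale with the clock), so a real estimator would need per-application profiling.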


2013 ◽  
Vol 380-384 ◽  
pp. 2811-2816
Author(s):  
Kai Lei ◽  
Yi Fan Zeng

Query-oriented multi-document summarization (QMDS) attempts to generate a concise piece of text by extracting sentences from a target document collection, with the aim of not only conveying the key content of that corpus but also satisfying the information needs expressed by the query. Due to its great practical value, QMDS has been intensively studied in recent decades. Three properties are considered crucial for a good summary: relevance, prestige, and low redundancy (or so-called diversity). Unfortunately, most existing work either disregards diversity or handles it with non-optimized heuristics, usually based on greedy sentence selection. Inspired by the manifold-ranking process, which deals with query-biased prestige, and the DivRank algorithm, which captures query-independent diversity ranking, we propose a novel biased diversity ranking model, named ManifoldDivRank, for query-sensitive summarization tasks. The top-ranked sentences discovered by our algorithm not only enjoy high query-oriented prestige but, more importantly, are dissimilar to each other. Experimental results on the DUC2005 and DUC2006 benchmark data sets demonstrate the effectiveness of our proposal.
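For readers unfamiliar with the manifold-ranking process mentioned above, here is a minimal sketch of the plain manifold-ranking iteration only. It does not implement ManifoldDivRank or DivRank; the toy graph, the damping value `alpha`, and the function name are assumptions made for illustration.

```python
import math
from typing import List


def manifold_rank(
    W: List[List[float]],
    y: List[float],
    alpha: float = 0.85,
    iters: int = 100,
) -> List[float]:
    """Iterate f <- alpha * S f + (1 - alpha) * y.

    W is a symmetric, non-negative affinity matrix with a zero diagonal
    (e.g. sentence similarities); y marks the query node(s). S is the
    symmetrically normalized matrix D^{-1/2} W D^{-1/2}.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    S = [
        [W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    f = list(y)
    for _ in range(iters):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy chain graph: node 0 is the query, nodes 1..3 are sentences that are
# progressively further away from it.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
scores = manifold_rank(W, [1.0, 0.0, 0.0, 0.0])
```

On this chain the scores fall off with distance from the query node, which is the query-biased prestige effect; nothing in this step penalizes redundancy between highly ranked sentences.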


2020 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU consumption. This issue has been overlooked in previous work. To overcome the mentioned problem, in the present work, we used Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we have used a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


2015 ◽  
Vol 24 (03) ◽  
pp. 1550003 ◽  
Author(s):  
Armin Daneshpazhouh ◽  
Ashkan Sami

The task of semi-supervised outlier detection is to find instances that deviate markedly from the rest of the data, using some labeled examples. This task is especially important in applications such as fraud detection and intrusion detection. Most existing techniques are unsupervised. Semi-supervised approaches, on the other hand, use both negative and positive instances to detect outliers. However, in many real-world applications, very few positive labeled examples are available. This paper proposes an innovative approach to address this problem. The proposed method works as follows. First, some reliable negative instances are extracted by a kNN-based algorithm. Afterwards, fuzzy clustering using both negative and positive examples is utilized to detect outliers. Experimental results on real data sets demonstrate that the proposed approach outperforms the previous unsupervised state-of-the-art methods in detecting outliers.


Author(s):  
Mohammed Sarhan Al Duais ◽  
Fatma Susilawati Mohamad

The main problems of the batch back propagation (BBP) algorithm are slow training and the need to adjust several parameters, such as the learning rate, manually. In addition, the BBP algorithm suffers from training saturation. The objective of this study is to speed up the training of the BBP algorithm and to remove training saturation. The training rate is the most significant parameter for increasing the efficiency of the BBP. In this study, a new dynamic training rate is created to speed up the training of the BBP algorithm. The dynamic batch back propagation (DBBPLR) algorithm is presented, which trains with a dynamic training rate. This technique was implemented with a sigmoid function. Several data sets were used as benchmarks for testing the effects of the dynamic training rate. All the experiments were performed in Matlab. The experimental results show that the DBBPLR algorithm trains faster and with higher accuracy than the BBP algorithm and existing works.


2020 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU consumption. In this paper, we used Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider a deadline as our constraint, and before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. We have used a set of data sets and applications in the evaluation phase. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


2008 ◽  
Vol 34 (3) ◽  
pp. 327-356 ◽  
Author(s):  
Xiaofeng Yang ◽  
Jian Su ◽  
Chew Lim Tan

The traditional single-candidate learning model for anaphora resolution considers the antecedent candidates of an anaphor in isolation, and thus cannot effectively capture the preference relationships between competing candidates for its learning and resolution. To deal with this problem, we propose a twin-candidate model for anaphora resolution. The main idea behind the model is to recast anaphora resolution as a preference classification problem. Specifically, the model learns a classifier that determines the preference between competing candidates, and, during resolution, chooses the antecedent of a given anaphor based on the ranking of the candidates. We present in detail the framework of the twin-candidate model for anaphora resolution. Further, we explore how to deploy the model in the more complicated coreference resolution task. We evaluate the twin-candidate model in different domains using the Automatic Content Extraction data sets. The experimental results indicate that our twin-candidate model is superior to the single-candidate model for the task of pronominal anaphora resolution. For the task of coreference resolution, it also performs equally well, or better.
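The pairwise-preference idea behind the twin-candidate model can be sketched with a round-robin comparison. This is an illustrative sketch and not the authors' system: the hand-written rule `toy_prefers_first` stands in for the learned preference classifier, and the `Candidate` fields and function names are hypothetical.

```python
from itertools import combinations
from typing import Callable, NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    text: str
    gender: str    # "m", "f", or "n"
    distance: int  # sentences between the candidate and the anaphor


def toy_prefers_first(anaphor_gender: str, a: Candidate, b: Candidate) -> bool:
    """Stand-in for a learned classifier: True if `a` is preferred over `b`."""
    a_ok = a.gender == anaphor_gender
    b_ok = b.gender == anaphor_gender
    if a_ok != b_ok:
        return a_ok                    # prefer the gender-compatible candidate
    return a.distance <= b.distance    # otherwise prefer the closer one


def resolve_round_robin(
    anaphor_gender: str,
    candidates: Sequence[Candidate],
    prefers_first: Callable[[str, Candidate, Candidate], bool] = toy_prefers_first,
) -> Optional[Candidate]:
    """Compare every pair of candidates and return the one with most wins."""
    if not candidates:
        return None
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        if prefers_first(anaphor_gender, candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    best = max(range(len(candidates)), key=lambda k: wins[k])
    return candidates[best]


antecedent = resolve_round_robin(
    "f",
    [Candidate("Mary", "f", 3), Candidate("John", "m", 1), Candidate("Susan", "f", 2)],
)
```

Because each decision compares two candidates directly, the classifier can learn relative preferences that a single-candidate model, which judges each candidate in isolation, cannot express.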


1994 ◽  
Vol 29 (4) ◽  
pp. 127-132 ◽  
Author(s):  
Naomi Rea ◽  
George G. Ganf

Experimental results demonstrate how small differences in depth and water regime have a significant effect on the accumulation and allocation of nutrients and biomass. Because the performance of aquatic plants depends on these factors, an understanding of their influence is essential to ensure that systems function at their full potential. The responses differed for two emergent species, indicating that within this morphological category, optimal performance will fall at different locations across a depth or water regime gradient. The performance of one species was unaffected by growth in mixture, whereas the other performed better in deep water and worse in shallow.

