Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution

Journal of Artificial Intelligence Research ◽

10.1613/jair.3120 ◽

2011 ◽

Vol 40 ◽

pp. 469-521 ◽

Cited By ~ 14

Author(s):

A. Rahman ◽

V. Ng

Keyword(s):

Experimental Results ◽

The Other ◽

Superior Performance ◽

Traditional Learning ◽

Data Sets ◽

Coreference Resolution ◽

Pair Model ◽

Ranking Model ◽

Cluster Ranking

Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers to competing approaches as well as the effectiveness of our two extensions.
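The cluster-ranking idea in this abstract can be illustrated with a minimal sketch. It assumes mentions are processed left to right and that each one is attached to the highest-scoring preceding cluster. The names `cluster_rank_resolve`, `toy_score`, and `toy_null_score` are hypothetical, the head-word-match scorer is a toy stand-in for a learned ranker, and the "start a new cluster" option competing against existing clusters is only one simple way to fold anaphoricity into the ranking; none of this is taken from the paper itself.

```python
from typing import Callable, List

Cluster = List[str]


def cluster_rank_resolve(
    mentions: List[str],
    score: Callable[[Cluster, str], float],
    null_score: Callable[[str], float],
) -> List[Cluster]:
    """Attach each mention to its best-scoring preceding cluster, or start a new one."""
    clusters: List[Cluster] = []
    for mention in mentions:
        best = None
        best_score = null_score(mention)  # score of "not anaphoric" (assumed option)
        for cluster in clusters:
            s = score(cluster, mention)
            if s > best_score:
                best, best_score = cluster, s
        if best is None:
            clusters.append([mention])  # no cluster beat the null option
        else:
            best.append(mention)
    return clusters


def head(mention: str) -> str:
    """Toy head word: the last token, lowercased."""
    return mention.lower().split()[-1]


def toy_score(cluster: Cluster, mention: str) -> float:
    """Toy cluster-level feature: does any mention in the cluster share the head word?"""
    return 1.0 if any(head(m) == head(mention) for m in cluster) else 0.0


def toy_null_score(mention: str) -> float:
    """Toy constant score for starting a new cluster."""
    return 0.5


demo = cluster_rank_resolve(
    ["Barack Obama", "Google", "Obama", "the company", "Google"],
    toy_score,
    toy_null_score,
)
print(demo)
```

Because the scorer sees the whole candidate cluster rather than a single preceding mention, cluster-level features become available, and because the candidates compete against each other, the decision is a ranking rather than a set of independent yes/no classifications.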

DV-DVFS: merging data variety and DVFS technique to manage the energy consumption of big data processing

Journal Of Big Data ◽

10.1186/s40537-021-00437-7 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Hossein Ahmadvand ◽

Fouzhan Foroutan ◽

Mahmood Fathy

Keyword(s):

Big Data ◽

Energy Consumption ◽

Processing Time ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Multiple Sources ◽

Evaluation Phase ◽

Dynamic Voltage ◽

Processing Resources

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data. This feature causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To address it, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation, taking two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: DV-DVFS achieves up to a 15% improvement in energy consumption.
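The step of estimating the frequency needed to meet a deadline can be sketched as follows. This is a simplified illustration, not the authors' estimator: it assumes a CPU-bound task whose run time is the estimated cycle count divided by the clock frequency, and a discrete set of available frequency levels. The function name `pick_frequency` and the numbers in the demo are hypothetical.

```python
from typing import Iterable, Optional


def pick_frequency(
    estimated_cycles: float,
    deadline_s: float,
    available_freqs_hz: Iterable[float],
) -> Optional[float]:
    """Return the lowest available frequency that still meets the deadline.

    Assumes run time = estimated_cycles / frequency (CPU-bound work).
    Returns None when even the highest frequency misses the deadline.
    """
    for freq in sorted(available_freqs_hz):
        if estimated_cycles / freq <= deadline_s:
            return freq
    return None


levels = [1.2e9, 1.6e9, 2.0e9, 2.4e9]  # hypothetical frequency levels in Hz
chosen = pick_frequency(estimated_cycles=3.0e9, deadline_s=2.0, available_freqs_hz=levels)
print(chosen)
```

Picking the lowest feasible level reflects the usual motivation for DVFS: running slower generally costs less energy, so any slack before the deadline is spent on a lower frequency.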

DV-DVFS: Merging Data Variety and DVFS Technique to Manage the Energy Consumption of Big Data Processing

10.21203/rs.3.rs-45414/v4 ◽

2021 ◽

Author(s):

Hossein Ahmadvand ◽

Fouzhan Foroutan ◽

Mahmood Fathy

Keyword(s):

Big Data ◽

Energy Consumption ◽

Processing Time ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Multiple Sources ◽

Evaluation Phase ◽

Dynamic Voltage ◽

Processing Resources

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data. This feature causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To address it, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation, taking two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: DV-DVFS achieves up to a 15% improvement in energy consumption.

A Novel Biased Diversity Ranking Model for Query-Oriented Multi-Document Summarization

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.380-384.2811 ◽

2013 ◽

Vol 380-384 ◽

pp. 2811-2816

Author(s):

Kai Lei ◽

Yi Fan Zeng

Keyword(s):

Information Needs ◽

Experimental Results ◽

Data Sets ◽

Manifold Ranking ◽

Document Summarization ◽

Benchmark Data ◽

Ranking Model ◽

Document Collection ◽

High Prestige

Query-oriented multi-document summarization (QMDS) attempts to generate a concise piece of text by extracting sentences from a target document collection, with the aim of not only conveying the key content of that corpus but also satisfying the information needs expressed by the query. Because of its practical value, QMDS has been studied intensively in recent decades. Three properties are considered crucial for a good summary: relevance, prestige, and low redundancy (or so-called diversity). Unfortunately, most existing work either disregards diversity or handles it with non-optimized heuristics, usually based on greedy sentence selection. Inspired by the manifold-ranking process, which deals with query-biased prestige, and the DivRank algorithm, which captures query-independent diversity ranking, we propose a novel biased diversity ranking model, named ManifoldDivRank, for query-sensitive summarization tasks. The top-ranked sentences discovered by our algorithm not only enjoy query-oriented high prestige but, more importantly, are dissimilar to each other. Experimental results on the DUC2005 and DUC2006 benchmark data sets demonstrate the effectiveness of our proposal.

DV-DVFS: Merging Data variety and DVFS Technique to Manage the Energy Consumption of Big Data Processing

10.21203/rs.3.rs-45414/v2 ◽

2020 ◽

Author(s):

Hossein Ahmadvand ◽

Fouzhan Foroutan ◽

Mahmood Fathy

Keyword(s):

Big Data ◽

Energy Consumption ◽

Processing Time ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Multiple Sources ◽

Evaluation Phase ◽

Dynamic Voltage ◽

Processing Resources

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data. This feature causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To address it, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation, taking two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: DV-DVFS achieves up to a 15% improvement in energy consumption.

Semi-Supervised Outlier Detection with Only Positive and Unlabeled Data Based on Fuzzy Clustering

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213015500037 ◽

2015 ◽

Vol 24 (03) ◽

pp. 1550003 ◽

Cited By ~ 1

Author(s):

Armin Daneshpazhouh ◽

Ashkan Sami

Keyword(s):

Intrusion Detection ◽

Outlier Detection ◽

Fuzzy Clustering ◽

Real World ◽

State Of The Art ◽

Real Data ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Real World Applications

The task of semi-supervised outlier detection is to find instances that deviate markedly from the rest of the data, using some labeled examples. The task is especially important in applications such as fraud detection and intrusion detection. Most existing techniques are unsupervised. Semi-supervised approaches, on the other hand, use both negative and positive instances to detect outliers. However, in many real-world applications, very few positive labeled examples are available. This paper proposes an innovative approach to address this problem. The proposed method works as follows. First, some reliable negative instances are extracted by a kNN-based algorithm. Afterwards, fuzzy clustering using both negative and positive examples is utilized to detect outliers. Experimental results on real data sets demonstrate that the proposed approach outperforms the previous unsupervised state-of-the-art methods in detecting outliers.

Dynamically-adaptive Weight in Batch Back Propagation Algorithm via Dynamic Training Rate for Speedup and Accuracy Training

Journal of Telecommunications and Information Technology ◽

10.26636/jtit.2017.113017 ◽

2017 ◽

Vol 4 ◽

pp. 82-89

Author(s):

Mohammed Sarhan Al Duais ◽

Fatma Susilawati Mohamad

Keyword(s):

Back Propagation ◽

Experimental Results ◽

Superior Performance ◽

Data Sets ◽

Sigmoid Function ◽

Back Propagation Algorithm ◽

Significant Parameter ◽

Propagation Algorithm ◽

Adaptive Weight ◽

Speed Up

The main problems of the batch back propagation (BBP) algorithm are slow training and the need to adjust several parameters manually, such as the learning rate. In addition, the BBP algorithm suffers from training saturation. The objective of this study is to speed up the training of the BBP algorithm and to remove training saturation. The training rate is the most significant parameter for increasing the efficiency of the BBP. In this study, a new dynamic training rate is created to speed up the training of the BBP algorithm. The dynamic batch back propagation (DBBPLR) algorithm is presented, which trains with a dynamic training rate. This technique was implemented with a sigmoid function. Several data sets were used as benchmarks for testing the effects of the dynamic training rate that we created. All the experiments were performed in Matlab. The experimental results show that the DBBPLR algorithm provides superior training performance, that is, faster training with higher accuracy, compared to the BBP algorithm and existing works.

DV-DVFS: Merging Data variety and DVFS Technique to Manage the Energy Consumption of Big Data Processing

10.21203/rs.3.rs-45414/v3 ◽

2020 ◽

Author(s):

Hossein Ahmadvand ◽

Fouzhan Foroutan ◽

Mahmood Fathy

Keyword(s):

Big Data ◽

Energy Consumption ◽

Processing Time ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Multiple Sources ◽

Evaluation Phase ◽

Dynamic Voltage ◽

Processing Resources

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data. This feature causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To address it, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation, taking two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: DV-DVFS achieves up to a 15% improvement in energy consumption.

DV-DVFS: Merging Data variety and DVFS Technique to Manage the Energy Consumption of Big Data Processing

10.21203/rs.3.rs-45414/v1 ◽

2020 ◽

Author(s):

Hossein Ahmadvand ◽

Fouzhan Foroutan ◽

Mahmood Fathy

Keyword(s):

Big Data ◽

Energy Consumption ◽

Processing Time ◽

Experimental Results ◽

The Other ◽

Data Sets ◽

Multiple Sources ◽

Evaluation Phase ◽

Dynamic Voltage ◽

Processing Resources

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data. This feature causes high variation in the consumption of processing resources such as CPU. In this paper, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we take a deadline as our constraint and, before applying the DVFS technique to computer nodes, estimate the processing time and the frequency needed to meet the deadline. We use a set of data sets and applications in the evaluation phase. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: DV-DVFS achieves up to a 15% improvement in energy consumption.

A Twin-Candidate Model for Learning-Based Anaphora Resolution

Computational Linguistics ◽

10.1162/coli.2008.07-004-r2-06-57 ◽

2008 ◽

Vol 34 (3) ◽

pp. 327-356 ◽

Cited By ~ 13

Author(s):

Xiaofeng Yang ◽

Jian Su ◽

Chew Lim Tan

Keyword(s):

Main Idea ◽

Classification Problem ◽

Learning Model ◽

Experimental Results ◽

Data Sets ◽

Coreference Resolution ◽

Anaphora Resolution ◽

Candidate Model ◽

Content Extraction ◽

Pronominal Anaphora

The traditional single-candidate learning model for anaphora resolution considers the antecedent candidates of an anaphor in isolation, and thus cannot effectively capture the preference relationships between competing candidates for its learning and resolution. To deal with this problem, we propose a twin-candidate model for anaphora resolution. The main idea behind the model is to recast anaphora resolution as a preference classification problem. Specifically, the model learns a classifier that determines the preference between competing candidates, and, during resolution, chooses the antecedent of a given anaphor based on the ranking of the candidates. We present in detail the framework of the twin-candidate model for anaphora resolution. Further, we explore how to deploy the model in the more complicated coreference resolution task. We evaluate the twin-candidate model in different domains using the Automatic Content Extraction data sets. The experimental results indicate that our twin-candidate model is superior to the single-candidate model for the task of pronominal anaphora resolution. For the task of coreference resolution, it also performs equally well, or better.
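The preference-classification idea can be sketched as a round-robin comparison: every pair of candidates is compared, and the candidate with the most pairwise wins is chosen. This is a minimal illustration under stated assumptions. The names `choose_antecedent` and `toy_prefers_first` are hypothetical, the gender-agreement rule stands in for a learned preference classifier, and breaking ties in favor of the later (closer) candidate is a choice made for this sketch only.

```python
from itertools import combinations
from typing import Callable, Dict, List

Mention = Dict[str, str]


def choose_antecedent(
    anaphor: Mention,
    candidates: List[Mention],
    prefers_first: Callable[[Mention, Mention, Mention], bool],
) -> Mention:
    """Pick the candidate that wins the most pairwise preference comparisons.

    `candidates` must be non-empty and listed in textual order.
    """
    wins = [0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        if prefers_first(anaphor, candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    # Ties go to the later candidate, i.e. the one closer to the anaphor (assumed).
    best = max(range(len(candidates)), key=lambda k: (wins[k], k))
    return candidates[best]


def toy_prefers_first(anaphor: Mention, first: Mention, second: Mention) -> bool:
    """Toy preference: gender agreement wins; otherwise prefer the later candidate."""
    first_ok = first["gender"] == anaphor["gender"]
    second_ok = second["gender"] == anaphor["gender"]
    if first_ok != second_ok:
        return first_ok
    return False


candidates = [
    {"text": "Mary", "gender": "f"},
    {"text": "John", "gender": "m"},
    {"text": "the report", "gender": "n"},
]
antecedent = choose_antecedent({"text": "she", "gender": "f"}, candidates, toy_prefers_first)
print(antecedent["text"])
```

Unlike a single-candidate classifier, which would label each candidate in isolation, the comparison function here always sees two competing candidates at once, which is what lets it express a preference between them.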

The Influence of Water Regime on the Performance of Aquatic Plants

Water Science & Technology ◽

10.2166/wst.1994.0174 ◽

1994 ◽

Vol 29 (4) ◽

pp. 127-132 ◽

Cited By ~ 6

Author(s):

Naomi Rea ◽

George G. Ganf

Keyword(s):

Deep Water ◽

Aquatic Plants ◽

Water Regime ◽

Optimal Performance ◽

Experimental Results ◽

The Other ◽

Full Potential ◽

Influence Of Water ◽

Emergent Species

Experimental results demonstrate how small differences in depth and water regime have a significant effect on the accumulation and allocation of nutrients and biomass. Because the performance of aquatic plants depends on these factors, an understanding of their influence is essential to ensure that systems function at their full potential. The responses differed for two emergent species, indicating that within this morphological category, optimal performance will fall at different locations across a depth or water regime gradient. The performance of one species was unaffected by growth in mixture, whereas the other performed better in deep water and worse in shallow.
