Combining Multiple Retrieval Systems Using Combinatorial Fusion Analysis and Rank-Score Characteristic Function

Author(s):  
Hongzhi Liu ◽  
Zhonghai Wu ◽  
D. Frank Hsu
2008 ◽  
pp. 1157-1181 ◽  
Author(s):  
D. Frank Hsu ◽  
Yun-Sheng Chung ◽  
Bruce S. Kristal

Combination methods have been investigated as a possible means to improve performance in multi-variable (multi-criterion or multi-objective) classification, prediction, learning, and optimization problems. In addition, information collected from multi-sensor or multi-source environments often needs to be combined to produce more accurate information, to derive better estimates, or to make more knowledgeable decisions. In this chapter, we present a method, called Combinatorial Fusion Analysis (CFA), for analyzing the combination and fusion of multiple scoring systems. CFA characterizes each scoring system as having a score function, a rank function, and a rank/score function. Both rank combination and score combination are explored as to their combinatorial complexity and computational efficiency. Information derived from the scoring characteristics of each scoring system is used to perform system selection and to decide on the method of combination. In particular, the rank/score graph defined by Hsu, Shapiro and Taksa (Hsu et al., 2002; Hsu & Taksa, 2005) is used to measure the diversity between scoring systems. We illustrate various applications of the framework using examples in information retrieval and biomedical informatics.
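The three functions named in the abstract above can be illustrated with a minimal Python sketch. It is not the chapter's implementation. It assumes one common reading of CFA: the rank function orders items by descending score, the rank/score function maps each rank to the (min-max normalized) score of the item at that rank, and diversity is a root-mean-square gap between two rank/score functions. The helper names `normalize`, `rank_function`, `rsc_function`, and `rsc_diversity` are hypothetical.

```python
from typing import Dict, List

Scores = Dict[str, float]  # item id -> raw score given by one scoring system


def normalize(scores: Scores) -> Scores:
    """Min-max normalize scores to [0, 1] (an assumed normalization choice)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def rank_function(scores: Scores) -> Dict[str, int]:
    """Rank 1 goes to the highest score; ties are broken by item id."""
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return {item: position + 1 for position, item in enumerate(ordered)}


def rsc_function(scores: Scores) -> List[float]:
    """Rank/score function: entry i-1 is the normalized score at rank i."""
    return sorted(normalize(scores).values(), reverse=True)


def rsc_diversity(a: Scores, b: Scores) -> float:
    """Root-mean-square gap between two rank/score functions (assumed metric)."""
    fa, fb = rsc_function(a), rsc_function(b)
    if len(fa) != len(fb):
        raise ValueError("both systems must score the same number of items")
    return (sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)) ** 0.5


# Two hypothetical systems scoring the same three documents.
system_a = {"d1": 10.0, "d2": 6.0, "d3": 2.0}
system_b = {"d1": 3.0, "d2": 9.0, "d3": 8.0}
print(rank_function(system_a), rsc_function(system_a))
print(rsc_diversity(system_a, system_b))
```

Because the rank/score function depends only on how scores fall off with rank, two systems that order the items differently can still have identical rank/score functions, so this measure captures scoring behavior rather than agreement on individual items.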


2013 ◽  
Vol 14 (01) ◽  
pp. 1350003 ◽  
Author(s):  
CHUN-YI LIU ◽  
CHUAN-YI TANG ◽  
D. FRANK HSU

Combining multiple information retrieval (IR) systems has been shown to improve performance over individual systems. However, it remains a challenging problem to determine when and how a set of individual systems should be combined. In this paper, we investigate these issues using combinatorial fusion analysis and five data sets provided by TREC 2, 3, 4, 5, and 6. In particular, we compare the performance of combining six IR systems selected by random choice vs. by performance measurement from these five TREC data sets. Two experiments are conducted: (1) combination of two systems and their performance outcome in terms of performance ratio and cognitive diversity, and (2) combinatorial fusion of t systems, t = 2 to 6, using both score and rank combinations, together with an exploration of the effect of diversity on the performance outcome. It is demonstrated in both experiments that combining two or more systems improves performance more significantly when the systems are selected by performance evaluation than when they are selected by random choice. Our work provides a distinctive method of system selection for the combination of multiple retrieval systems.
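The second experiment, fusing every subset of t systems by both score and rank combination, can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: scores are assumed to be already normalized to [0, 1], every system is assumed to score the same documents, and both combinations are taken as simple unweighted averages. The names `rank_of`, `score_combination`, `rank_combination`, and `all_fusions` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # document id -> score, assumed normalized to [0, 1]


def rank_of(scores: Scores) -> Dict[str, int]:
    """Rank 1 goes to the highest score; ties are broken by document id."""
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return {d: i + 1 for i, d in enumerate(ordered)}


def score_combination(systems: List[Scores]) -> List[str]:
    """Order documents by average score, highest first."""
    docs = systems[0].keys()
    avg = {d: sum(s[d] for s in systems) / len(systems) for d in docs}
    return sorted(avg, key=lambda d: (-avg[d], d))


def rank_combination(systems: List[Scores]) -> List[str]:
    """Order documents by average rank, lowest (best) first."""
    rank_maps = [rank_of(s) for s in systems]
    docs = systems[0].keys()
    avg = {d: sum(r[d] for r in rank_maps) / len(rank_maps) for d in docs}
    return sorted(avg, key=lambda d: (avg[d], d))


def all_fusions(
    systems: Dict[str, Scores], t_min: int = 2
) -> Dict[Tuple[str, ...], Tuple[List[str], List[str]]]:
    """Fuse every subset of size t_min..n by score and by rank combination."""
    names = sorted(systems)
    fused = {}
    for t in range(t_min, len(names) + 1):
        for combo in combinations(names, t):
            members = [systems[name] for name in combo]
            fused[combo] = (score_combination(members), rank_combination(members))
    return fused


# Three hypothetical systems scoring the same three documents.
toy_systems = {
    "A": {"d1": 1.0, "d2": 0.5, "d3": 0.0},
    "B": {"d1": 0.0, "d2": 1.0, "d3": 0.75},
    "C": {"d1": 0.2, "d2": 0.9, "d3": 0.6},
}
for combo, (by_score, by_rank) in all_fusions(toy_systems).items():
    print(combo, by_score, by_rank)
```

With six systems and t = 2 to 6, this enumeration covers 15 + 20 + 15 + 6 + 1 = 57 combinations, each fused two ways, which is small enough to evaluate exhaustively.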


2013 ◽  
Vol 22 (02) ◽  
pp. 1350001 ◽  
Author(s):  
YANJUN LI ◽  
D. FRANK HSU ◽  
SOON M. CHUNG

Effective feature selection methods are important for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the corpus. Extensive research has been done to improve the performance of individual feature selection methods. However, it is always a challenge to come up with an individual feature selection method that outperforms other methods in most cases. In this paper, we explore the possibility of improving the overall performance by combining multiple individual feature selection methods. In particular, we propose a method of combining multiple feature selection methods by using an information fusion paradigm called Combinatorial Fusion Analysis (CFA). A rank-score function and its associated graph, called the rank-score graph, are adopted to measure the diversity of different feature selection methods. Our experimental results demonstrate that a combination of multiple feature selection methods can outperform a single method only if each individual feature selection method has a unique scoring behavior and relatively high performance. Moreover, it is shown that the rank-score function and rank-score graph are useful for the selection of a combination of feature selection methods.

