Evaluate the performance of K-means distance functions with string edit distance and the open directory project

2016 ◽  
pp. 157-160
Author(s):  
Chun-Hsiung Tseng ◽  
Yung-Hui Chen

1998 ◽  
Vol 20 (5) ◽  
pp. 522-532 ◽  
Author(s):  
E.S. Ristad ◽  
P.N. Yianilos

2001 ◽  
Vol 01 (02) ◽  
pp. 363-386
Author(s):  
WLADIMIR RODRIGUEZ ◽  
MARK LAST ◽  
ABRAHAM KANDEL ◽  
HORST BUNKE

In this paper, a new, geometric approach to pattern identification in data mining is presented. It is based on applying string edit distance computation to measuring the similarity between multi-dimensional curves. The string edit distance computation is extended to allow strings in which each element is a vector rather than just a symbol. We discuss an approach for representing 3D-curves using the curvature and the tension as their symbolic representation. This transformation preserves all the information contained in the original 3D-curve. We validate this approach through experiments using synthetic and digitized data. In particular, the proposed approach is suitable for measuring the similarity of 3D-curves invariant under translation, rotation, and scaling. It can also be applied to partial curve matching.
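
The idea of an edit distance over strings whose elements are vectors can be illustrated with a minimal sketch. It is not the authors' cost model: the function name `vector_edit_distance`, the fixed insertion/deletion cost `indel_cost`, and the use of Euclidean distance as the substitution cost are all assumptions made here for illustration. Each curve is taken to be a sequence of equal-length numeric tuples (for example, one pair of descriptor values per sample point).

```python
import math
from typing import Sequence

Vector = Sequence[float]


def vector_edit_distance(a: Sequence[Vector], b: Sequence[Vector],
                         indel_cost: float = 1.0) -> float:
    """Edit distance between two sequences of vectors.

    Assumptions (not taken from the paper): inserting or deleting an
    element costs `indel_cost`; substituting one element for another
    costs the Euclidean distance between the two vectors.
    """
    n, m = len(a), len(b)
    # dp[i][j] = cheapest way to turn a[:i] into b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * indel_cost
    for j in range(1, m + 1):
        dp[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = math.dist(a[i - 1], b[j - 1])
            dp[i][j] = min(
                dp[i - 1][j] + indel_cost,        # delete a[i-1]
                dp[i][j - 1] + indel_cost,        # insert b[j-1]
                dp[i - 1][j - 1] + substitution,  # substitute
            )
    return dp[n][m]


# Hypothetical toy curves: two samples each, two descriptor values per sample.
curve_a = [(0.0, 0.0), (1.0, 1.0)]
curve_b = [(0.0, 0.0), (1.0, 1.5)]
distance = vector_edit_distance(curve_a, curve_b)
```

With a continuous substitution cost, two curves whose descriptors differ only slightly receive a small distance, where a purely symbolic comparison would count every unequal element as a full mismatch.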


2017 ◽  
Vol 40 (2) ◽  
pp. 161-178
Author(s):  
Abe Powell ◽  
Hiroyuki Suzuki

The goal of this paper is to use string edit distance to describe the synchronic relationship between the Tibetan speech varieties located on the northeastern edge of the Tibetan Plateau. String edit distance provides a statistical way to compare a large number of linguistic features, in essence producing a statistical bundle of isoglosses. In this way, it can be used as a tool in dialect mapping and synchronic clustering. In this paper, the aggregate distance matrix produced by string edit distance reveals that the great degree of phonetic continuity on the grasslands of the northeastern edge of the plateau is matched by an equal degree of phonetic discontinuity in the mountains forming the eastern border of the plateau. While the dialects located on the grasslands can be grouped together into one cluster, the dialects in the mountains can be grouped together into six clusters.
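
One way an aggregate distance matrix of this kind might be computed is sketched below. The procedure is an assumption, not the authors' method: each variety is represented by a word list keyed by concept, each pair of forms is compared with plain Levenshtein distance normalized by the longer form, and the normalized distances are averaged over the concepts both varieties share. The variety names, concept keys, and transcriptions are hypothetical placeholders, not data from the paper.

```python
from typing import Dict, List, Tuple


def levenshtein(s: str, t: str) -> int:
    """Classic string edit distance with unit costs."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (cs != ct)))   # substitution
        prev = curr
    return prev[-1]


def aggregate_distance_matrix(
    wordlists: Dict[str, Dict[str, str]],
) -> Tuple[List[str], Dict[Tuple[str, str], float]]:
    """Mean length-normalized edit distance for every pair of varieties."""
    names = sorted(wordlists)
    matrix: Dict[Tuple[str, str], float] = {}
    for x in names:
        for y in names:
            shared = wordlists[x].keys() & wordlists[y].keys()
            total = 0.0
            for concept in shared:
                a, b = wordlists[x][concept], wordlists[y][concept]
                total += levenshtein(a, b) / max(len(a), len(b), 1)
            matrix[(x, y)] = total / len(shared) if shared else float("nan")
    return names, matrix


# Hypothetical placeholder data: two varieties, two concepts.
toy = {
    "variety_A": {"concept_1": "abc", "concept_2": "de"},
    "variety_B": {"concept_1": "abd", "concept_2": "de"},
}
names, matrix = aggregate_distance_matrix(toy)
```

The resulting matrix can be handed to any clustering routine that accepts precomputed pairwise distances.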

