Measuring consistency for multiple taggers using vector space modeling

2009 ◽  
Vol 60 (10) ◽  
pp. 1995-2003 ◽  
Author(s):  
Dietmar Wolfram ◽  
Hope A. Olson ◽  
Raina Bloom
2020 ◽  
Vol 9 (3) ◽  
pp. 43-52
Author(s):  
Alaidine Ben Ayed ◽  
Ismaïl Biskri ◽  
Jean Guy Meunier

2016 ◽  
Vol 21 (1) ◽  
pp. 48-79 ◽  
Author(s):  
Tom Ruette ◽  
Katharina Ehret ◽  
Benedikt Szmrecsanyi

Lectometry is a corpus-based methodology that explores, from an aggregate perspective, how multiple language-external dimensions shape language usage. The paper combines this methodology with Semantic Vector Space (SVS) modeling to investigate lexical variability in written Standard English, as sampled in the original Brown family of corpora (Brown, LOB, Frown and F-LOB). Based on a joint analysis of 303 lexical variables, which are semi-automatically extracted by means of an SVS, we find that lexical variation in the Brown family is systematically related to three lectal dimensions: discourse type (informative versus imaginative), standard variety (British English versus American English), and time period (1960s versus 1990s). It turns out that most lexical variables are sensitive to at least one of these three language-external dimensions, yet not every dimension has dedicated lexical variables: in particular, distinctive lexical variables for the real time dimension fail to emerge.


2014 ◽  
Vol 6 (1) ◽  
pp. 14-33
Author(s):  
Ali Gürkan ◽  
Luca Iandoli

Online conversations are very popular, but the content that participants generate is often overwhelming, poorly organized, and of questionable quality. In this article we combine a text analysis technique, Vector Space Modeling (VSM), with clustering to create a methodology for organizing and aggregating the information that users generate in online debates conducted through Online collaborative Argumentation (OA). An alternative to widely used conversational tools such as online forums, OA is supposed to help users join their efforts to construct a shared knowledge representation in the form of an argument map, in which multiple points of view can coexist and be presented as a well-organized knowledge object. To see whether this supposition holds, we first apply VSM to summarize argument map content as a document space and then use clustering to transform it into a limited number of higher-order semantic categories. We apply the methodology to more than 3000 posts created in an online debate among about 160 participants on an online argumentation platform, and we show how it can be used to effectively organize and evaluate content generated by a large number of users in online discussions.
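The two-step idea in this abstract (represent posts as vectors, then cluster them) can be sketched as follows. This is only an illustration under assumptions: the abstract does not say which term weighting or clustering algorithm the authors used, so the sketch swaps in smoothed TF-IDF weights and a simple single-link threshold clustering. The function names, the threshold value, and the sample posts are all hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption, not taken from the article).
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def threshold_clusters(vectors, threshold=0.2):
    """Single-link clustering: merge the clusters of any two posts whose
    cosine similarity reaches the threshold."""
    labels = list(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                labels = [new if label == old else label for label in labels]
    return labels

# Hypothetical posts from a debate; not data from the article.
posts = [
    "carbon tax reduces emissions",
    "carbon tax raises energy prices",
    "nuclear power plants are safe",
    "nuclear power waste is dangerous",
]
labels = threshold_clusters(tfidf_vectors(posts))
print(labels)
```

On these four posts the two carbon-tax posts end up in one cluster and the two nuclear-power posts in another. A real argument map with thousands of posts would need proper tokenization and a clustering method chosen for that scale.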


2014 ◽  
Vol 136 (10) ◽  
Author(s):  
Jeremy Murphy ◽  
Katherine Fu ◽  
Kevin Otto ◽  
Maria Yang ◽  
Dan Jensen ◽  
...  

Design-by-analogy is a powerful approach to augment traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. While the concept of design-by-analogy has been known for some time, few actual methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting functional analogies from data sources has been developed to provide this capability, here based on a functional basis rather than form or conflict descriptions. Building on past research, we utilize a functional vector space model (VSM) to quantify the analogous similarity of an idea's functionality. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We also develop document parsing algorithms to reduce text descriptions of the data sources down to the key functions, for use in the functional similarity analysis and functional vector space modeling. To do this, we apply Zipf's law on word count order reduction to reduce the words within the documents down to the applicable functionally critical terms, thus providing a mapping process for function-based search. The reduction of a document into functionally analogous words enables matching to novel ideas that are functionally similar, and the matching can be customized in various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. As a verification of the approach, two original design problem case studies illustrate the distance range of analogical solutions that can be extracted. This range extends from very near-field, literal solutions to far-field cross-domain analogies.
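A functional vector space comparison of the kind described here might look like the sketch below. It rests on several assumptions: the abstract does not spell out the Zipf-based reduction, so the sketch replaces it with a small fixed vocabulary of function verbs. `FUNCTION_TERMS`, the helper names, and the patent texts are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical function vocabulary; a stand-in for the reduced set of
# functionally critical terms, which the abstract does not list.
FUNCTION_TERMS = ["convert", "store", "transfer", "separate", "guide", "regulate"]

def function_vector(text):
    """Count how often each function term occurs in a text description."""
    counts = Counter(word.strip(".,;") for word in text.lower().split())
    return [counts[term] for term in FUNCTION_TERMS]

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_by_functional_similarity(problem, patents):
    """Return (similarity, name) pairs, most functionally similar first."""
    query = function_vector(problem)
    scored = [(cosine(query, function_vector(text)), name)
              for name, text in patents.items()]
    return sorted(scored, reverse=True)

# Hypothetical design problem and patent descriptions.
problem = "store energy and convert it to motion"
patents = {
    "flywheel": "a rotor to store kinetic energy and convert rotation to torque",
    "filter": "a membrane to separate particles and guide fluid",
}
ranking = rank_by_functional_similarity(problem, patents)
print(ranking)
```

The flywheel description shares both function terms with the design problem and ranks first, while the filter shares none. Lower-scoring but non-zero matches are where far-field analogies would be expected to appear.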


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1947
Author(s):  
Yan Wang ◽  
Shan Gao ◽  
Hongyan Chu ◽  
Xuefei Wang

To meet the practical requirements raised by the rapid expansion of electric taxis (ETs) and the need for reasonable planning of charging stations, this paper presents a method for mining latent semantic correlations in large-scale ET trajectory data and for planning charging stations at optimal cost. First, a vector space modeling method for ET trajectory data is studied, and the semantic similarity of the trajectory data matrix is evaluated. Second, the hidden characteristics of the massive trajectory data are extracted by matrix decomposition. Then, the latent semantic correlation characteristics of the trajectory data are mined. Finally, fast clustering of ETs is realized by the spectral clustering method. On this basis, with the objective of minimizing the annual construction and maintenance costs of charging stations, an optimal planning scheme of charging stations for ETs is given. The spectral clustering of latent semantic correlations in ET driving-track data can be combined with the operation and maintenance costs of the charging stations, and the convenience of charging for ET users is also considered. This provides decision support for the reasonable planning of charging stations.
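The clustering step of this pipeline can be illustrated with a minimal spectral bipartition. The sketch makes strong simplifying assumptions: each taxi is represented by hypothetical visit counts over four grid cells, the matrix-decomposition step is skipped, and only two clusters are produced, using the sign of the second eigenvector of the normalized affinity matrix (found by power iteration). None of the names or numbers come from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def spectral_bipartition(vectors, iterations=200):
    """Split items into two clusters (labels 0 and 1).

    Assumes non-zero vectors with non-negative entries, so every row of
    the affinity matrix has a positive sum.
    """
    n = len(vectors)
    # Affinity matrix: pairwise cosine similarity (a Gram matrix, hence
    # positive semidefinite).
    W = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    d = [sum(row) for row in W]
    # Symmetrically normalized affinity D^(-1/2) W D^(-1/2).
    M = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # Its leading eigenvector is known in closed form: sqrt(d), normalized.
    total = math.sqrt(sum(d))
    v1 = [math.sqrt(x) / total for x in d]
    # Power iteration, kept orthogonal to v1, converges to the second eigenvector.
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        proj = sum(a * b for a, b in zip(x, v1))
        x = [a - proj * b for a, b in zip(x, v1)]
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in x))
        x = [a / length for a in x]
    # The sign pattern of the second eigenvector gives the two clusters.
    return [0 if a >= 0 else 1 for a in x]

# Hypothetical visit counts per grid cell for four taxis.
trajectories = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [0, 0, 4, 5],
]
labels = spectral_bipartition(trajectories)
print(labels)
```

The first two taxis, which share most of their cells, receive one label and the last two receive the other. Which group is labeled 0 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.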

