A Method for Chinese Text Classification Based on Three-Dimensional Vector Space Model

Author(s):  
Jixian Zhang ◽  
Qinglin Wang ◽  
Yuan Li ◽  
Dongmei Li ◽  
Yuexing Hao
2014 ◽  
Vol 644-650 ◽  
pp. 2206-2210
Author(s):  
Kun Zhou ◽  
Ya Ping Dai ◽  
Feng Gao ◽  
Ji Hong Zou

Using the word-segmentation technology of the TRIP database, which counts in detail every word that appears in a database, this paper proposes a self-constructed category dictionary (SCC-dictionary) for Chinese text classification. To address the high dimensionality and sparseness of the vector space model, a four-dimensional feature vector space model (FFVSM) is presented. The text classifier is built with the Support Vector Machine (SVM) algorithm. Experimental results show two achievements: first, the SCC-dictionary can replace a manually written dictionary with the same effect; second, the FFVSM reduces the computing load relative to a high-dimensional feature vector space model while keeping classification precision at 86.87%, recall at 95.12%, and F1 at 90.81%.
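The three reported figures can be cross-checked against each other, since F1 is the harmonic mean of precision and recall. The sketch below is only a consistency check on the numbers quoted in the abstract; the function name `f1_score` is an illustrative choice, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, expressed as fractions.
precision = 0.8687
recall = 0.9512

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")
```

The computed value comes out at about 0.908, which agrees with the reported F1 of 90.81%.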


2018 ◽  
Vol 10 (4) ◽  
pp. 32
Author(s):  
Lin Li ◽  
Xiuteng Duan ◽  
Yutong Li

Handwriting detection is mainly used in criminal investigation. A four-dimensional vector space model can be used to build a model for handwriting detection. This article selects features such as word frequency, language style, average word length, and sentence structure from the texts and quantizes them, turning them into relations between vectors. Quantifying and normalizing the features of an author's article in advance yields a standard reference vector. The same processing is then applied to the target text database, and the result is compared with the standard reference vector in terms of modulus and included angle, in order to estimate whether the author wrote the texts in the database. The simulation results show that the model is more accurate and that the author of particular texts can be identified.
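The comparison by modulus and included angle can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the four example feature values, and both thresholds (`max_angle`, `max_norm_gap`) are assumptions chosen for the example.

```python
import math


def norm(v):
    """Euclidean modulus of a feature vector."""
    return math.sqrt(sum(x * x for x in v))


def included_angle(u, v):
    """Angle in radians between two non-zero feature vectors."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("feature vectors must be non-zero")
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))


def likely_same_author(reference, target, max_angle=0.2, max_norm_gap=0.15):
    """Hypothetical decision rule: small angle and similar modulus."""
    angle = included_angle(reference, target)
    gap = abs(norm(target) - norm(reference)) / norm(reference)
    return angle <= max_angle and gap <= max_norm_gap


# Hypothetical normalized features: word frequency, language style,
# average word length, sentence structure.
reference = [0.62, 0.40, 0.55, 0.31]
target = [0.60, 0.43, 0.52, 0.33]
print(likely_same_author(reference, target))
```

In practice the thresholds would have to be calibrated on texts of known authorship; the values here are placeholders.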


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1382
Author(s):  
Roger D. Maddux

The Theorems of Pappus and Desargues (for the projective plane over a field) are generalized here by two identities involving determinants and cross products. These identities are proved to hold in the three-dimensional vector space over a field. They are closely related to the Arguesian identity in lattice theory and to Cayley-Grassmann identities in invariant theory.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 166578-166592
Author(s):  
Surender Singh Samant ◽  
N. L. Bhanu Murthy ◽  
Aruna Malapati

2013 ◽  
Vol 347-350 ◽  
pp. 2856-2859
Author(s):  
Jun Hui Pan ◽  
Hui Li

This paper proposes a text classification method based on a fuzzy vector space model and neural networks, motivated by the problem that a text can belong to several categories at once. The method uses fuzzy theory to treat the position at which a feature item occurs in the text as its degree of importance (membership) in reflecting the subject of the text. Position information is fully considered while the features are extracted, and fuzzy feature vectors are constructed from it, so that the resulting classification is close to manual classification. The network consists of an input layer, a hidden layer, and an output layer: the input layer receives the classification samples, the hidden layer extracts the implicit pattern features of the input samples, and the output layer outputs the classification results. Finally, the effectiveness of the method is demonstrated experimentally on documents from Wan Fang data.


Author(s):  
Jinguo Sang ◽  
Shanchen Pang ◽  
Yang Zha ◽  
Fan Yang

The amount of information in the Internet of Things is increasing explosively, because more and more data are sensed by a large number of sensors. This explosive growth makes it difficult to access information efficiently, so using text classification to decrease the amount of information transferred over the network is an effective approach. This paper proposes a new text classification algorithm based on the vector space model. The algorithm improves the feature selection and weighting methods of traditional text classification algorithms by introducing synonym replacement. The experimental results show that the proposed classification algorithm considerably improves the precision and recall of classification.
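The idea of synonym replacement ahead of feature extraction can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything below is an assumption: the synonym table, the whitespace tokenizer, and the function names are hypothetical and only show how merging synonyms concentrates term counts on fewer dimensions.

```python
from collections import Counter

# Hypothetical synonym table mapping each variant to a canonical term.
SYNONYMS = {"automobile": "car", "vehicle": "car", "quick": "fast"}


def normalize_tokens(tokens, synonyms=SYNONYMS):
    """Replace each token by its canonical synonym, if it has one."""
    return [synonyms.get(token, token) for token in tokens]


def term_vector(text, vocabulary, synonyms=SYNONYMS):
    """Term-frequency vector over `vocabulary` after synonym replacement."""
    tokens = normalize_tokens(text.lower().split(), synonyms)
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


vocabulary = ["car", "fast", "the"]
print(term_vector("The automobile is a quick vehicle", vocabulary))
```

Without the replacement step, "automobile" and "vehicle" would occupy separate dimensions and the document would score zero on "car".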


Term Weighting Scheme (TWS) is a key component of the matching mechanism when using the vector space model. In the context of information retrieval (IR) from text documents, this paper describes a new approach to term weighting that improves classification performance. We propose an effective term weighting scheme that gives the highest accuracy among the text classification methods compared. We compared the performance of KNN and Naïve Bayes classification under different weighting methods, including information gain weighting, SVM, and the proposed method. We implemented many term-weighting methods (TWM) on Amazon data collections in combination with information gain and the SVM, KNN, and Naïve Bayes algorithms.
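For readers unfamiliar with term weighting, the sketch below shows the classic TF-IDF scheme, a common baseline in the vector space model. It is not the scheme proposed in the paper, which the abstract does not specify; the function name and the toy documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights.

    docs: list of token lists. Returns one {term: weight} dict per document,
    using relative term frequency and the unsmoothed log(N / df) as IDF.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in docs:
        term_freq = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


# Toy documents standing in for product reviews.
docs = [["good", "phone"], ["bad", "phone"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

A term that appears in every document, such as "phone" here, receives weight zero, which is the behaviour that alternative weighting schemes typically try to refine.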

