Design of Chinese Word Segmentation System Based on Improved Chinese Converse Dictionary and Reverse Maximum Matching Algorithm

Author(s): Liyi Zhang, Yazi Li, Jian Meng
2019, Vol 267, pp. 04001
Author(s): Zhibin Xiong

This paper proposes an improved Trie tree structure in which each tree node records the position information of the characters that participate in word formation, and child nodes are located through a hash search mechanism. On this basis, the forward maximum matching algorithm for Chinese word segmentation is optimized. During segmentation, an automaton mechanism judges whether the characters read so far constitute the longest word, which removes the need for the forward maximum matching algorithm to adjust the candidate string according to word length. The reported time complexity of the algorithm is 1.33, and comparison tests show that it achieves a fast word segmentation speed. The forward maximum matching algorithm based on the improved Trie tree structure improves Chinese word segmentation speed, especially when the dictionary structure needs to be updated in real time.
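
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `TrieNode`, `Trie`, `longest_match`, and `segment` are hypothetical, Python's built-in `dict` stands in for the hash search over child nodes, and the per-node position information described in the abstract is omitted. The sketch only shows how a single walk down the trie finds the longest dictionary word without first guessing a window length and then shortening it.

```python
class TrieNode:
    """One trie node; children are kept in a hash map (assumption: a plain dict)."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # character -> TrieNode, average O(1) lookup
        self.is_word = False  # True if the path from the root spells a dictionary word


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        """Add a word; the dictionary can be extended at any time."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def longest_match(self, text, start):
        """Return the end index of the longest dictionary word beginning at start.

        The walk behaves like an automaton: it advances one character at a
        time and remembers the last position at which a word ended.
        Returns start itself if no dictionary word begins there.
        """
        node = self.root
        end = start
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                end = i + 1
        return end


def segment(text, trie):
    """Forward maximum matching: repeatedly take the longest word from the left."""
    tokens = []
    pos = 0
    while pos < len(text):
        end = trie.longest_match(text, pos)
        if end == pos:
            end = pos + 1  # unknown character: emit it as a single-character token
        tokens.append(text[pos:end])
        pos = end
    return tokens


if __name__ == "__main__":
    trie = Trie(["研究", "研究生", "生命", "起源"])
    print(segment("研究生命起源", trie))
```

With this small sample dictionary, the script prints `['研究生', '命', '起源']`: the greedy left-to-right match takes 研究生 first, which is the expected behavior of forward maximum matching. Because `insert` only touches the nodes along one path, new words can be added between calls to `segment` without rebuilding the structure.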

