A Hybrid Approach to Vietnamese Word Segmentation Using Part of Speech Tags

People now spend much of their time on social sites, especially Twitter, where large numbers of tweets are posted every day. Many users rely on posted tweets to learn about particular applications, products, and other search engine queries. The emotions and sentiments expressed in these tweets can be extracted and used to gauge opinion about a particular event. Many traditional sentiment detection systems have been developed, but they fail to analyze large volumes of tweets, and online content with temporal patterns is also difficult for them to analyze. To overcome these issues, a co-ranking multi-modal sentiment analysis system based on natural language processing was developed to detect emotions in posted tweets. First, tweets about different events are collected from social sites and processed with natural language procedures such as stemming, lemmatization, part-of-speech tagging, word segmentation, and parsing to obtain the words from which sentiments are derived. A co-ranking process is then applied to the extracted emotions to obtain the opinion related to a particular event. The efficiency of the system is examined through experimental results and discussion. The introduced system recognizes sentiments in tweets with 98.80% accuracy.


UniBA @ KIPoS: A Hybrid Approach for Part-of-Speech Tagging

EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020 ◽

10.4000/books.aaccademia.7773 ◽

2020 ◽

pp. 501-506

Author(s):

Giovanni Luca Izzi ◽

Stefano Ferilli

Keyword(s):

Hybrid Approach ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Speech Tagging


A MATCH METHOD BASED ON LATENT SEMANTIC ANALYSIS FOR EARTHQUAKE HAZARD EMERGENCY PLAN

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-w7-137-2017 ◽

2017 ◽

Vol XLII-2/W7 ◽

pp. 137-141

Author(s):

D. Sun ◽

S. Zhao ◽

Z. Zhang ◽

X. Shi

Keyword(s):

Decision Maker ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

Semantic Space ◽

Word Segmentation ◽

Emergency Plan ◽

Match Method ◽

Part Of Speech ◽

Plan Retrieval ◽

Short Time

The structure of an earthquake emergency plan is complex, and it is difficult for a decision maker to reach a decision in a short time. To solve this problem, this paper presents a match method based on Latent Semantic Analysis (LSA). After word segmentation preprocessing of the emergency plans, keywords are extracted according to the part of speech and frequency of the words. Then, through LSA, the documents and the query information are mapped to the semantic space, and the correlation between documents and queries is calculated from the relation between their vectors. The experimental results indicate that LSA can effectively improve the accuracy of emergency plan retrieval.
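The retrieval steps described in this abstract can be illustrated with a minimal sketch. It assumes the plans have already been segmented and reduced to keywords, uses raw term counts, a truncated SVD from NumPy, and cosine similarity as the "relation between vectors"; the function names, the rank `k`, and the toy documents are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def term_document_matrix(docs, vocab):
    """Count matrix with one row per term and one column per document."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc:
            if token in index:
                matrix[index[token], j] += 1
    return matrix

def lsa_similarities(docs, query, k=2):
    """Project documents and a query into a rank-k semantic space
    and return the cosine similarity of the query to each document."""
    vocab = sorted({token for doc in docs for token in doc})
    A = term_document_matrix(docs, vocab)
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    Uk = U[:, :k]                      # top-k left singular vectors
    doc_vecs = (Uk.T @ A).T            # one k-dimensional row per document
    q = term_document_matrix([query], vocab)[:, 0]
    q_vec = Uk.T @ q                   # fold the query into the same space
    sims = []
    for vec in doc_vecs:
        denom = np.linalg.norm(vec) * np.linalg.norm(q_vec)
        sims.append(float(vec @ q_vec / denom) if denom > 0 else 0.0)
    return sims

# Hypothetical keyword lists standing in for segmented emergency plans.
plans = [
    ["earthquake", "rescue", "evacuation"],
    ["earthquake", "evacuation", "shelter"],
    ["flood", "dam", "drainage"],
]
query = ["rescue", "shelter"]
similarities = lsa_similarities(plans, query, k=2)
ranking = sorted(range(len(plans)), key=lambda i: -similarities[i])
```

In this toy setup the query shares no single plan's full vocabulary, yet both earthquake plans score high because they lie along the same latent direction, while the flood plan scores near zero.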


Example-based correction of word segmentation and part of speech labelling

Proceedings of the workshop on Human Language Technology - HLT '93 ◽

10.3115/1075671.1075724 ◽

1993 ◽

Cited By ~ 1

Author(s):

Tomoyoshi Matsukawa ◽

Scott Miller ◽

Ralph Weischedel

Keyword(s):

Word Segmentation ◽

Part Of Speech


Tibetan Word Segmentation as Sub-syllable Tagging with Syllable’s Part-of-Speech Property

Lecture Notes in Computer Science - Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data ◽

10.1007/978-3-319-25816-4_16 ◽

2015 ◽

pp. 189-201

Author(s):

Huidan Liu ◽

Congjun Long ◽

Minghua Nuo ◽

Jian Wu

Keyword(s):

Word Segmentation ◽

Part Of Speech


A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

Computational Intelligence and Neuroscience ◽

10.1155/2016/9821608 ◽

2016 ◽

Vol 2016 ◽

pp. 1-11 ◽

Cited By ~ 6

Author(s):

Phuoc Tran ◽

Dien Dinh ◽

Hien T. Nguyen

Keyword(s):

Machine Translation ◽

Hybrid Approach ◽

Sparse Data ◽

Word Segmentation ◽

Experimental Results ◽

Translation System ◽

Word Level ◽

Data Problem ◽

Sparse Data Problem ◽

Language Pair

Chinese and Vietnamese are both isolating languages in which words are not delimited by spaces. In machine translation, word segmentation is usually performed first when translating from Chinese or Vietnamese into other languages (typically English) and vice versa. However, when translating between two languages that do not use spaces between words, such as Chinese and Vietnamese, it is an open question whether the text should be segmented into words at all. Since Chinese-Vietnamese is a low-resource language pair, the sparse data problem is evident in translation systems for this pair, which makes the choice of whether to segment even more important. In this paper, we propose a new method for translating Chinese to Vietnamese that combines the advantages of character-level and word-level translation. At the word level, a hybrid approach that combines statistics and rules is used; at the character level, statistical translation is used. The experimental results show that our method improves machine translation performance over character-level or word-level translation alone.
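To make the difference between the two input granularities concrete, the sketch below tokenizes the same Chinese sentence at the character level and at the word level. The word-level function uses forward maximum matching against a small lexicon, a generic dictionary-based technique chosen here only for illustration; it is not the segmentation method used in the paper, and the lexicon, `max_len`, and example sentence are assumptions.

```python
def char_level(text):
    """Character-level units: every non-space character is a token."""
    return [ch for ch in text if not ch.isspace()]

def word_level(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest
    lexicon entry, falling back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Hypothetical lexicon and sentence for illustration only.
lexicon = {"自然", "语言", "处理", "自然语言"}
sentence = "我爱自然语言处理"
chars = char_level(sentence)
words = word_level(sentence, lexicon)
```

Character-level units avoid segmentation errors and keep the vocabulary small, while word-level units carry more meaning per token but are sparser, which is the trade-off the abstract refers to for a low-resource pair.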
