stochastic language
Recently Published Documents


TOTAL DOCUMENTS: 47 (FIVE YEARS: 0)

H-INDEX: 12 (FIVE YEARS: 0)

Author(s):  
Md. Iftakher Alam Eyamin ◽  
Md. Tarek Habib ◽  
Muhammad Ifte Khairul Islam ◽  
Md. Sadekur Rahman ◽  
Md. Abbas Ali Khan

<p class="Abstract">Word completion and word prediction are two important phenomena in typing that have a significant effect on aiding disabled people and students using keyboards or other similar devices. Such autocomplete techniques also help students considerably during the learning process by constructing proper keywords during web searching. A great deal of work has been conducted for English, but for Bangla it remains very inadequate, and the metrics used for performance evaluation are not yet rigorous. Bangla is one of the most widely spoken languages (3.05% of the world population) and ranks seventh among all languages in the world. In this paper, word prediction on Bangla sentences using stochastic, i.e. <em>N</em>-gram-based, language models is proposed to autocomplete a sentence by predicting a set of words rather than a single word, as was done in previous work. A novel approach is proposed to find the optimum language model based on a performance metric. In addition, to achieve better performance, a large Bangla corpus of different word types is used.</p>
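The core idea of an <em>N</em>-gram word predictor can be sketched in a few lines: count which words follow each context in a corpus, then rank the successors by frequency. The toy English corpus, the bigram (<em>N</em> = 2) choice, and the <code>top_k</code> parameter below are illustrative assumptions, not the paper's actual data or model.

```python
from collections import defaultdict

def train_bigram(corpus):
    """Count, for each word, how often each successor word follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word, top_k=3):
    """Return the top_k most frequent words observed after `word`."""
    successors = counts.get(word, {})
    return sorted(successors, key=successors.get, reverse=True)[:top_k]

# Hypothetical toy corpus standing in for a large Bangla corpus.
corpus = [
    "the cat sat on the mat",
    "the cat ran on the road",
    "the dog sat on the mat",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # → ['cat', 'mat', 'road']
```

Predicting a set of candidate words (here `top_k`) rather than a single word is what lets the same machinery drive both word completion and sentence autocomplete.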


PLoS ONE ◽  
2017 ◽  
Vol 12 (5) ◽  
pp. e0177794 ◽  
Author(s):  
Alessandro Lopopolo ◽  
Stefan L. Frank ◽  
Antal van den Bosch ◽  
Roel M. Willems

2015 ◽  
Author(s):  
Tsung-Hsien Wen ◽  
Milica Gasic ◽  
Dongho Kim ◽  
Nikola Mrksic ◽  
Pei-Hao Su ◽  
...  

2014 ◽  
Vol 40 (4) ◽  
pp. 763-799 ◽  
Author(s):  
François Mairesse ◽  
Steve Young

Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank the output of a nondeterministic generator, and (b) using statistics to determine the generation decisions of an existing generator. Both approaches rely on the existence of a handcrafted generation component, which is likely to limit their scalability to new domains. The first contribution of this article is to present Bagel, a fully data-driven generation method that treats the language generation task as a search for the most likely sequence of semantic concepts and realization phrases, according to Factored Language Models (FLMs). As domain utterances are not readily available for most natural language generation tasks, a large creative effort is required to produce the data necessary to represent human linguistic variation for nontrivial domains. This article is based on the assumption that learning to produce paraphrases can be facilitated by collecting data from a large sample of untrained annotators using crowdsourcing—rather than a few domain experts—by relying on a coarse meaning representation. A second contribution of this article is to use crowdsourced data to show how dialogue naturalness can be improved by learning to vary the output utterances generated for a given semantic input. Two data-driven methods for generating paraphrases in dialogue are presented: (a) by sampling from the n-best list of realizations produced by Bagel's FLM reranker; and (b) by learning a structured perceptron predicting whether candidate realizations are valid paraphrases. We train Bagel on a set of 1,956 utterances produced by 137 annotators, which covers 10 types of dialogue acts and 128 semantic concepts in a tourist information system for Cambridge. An automated evaluation shows that Bagel outperforms utterance class LM baselines on this domain. A human evaluation of 600 resynthesized dialogue extracts shows that Bagel's FLM output produces utterances comparable to a handcrafted baseline, whereas the perceptron classifier performs worse.
Interestingly, human judges find the system sampling from the n-best list to be more natural than a system always returning the first-best utterance. The judges are also more willing to interact with the n-best system in the future. These results suggest that capturing the large variation found in human language using data-driven methods is beneficial for dialogue interaction.
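The n-best sampling idea described above—drawing a realization from the reranker's candidate list rather than always emitting the top candidate—can be sketched as score-weighted sampling. The candidate utterances and scores below are hypothetical, and sampling proportional to score is an illustrative simplification, not Bagel's exact procedure.

```python
import random

# Hypothetical n-best list of (realization, reranker score) pairs
# for a single semantic input in a tourist-information domain.
n_best = [
    ("There is a cheap restaurant in the centre.", 0.6),
    ("A cheap restaurant can be found in the city centre.", 0.3),
    ("In the centre you will find a cheap restaurant.", 0.1),
]

def sample_realization(candidates, rng=random):
    """Sample one surface realization, weighted by its score."""
    utterances, scores = zip(*candidates)
    return rng.choices(utterances, weights=scores, k=1)[0]

print(sample_realization(n_best))  # varies across calls
```

Because successive calls can return different paraphrases for the same semantic input, this is one simple way to inject the output variation that the human judges preferred over a first-best-only system.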


2010 ◽  
Vol 3 (4) ◽  
pp. 523-531
Author(s):  
Minghua Deng ◽  
Xiangzhong Fang ◽  
Peng Ge ◽  
Luhua Lai ◽  
Guojun Pei ◽  
...  

2009 ◽  
Vol 21 (2-3) ◽  
pp. 145-159 ◽  
Author(s):  
Carlos Pérez-Sancho ◽  
David Rizo ◽  
José M. Iñesta
