Automated document content characterization for a multimedia document retrieval system

Author(s):  
Maija Koivusaari ◽  
Jaakko J. Sauvola ◽  
Matti Pietikaeinen
Author(s):  
Jae-Woo Chang ◽  
Du-Seok Jin

Users now commonly acquire a variety of multimedia documents through the World Wide Web. As the number of Web documents increases dramatically, there is a need for a multimedia document retrieval system that supports both structure-based and content-based retrieval. To support structure-based retrieval, we design efficient index structures (keyword, structure, element, and attribute indexes) and implement them using the o2store storage system. For content-based retrieval, we implement a high-dimensional index structure, based on the X-tree, for color and shape features. Finally, we evaluate our multimedia document retrieval system in terms of system efficiency (retrieval time, insertion time, and storage overhead) and system effectiveness (recall and precision).
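
As a rough illustration of the content-based side described above, the sketch below builds a coarse color-histogram feature vector, answers a nearest-neighbor query, and computes precision and recall. It is not the authors' implementation: a linear scan stands in for the X-tree, and the function names (`color_histogram`, `nearest`, `precision_recall`), the bin count, and the sample pixel data are all assumptions made for this example.

```python
from math import sqrt


def color_histogram(pixels, bins_per_channel=2):
    """Quantize RGB pixels (0-255 per channel) into a normalized histogram."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in the range 0..n-1.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec, index, k=1):
    """Return the k closest document ids (linear scan, NOT an X-tree)."""
    ranked = sorted(index, key=lambda doc_id: euclidean(query_vec, index[doc_id]))
    return ranked[:k]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample data: two indexed images and one query image.
index = {
    "red_doc": color_histogram([(255, 0, 0)] * 4),
    "blue_doc": color_histogram([(0, 0, 255)] * 4),
}
query = color_histogram([(250, 10, 10)] * 3 + [(0, 0, 255)])
top = nearest(query, index, k=1)
```

A tree-based index such as the X-tree would replace the sort in `nearest` with a search that prunes most of the feature space, which matters once the collection is large.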



2020 ◽  
Vol 4 (3) ◽  
pp. 551-557
Author(s):  
Muhammad Zaky Ramadhan ◽  
Kemas Muslim Lhaksmana

Hadiths have several levels of authenticity, among which are weak (dhaif) and fabricated (maudhu) hadiths that may not originate from the Prophet Muhammad PBUH and thus should not be considered when deriving Islamic law (sharia). However, ordinary Muslims commonly mistake many such hadiths for authentic ones. To make such hadiths easier to distinguish, this paper proposes a method that checks the authenticity of a hadith by comparing it with a collection of fabricated hadiths in Indonesian. The proposed method applies the vector space model and performs spelling correction using SymSpell to test whether spell checking improves the accuracy of hadith retrieval; this has not been done in previous work, and typos are common in Indonesian-translated hadiths in raw text from the Web and social media. The experimental results show that spell checking improves mean average precision from 73% to 81% and recall from 80% to 89%. This improvement makes the hadith retrieval system more feasible, and spelling correction is encouraged in future work because it can correct the typos that are common in raw text on the Internet.
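
The pipeline described above (spelling correction followed by vector space retrieval) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: the standard-library `difflib.get_close_matches` stands in for SymSpell, the TF-IDF weighting is one common variant, and the helper names, the similarity cutoff, and the placeholder documents are all hypothetical.

```python
import math
from collections import Counter
from difflib import get_close_matches


def tokenize(text):
    return text.lower().split()


def correct(tokens, vocabulary):
    """Replace out-of-vocabulary tokens with the closest known term, if any.

    difflib is a stand-in here; the paper uses SymSpell.
    """
    fixed = []
    for tok in tokens:
        if tok in vocabulary:
            fixed.append(tok)
        else:
            match = get_close_matches(tok, list(vocabulary), n=1, cutoff=0.8)
            fixed.append(match[0] if match else tok)
    return fixed


def tfidf(tokens, idf):
    """Sparse TF-IDF vector over the terms known to the index."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def build_index(docs):
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {doc_id: tfidf(toks, idf) for doc_id, toks in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, idf, vectors, spell_check=True):
    """Rank documents by cosine similarity to the (optionally corrected) query."""
    tokens = tokenize(query)
    if spell_check:
        tokens = correct(tokens, idf)
    q = tfidf(tokens, idf)
    scores = {doc_id: cosine(q, vec) for doc_id, vec in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder collection (not real hadith text).
docs = {
    "d1": "seek knowledge in every place",
    "d2": "cleanliness is part of faith",
}
idf, vectors = build_index(docs)
with_fix = search("knowlege", idf, vectors, spell_check=True)
without_fix = search("knowlege", idf, vectors, spell_check=False)
```

In this toy collection the misspelled query `knowlege` matches nothing without correction, so every score is zero, while the corrected query ranks `d1` first. That is the effect the abstract measures at scale through mean average precision and recall.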


2020 ◽  
Vol 9 (4) ◽  
pp. 1
Author(s):  
MATHEW EDWIN ◽  
L. KARTHIKEYAN ◽  
B. MUTHU SENTHIL ◽  
◽  
◽  
...  
