Study on Unknown Term Translation Mining from Google Snippets

Information · 2019 · Vol 10 (9) · pp. 267
Author(s): Bin Li, Jianmin Yao

Bilingual web pages are widely used to mine translations of unknown terms. This study focused on an effective solution for obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. The research used co-occurrence information to obtain subject terms, then expanded the source query with the translations of those subject terms to collect effective bilingual search engine snippets. Valid candidates were then extracted from the small, noisy bilingual corpora using an improved frequency-change measurement that incorporates adjacency information. Finally, a method combining surface patterns, frequency–distance, and phonetic features was developed to select an appropriate translation. The experimental results revealed that the proposed method performs remarkably well for mining translations of unknown terms.
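A minimal sketch of the frequency–distance idea the abstract mentions (a hypothetical scoring function, not the authors' exact formula): candidates that co-occur often with the source term, and close to it, score higher.

```python
# Hypothetical frequency-distance ranking of translation candidates.
# A candidate scores higher the more snippets it shares with the source
# term and the closer (by character offset) it appears to that term.

def rank_candidates(snippets, source_term, candidates):
    scores = {}
    for cand in candidates:
        freq, total_dist = 0, 0
        for snip in snippets:
            if source_term in snip and cand in snip:
                freq += 1
                total_dist += abs(snip.index(source_term) - snip.index(cand))
        if freq:
            # Higher frequency and smaller average distance -> higher score.
            scores[cand] = freq / (1 + total_dist / freq)
    return sorted(scores, key=scores.get, reverse=True)
```

A real system would add the surface-pattern and phonetic features on top of this base score; the sketch only illustrates the frequency–distance component.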

2013 · Vol 25 · pp. 189-203
Author(s): Dominik Schlosser

This paper attempts to give an overview of the different representations of the pilgrimage to Mecca found in the ‘liminal space’ of the internet. For that purpose, it examines a handful of emblematic examples of how the hajj is presented and discussed in cyberspace. Special attention is paid to how far issues of religious authority are manifest on these websites: whether the content providers of web pages appoint themselves as authorities by scrutinizing established views of the fifth pillar of Islam, whether they upload already printed texts onto their sites in order to reiterate normative notions of the pilgrimage to Mecca, or whether they make use of search engine optimisation techniques, thus heightening the visibility of their online presence and increasing the possibility of becoming authoritative in shaping internet surfers’ perceptions of the hajj.


Author(s): Rizwan Ur Rahman, Rishu Verma, Himani Bansal, Deepak Singh Tomar

With the explosive expansion of information on the world wide web, search engines are becoming more significant in the day-to-day lives of humans. Even though a search engine generally returns a huge number of results for a given query, most search engine users view only the first few web pages in the result lists. Consequently, ranking position has become a major concern of internet service providers. This article addresses the vulnerabilities, spamming attacks, and countermeasures in blogging sites. The first part explores the spamming types and includes a detailed section on vulnerabilities. The next part presents an attack scenario of form spamming together with a defense approach. The aim of this article is thus to provide a review of the vulnerabilities and spamming threats associated with blogging websites, and of effective measures to counter them.
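One widely used defense against form spamming (a common technique, not necessarily the exact approach the article presents) is a honeypot field: an extra input hidden from humans by CSS, which bots that auto-fill every field reveal themselves by filling. A minimal check might look like this, where the field name `website` is an arbitrary illustration:

```python
# Honeypot check against form spamming: the hidden field is never
# visible to humans, so any non-empty value marks the submission as
# bot-generated.

def is_spam_submission(form_data, honeypot_field="website"):
    # Humans never see the honeypot field; bots tend to fill it anyway.
    return bool(form_data.get(honeypot_field, "").strip())
```

In practice this is combined with rate limiting and content filters, since sophisticated bots can learn to skip hidden fields.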


Author(s): Ricard Monclús-Guitart, Teresa Torres-Coronas, Araceli Rodríguez-Merayo, M. Arántzazu Vidal-Blasco, Mario Arias-Oliva

The European Credit Transfer System establishes a calculation based on the work students do, rather than on direct teaching hours as in the current credit system. These are known as ECTS credits, and they represent the amount of work a student needs to do to pass a subject. In short, ECTS credits quantify the work needed to learn a subject, including theory, practical classes, seminars and exams, as well as anything the student has done individually that can be evaluated. This is where a Wiki provides a new space for students, in which they can and should introduce information on matters related to the subject, as well as edit, correct, expand and improve the existing information. This information, a collection of hypertext web pages, would make it possible to create a computer application based on the collaborative work of the students, accessible to any student from any internet connection. At the same time, it can be assessed and therefore form part of the student’s final grade for the subject. The aim of this chapter is to show a methodology that enables a Wiki to be used for professional learning. First, the authors define what a Wiki is; second, they discuss the Wiki as a collaborative teaching instrument; and third, they deal with Wikis as a tool for educational assessment.


Author(s): Ravi P. Kumar, Ashutosh K. Singh, Anand Mohan

In this era of Web computing, cyber security is very important as more and more data moves onto the Web. Some of these data are confidential and important, and they face many threats. Some basic threats can be addressed by designing web sites properly using search engine optimization techniques. One such threat is the hanging page, which gives room for link spamming. This chapter addresses the issues caused by hanging pages in Web computing and has four main objectives: 1) compare and review the different types of link-structure-based ranking algorithms for ranking web pages, with PageRank used as the base algorithm throughout the chapter; 2) study hanging pages, explore their effects on Web security, and compare the existing methods of handling them; 3) study link spam and explore the contribution of hanging pages to it; and 4) study search engine optimization (SEO) / web site optimization (WSO) and explore the effect of hanging pages on SEO.
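A hanging (dangling) page is one with no out-links, which leaves the PageRank random-surfer model ill-defined unless its rank mass is handled explicitly. A minimal power-iteration sketch (one common handling strategy, not necessarily the chapter's) redistributes the mass held by hanging pages uniformly over all pages:

```python
# Minimal PageRank with hanging-page handling: pages with no out-links
# would otherwise leak rank mass, so their mass is spread uniformly.

def pagerank(links, d=0.85, iters=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Total rank currently held by hanging pages (empty link lists).
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {}
        for p in pages:
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            # Teleport term + damped (incoming + uniform dangling share).
            new[p] = (1 - d) / n + d * (incoming + dangling / n)
        rank = new
    return rank
```

With this redistribution the ranks always sum to one, which is exactly the property link spammers exploit when they deliberately create hanging pages to soak up and redirect rank mass.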


Author(s): Oğuzhan Menemencioğlu, İlhami Muharrem Orak

The semantic web works on producing machine-readable data and aims to deal with large amounts of data. The most important tool for accessing data on the web is the search engine. Traditional search engines are insufficient in the face of the amount of data contained in existing web pages; semantic search engines extend traditional engines and overcome these difficulties. This paper summarizes the semantic web, the concepts and infrastructure of traditional and semantic search engines, and details semantic search approaches. A summary of the literature is provided, touching on the trends; in this respect, the types of applications and the areas they address are considered. Based on data for two different years, the trends on these points are analyzed and the impacts of the changes are discussed. The analysis shows that the evolution of the semantic web continues and that new applications and areas keep emerging. Multimedia retrieval is a new scope of semantic search; hence, multimedia retrieval approaches are discussed, and text and multimedia retrieval are analyzed within semantic search.


2016 · Vol 6 (2) · pp. 41-65
Author(s): Sheetal A. Takale, Prakash J. Kulkarni, Sahil K. Shah

Information available on the internet is huge, diverse and dynamic. Current search engines do the task of intelligently helping internet users: for a query, they provide a list of the best-matching or most relevant web pages. However, the information for a query is often spread across multiple pages returned by the search engine, which degrades the quality of the search results; the search engines are drowning in information but starving for knowledge. Here, the authors present query-focused extractive summarization of search engine results. They propose a two-level summarization process: identification of relevant theme clusters, and selection of the top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved by applying the MDL (minimum description length) principle, and sentence clustering and ranking are done using SNMF. The experiments conducted demonstrate the effectiveness of the system in semantic text understanding, document clustering and summarization.
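The sentence-selection step can be illustrated with a toy query-focused extractor. This sketch ranks sentences by plain word overlap with the query rather than the paper's semantic-role similarity or SNMF clustering, so it only shows the overall shape of "score sentences against the query, keep the top k":

```python
# Toy query-focused extractive summarization: score each sentence by
# the number of query terms it contains, return the k best sentences.

def summarize(sentences, query, k=2):
    query_terms = set(query.lower().split())
    scored = sorted(sentences,
                    key=lambda s: len(query_terms & set(s.lower().split())),
                    reverse=True)
    return scored[:k]
```

Replacing the overlap score with a semantic similarity measure, and adding the theme-cluster stage before selection, would move this toy toward the system the abstract describes.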


2019 · Vol 16 (9) · pp. 3712-3716
Author(s): Kailash Kumar, Abdulaziz Al-Besher

This paper examines the overlap of the results retrieved by three major search engines, namely Google, Yahoo and Bing. A rigorous analysis of the overlap among these search engines was conducted on 100 random queries. The overlap of the first ten web page results, i.e., hundred results from each search engine, was taken into consideration, counting only non-sponsored results. Each search engine has its own update frequency and ranks results by its own relevance measures; moreover, sponsored search advertisers differ between search engines, and no single search engine can index all web pages. The overlap analysis of the results was carried out between October 1, 2018 and October 31, 2018 among these major search engines. A framework built in Java analyzes the overlap among them; it eliminates the common results and merges them into a unified list, and it uses a ranking algorithm to re-rank the search engine results and display them back to the user.
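The core of such an overlap analysis can be sketched in a few lines (a simplification of the paper's Java framework: first-seen order stands in for its re-ranking algorithm):

```python
# Overlap and merge of per-engine result lists, identified by URL.

def overlap(results_a, results_b):
    # Number of URLs the two engines' result lists share.
    return len(set(results_a) & set(results_b))

def merge_results(*result_lists):
    # Unified list with duplicates removed, keeping first-seen order.
    seen, merged = set(), []
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

For example, if Google returns `["u1", "u2", "u3"]` and Bing returns `["u2", "u4"]`, the overlap is 1 and the merged list contains four distinct URLs; a re-ranking step would then reorder that unified list before display.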


2018 · Vol 6 (3) · pp. 67-78
Author(s): Tian Nie, Yi Ding, Chen Zhao, Youchao Lin, Takehito Utsuro

The background of this article is the issue of how to get an overview of the knowledge around a given query keyword. In particular, the authors focus on the concerns of those who search for web pages with a given query keyword. The web search information needs of a given query keyword are collected through search engine suggestions. Given a query keyword, the authors collect up to around 1,000 suggestions, many of which are redundant, and classify the redundant suggestions based on a topic model. However, one limitation of the topic-model-based classification of suggestions is that the granularity of the topics, i.e., of the clusters of suggestions, is too coarse. To overcome this coarse-grained classification, the article further applies the word embedding technique to the web pages used during the training of the topic model, in addition to the text of the whole Japanese version of Wikipedia. The authors then examine the word-embedding-based similarity between suggestions and classify the suggestions within a single topic into finer-grained subtopics based on that similarity. Evaluation results show that the proposed approach performs well in the task of subtopic classification of search engine suggestions.
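The finer-grained step can be sketched as grouping suggestions by cosine similarity of their embedding vectors. This is a toy greedy grouping with made-up two-dimensional vectors, not the authors' trained embeddings or clustering procedure:

```python
import math

# Cosine similarity between two embedding vectors.
def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

# Greedy single-pass grouping: attach each suggestion to the first
# subtopic whose seed vector is similar enough, else start a new one.
def group_by_similarity(embeddings, threshold=0.8):
    groups = []
    for name, vec in embeddings.items():
        for group in groups:
            if cosine(vec, embeddings[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

With real word embeddings, near-synonymous suggestions ("price", "cost") land in one subtopic while unrelated ones ("review") start their own.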


2016 · Vol 3 (2) · pp. 152-164
Author(s): Hamid Sharifi

In this research, we studied localized commercial texts of globalized companies in the context of intertextuality on three levels: lexical, thematic, and cultural. From among the many products of the three companies under study (Samsung, LG, and Sony), four smartphone models of each were selected (twelve in total). Their introductory web pages in both Persian and English were the sources of the data. Furthermore, we used an online analyzer tool (online-utility.org/text/analyzer.jsp) to analyze the data; the results were also corroborated with other software packages and applications. Against the backdrop of booming globalization, a better understanding of cross-cultural vocative communication proves helpful, and one of the most active areas is the study of flagship brands, where rivals try their best to localize their devices to the liking of potential customers. Descriptive and explanatory methods were brought into play to compare the English and Persian commercial texts. The research revealed the critical role intertextuality plays in the process of glocalization. Developing companies should note that they, too, could utilize this great potential in the context of web localization. The findings would therefore be of benefit to chief executive officers (CEOs), product developers and scholars interested in the subject.

