Development of online travel Web scraping for tourism statistics in Indonesia

Author(s):  
Yustiar Adhinugroho ◽  
Amanda Putra ◽  
Muhammad Luqman ◽  
Geri Ermawan ◽  
...  

Introduction. This research studies a novel approach to producing tourism statistics, especially accommodation statistics, in Indonesia by scraping online travel agent Websites. Method. Accommodation data (e.g., room availability and price) were gathered from two of the largest online travel agencies in Indonesia. All data were collected automatically from the sites' URLs listed in the sitemap. Analysis. The data were collected daily from 6 March to 27 July 2019. Datasets from the two Websites were merged. The room occupancy rate (ROR) for each province was calculated and compared with the official statistics from Statistics Indonesia. Results. The online room occupancy rates and the official statistics follow a similar pattern, indicating that Web scraping provides valuable information for measuring the room occupancy rate, with an advantage in cost and collection time. Conclusions. It is feasible to use big data as a proxy for, or a complement to, official statistics, especially tourism statistics. With Web scraping, an indicator that usually requires significant time and cost can be produced in real time and at lower cost. This new approach would improve the quality of tourism statistics produced by BPS Statistics Indonesia.
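
As a rough illustration of the analysis step described above, the following Python sketch merges listings from two sources and computes a room occupancy rate per province. The field names (`hotel`, `date`, `province`, `total_rooms`, `available_rooms`), the rule of keeping the first record seen for each hotel and date, and the assumption that occupied rooms equal total rooms minus rooms still listed as available are all illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict


def merge_listings(*sources):
    """Merge records from several sites, keeping one record per (hotel, date).

    Assumption: a hotel listed on both sites is identified by the same
    hypothetical `hotel` key; the first record encountered wins.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = (record["hotel"], record["date"])
            merged.setdefault(key, record)
    return list(merged.values())


def room_occupancy_rate(records):
    """Return {province: ROR in percent}.

    Assumption: occupied rooms = total_rooms - available_rooms.
    """
    totals = defaultdict(lambda: [0, 0])  # province -> [occupied, total]
    for r in records:
        occupied = r["total_rooms"] - r["available_rooms"]
        totals[r["province"]][0] += occupied
        totals[r["province"]][1] += r["total_rooms"]
    return {
        province: 100.0 * occupied / total
        for province, (occupied, total) in totals.items()
        if total > 0
    }
```

A per-province series produced this way could then be compared against the official monthly figures, for example by plotting both or computing their correlation.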

Author(s):  
Avishay Bransky ◽  
Anders Larsson ◽  
Elisabeth Aardal ◽  
Yaara Ben-Yosef ◽  
Robert H Christenson

Background. The need for rapid point-of-care (POC) diagnostics is becoming more evident due to the increasing demand for timely results and improved healthcare services. With the recent COVID-19 pandemic, POC testing has become critical in managing the spread of disease. Applicable diagnostics should be readily deployable, easy to use, portable, and accurate so that they fit mobile laboratories, pop-up treatment centers, field hospitals, secluded wards within hospitals, or remote regions, and can be operated by staff with minimal training. Complete blood count (CBC), however, has not been available at the POC in a simple-to-use device until recently. The HemoScreen, which was recently cleared by the FDA for POC use, is a miniature, easy-to-use instrument that uses disposable cartridges and may fill this gap. Content. The HemoScreen's analysis method, in contrast to standard laboratory analyzers, is based on machine vision (image-based analysis) and artificial intelligence (AI). We discuss the different methods currently used and compare their results to the vision-based one. The HemoScreen is found to correlate well with laser- and impedance-based methods. Emphasis is given to mean cell volume (MCV), mean cell hemoglobin (MCH), and platelets (PLT), which show better correlation when the vision-based method is compared with itself, owing to essential differences between the underlying technologies. Summary. The HemoScreen analyzer demonstrates lab-equivalent performance, tested in different clinical settings and with different sample characteristics, and might outperform standard techniques in the presence of certain interferences. This new approach to hematology testing has great potential to improve quality of care in a variety of settings.


Author(s):  
Vinod Podichetty ◽  
Robert Biscup

The Internet offers an unprecedented opportunity for healthcare information to be disseminated instantaneously. The quality of information, both scientific and nonscientific, and the development of tools to disseminate information securely via the Internet are the two most important issues in achieving effective and wider exchange of health information. For the first time, information can be exchanged simultaneously and interactively all around the world, with the potential of being equally available to healthcare professionals and patients. The big difference between yesterday's knowledge-based patient care and tomorrow's is the fundamental premise that patients will explore the Web with a desire to learn more about their condition, including its treatment and prognosis. This has evolved into the concept of e-health (electronic health). Evaluating and examining the information conveyed via the Internet is necessary for the Internet to be an effective tool in healthcare.


Author(s):  
Elisa Chiapponi ◽  
Marc Dacier ◽  
Onur Catakoglu ◽  
Olivier Thonnard ◽  
Massimiliano Todisco

Airline websites are the victims of unauthorised online travel agencies and aggregators that use armies of bots to scrape prices and flight information. These so-called Advanced Persistent Bots (APBs) are highly sophisticated. Besides taking away valuable information, their huge quantities of requests consume a very substantial amount of resources on the airlines' websites. In this work, we propose a deceptive approach to counter scraping bots. We present a platform capable of mimicking airlines' sites while changing prices at will, and we provide results from the case studies we performed with it. We lured bots for almost two months and fed them inaccurate information that they could not distinguish from genuine data. Studying the collected requests, we found behavioural patterns that could be used as a complementary means of bot detection. Moreover, based on the gathered empirical evidence, we propose a method to investigate the commonly made claim that proxy services used by web scraping bots have millions of residential IPs at their disposal. Our mathematical models indicate that the number of IPs is likely 2 to 3 orders of magnitude smaller than claimed. This finding suggests that an IP reputation-based blocking strategy could be effective, contrary to what operators of these websites think today.


2017 ◽  
Vol 55 (9) ◽  
pp. 1888-1904 ◽  
Author(s):  
Maurizio Massaro ◽  
John Dumay ◽  
Carlo Bagnoli

Purpose The purpose of this paper is to investigate intellectual capital (IC) discussions held between investors using Web 2.0 tools. More precisely, this paper investigates the determinants of IC disclosures (ICDs) on internet stock message boards (IMBs). Design/methodology/approach Four hypotheses were developed and tested through content analysis of 60,996 messages posted on two main IMBs, Yahoo!Finance and TheLion.com, followed by descriptive statistics and logistic regression testing. Findings The findings show that Web 2.0 is bringing new opportunities to disclose IC. Traditional theories, such as agency, stakeholder, signalling, and legitimacy theory, cannot be applied to the Web 2.0 context. Therefore, a new approach that focusses more on the personal motivations for disclosing IC is called for. At a glance, the results show that IC is disclosed on IMBs, and several elements influence both the quantity and quality of those disclosures. Sometimes “trolls” disturb the dialogue and discourage participation by other investors. Conversely, online influencers facilitate ICD. To filter messages, the time of posting, the length of the messages, and the sentiment the messages contain should be considered along with the author of the message. Originality/value This paper contributes to the existing literature by investigating the IC disclosed on IMBs. The findings provide insights about how ICDs are developed using Web 2.0 tools.


2021 ◽  
pp. 1-14
Author(s):  
Ayoub Faramarzi ◽  
Reza Hadizadeh ◽  
Saeed Fayyaz ◽  
Sohrab Sajadimanesh ◽  
Abbas Moradi

Data pervasiveness was made possible by the advent of new technologies such as the Internet and the World Wide Web in every human and non-human activity. This created an exponential increase, or data explosion, in data generation, captured by the term Big Data. Big Data sources can help reduce response burden, or they can be used to study economic or social phenomena before designing a statistical survey, which is inherently expensive to pilot. Incorporating Big Data sources into official statistics also means maintaining the competitive advantage and relevance of official statistics products compared to those provided by a plethora of commercial players, notably large corporations active in information technology. In this paper, the web scraping technique was used to extract the daily prices of food and drink products in order to replace the conventionally collected prices used for price indices. Moreover, these new datasets make it possible to calculate the indices on smaller time scales, such as weekly or daily, whereas the conventional approach is possible only on a monthly basis. Although web scraping has its own problems, it is more economical, accurate, and time-saving, especially in urban areas. Findings revealed that the web scraping technique can be applied as an effective alternative to conventional methods for the consumer price index (CPI). This technique can also be used for other price statistics.
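
One common way to turn scraped prices into an elementary price index is the Jevons index, the geometric mean of price relatives between a base period and the current period. The Python sketch below is a hypothetical illustration only: the abstract does not state which index formula the authors used, and the function name and the product-to-price dictionaries are assumptions made for the example.

```python
import math


def jevons_index(base_prices, current_prices):
    """Jevons elementary index (base = 100) over products present in both periods.

    Both arguments are hypothetical dicts mapping a product name to its price.
    Products missing from either period are ignored (matched-model approach).
    """
    common = sorted(set(base_prices) & set(current_prices))
    if not common:
        raise ValueError("no products are priced in both periods")
    # Average the log price relatives, then exponentiate: geometric mean.
    mean_log_relative = sum(
        math.log(current_prices[p] / base_prices[p]) for p in common
    ) / len(common)
    return 100.0 * math.exp(mean_log_relative)
```

Calling the function once per scraping day against a fixed base day would yield a daily index series, which is the kind of finer time scale the abstract mentions.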


Author(s):  
Christina Sudyasjayanti ◽  
Auditia Setiobudi

ABSTRACT: The increasing number of internet users in Indonesia in recent years has greatly affected the growth of e-commerce businesses. One fast-growing business is online travel services such as Traveloka, Tiket.com, Agoda, Pegipegi, and so on. Online travel services offer very diverse facilities, ranging from websites to applications. The quality of online services (e-Service Quality) can make the purchase, sale, and delivery of both products and services more efficient and effective. E-Service Quality, or E-ServQual, is a new version of Service Quality (ServQual). E-ServQual was developed to evaluate services provided over the Internet. The study uses eight dimensions of E-ServQual: website design, reliability, responsiveness, security, fulfillment, personalization, information, and empathy. The sample consists of 38 respondents selected by purposive sampling, based on their experience using online travel agents in Indonesia. This research uses factor analysis to identify new dimension constructs of e-ServQual for online travel agencies (OTA) in Indonesia.

Keyword: e-ServQual, e-commerce, Online Travel Agent (OTA).

