Design and Implementation of a Big Data Evaluator Recommendation System Using Deep Learning Methodology

Sukil Cha; Mun Y. Yi; Sekyoung Youm

doi:10.3390/app10228000

Applied Sciences ◽

2020 ◽

Vol 10 (22) ◽

pp. 8000

Author(s):

Sukil Cha ◽

Mun Y. Yi ◽

Sekyoung Youm

Keyword(s):

Big Data ◽

Deep Learning ◽

Full Text ◽

Recommendation System ◽

Selection Process ◽

Korean Literature ◽

Inverse Document Frequency ◽

Design And Implementation ◽

Research Fields ◽

Document Frequency

As the number of researchers in South Korea has grown, unsuccessful applicants have become increasingly dissatisfied with the selection process for national research and development (R&D) projects. In this study, we designed a system that recommends the best possible R&D evaluators using big data that are collected from related systems, refined, and analyzed. Our big data recommendation system compares keywords extracted from applications with keywords extracted from the full text of the evaluator candidates' achievements. Weights for different keywords are scored using the term frequency–inverse document frequency algorithm. Using the keywords extracted from the evaluator candidates' achievements, a project comparison module searches, scores, and ranks these achievements by their similarity to the project applications. The similarity scoring module calculates the overall similarity scores for different candidates based on the project comparison module scores. To assess the performance of the recommendation system, evaluator candidates were recommended for 61 applications in three Review Board (RB) research fields (system fusion, organic biochemistry, and Korean literature) in the same manner as the RB's recommendation. Our tests reveal that, for these 61 applications in different areas, the evaluator candidates recommended by the Korean Review Board and those recommended by our system were the same. However, our system performed the recommendation in less time, with no bias, and with fewer personnel. The system requires revisions to reflect qualitative indicators, such as journal reputation, before it can entirely replace the current evaluator recommendation process.
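The keyword weighting and candidate scoring described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the smoothed IDF formula, the use of cosine similarity, the averaging of per-achievement scores into one candidate score, and the names `tfidf_vectors`, `cosine` and `rank_candidates` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is an
    assumption here; the abstract only names TF-IDF.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(application, achievements_by_candidate):
    """Score each achievement against the application, then average the
    scores per candidate (the averaging rule is an assumption) and return
    (candidate, score) pairs sorted from most to least similar."""
    owners, corpus = [], [application]
    for name, works in achievements_by_candidate.items():
        for work in works:
            owners.append(name)
            corpus.append(work)
    vectors = tfidf_vectors(corpus)
    app_vec = vectors[0]
    per_candidate = {name: [] for name in achievements_by_candidate}
    for name, vec in zip(owners, vectors[1:]):
        per_candidate[name].append(cosine(app_vec, vec))
    overall = {
        name: (sum(scores) / len(scores) if scores else 0.0)
        for name, scores in per_candidate.items()
    }
    return sorted(overall.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates` with a tokenized application and a dictionary that maps each candidate to a list of tokenized achievements returns the candidates ordered by overall similarity. A production system would also need the qualitative indicators the abstract mentions, which this sketch ignores.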

Large expert-curated database for benchmarking document similarity detection in biomedical literature search

Database ◽

10.1093/database/baz085 ◽

2019 ◽

Vol 2019 ◽

Author(s):

Peter Brown ◽

Aik-Choon Tan ◽

Mohamed A El-Esawi ◽

Thomas Liehr ◽

Oliver Blanck ◽

...

Keyword(s):

Literature Search ◽

Relevant Literature ◽

Biomedical Literature ◽

Medical Subject Headings ◽

Document Similarity ◽

Inverse Document Frequency ◽

Research Fields ◽

Experience Levels ◽

Document Frequency ◽

Systematic Biases

Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
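One simple way to quantify how distinct the collections produced by different methods are is the Jaccard index of their recommended-article sets. The abstract does not say which overlap measure the authors used, so the choice of Jaccard, the function names and the method labels in the usage note are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|.

    Two empty collections are given an overlap of 0.0 here by convention.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pairwise_overlap(recommendations):
    """Map each pair of method names (in sorted order) to the Jaccard
    index of the article lists those two methods recommended."""
    return {
        (first, second): jaccard(recommendations[first], recommendations[second])
        for first, second in combinations(sorted(recommendations), 2)
    }
```

For example, `pairwise_overlap({"bm25": [...], "tfidf": [...], "pmra": [...]})` returns one value per pair of methods. Values near 0 indicate largely distinct recommendations, which is the situation in which a hybrid method would be expected to help.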

Design and implementation of clothing fashion style recommendation system using deep learning

Revista Română de Informatică şi Automatică ◽

10.33436/v31i4y202110 ◽

2021 ◽

Vol 31 (4) ◽

pp. 123-136

Author(s):

Muhammad KHALID ◽

Mao KEMING ◽

Tariq HUSSAIN

Keyword(s):

Deep Learning ◽

Recommendation System ◽

Design And Implementation

Reply Using Past Replies—A Deep Learning-Based E-Mail Client

Electronics ◽

10.3390/electronics9091353 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1353

Author(s):

Yiwei Feng ◽

M. Asif Naeem ◽

Farhaan Mirza ◽

Ali Tahir

Keyword(s):

Deep Learning ◽

Hybrid Model ◽

Learning Algorithm ◽

Inverse Document Frequency ◽

Help Desk ◽

Deep Learning Algorithm ◽

Document Frequency ◽

Text Prediction ◽

E Mail ◽

Gated Recurrent Unit

Email is the most common and effective means of communication for most enterprises and individuals. In the corporate sector, the volume of email received daily is significant, and a timely reply to each email is important. This generates a huge amount of work for the organisation, in particular for help-desk staff. In this paper we present a novel Smart E-mail Management System (SEMS) for handling the issue of e-mail overload. In previous research, the Term Frequency-Inverse Document Frequency (TF-IDF) model was used to design a Smart Email Client. Since TF-IDF does not consider semantic relations between words, the replies suggested by that model are not very accurate. In this paper we apply Document to Vector (Doc2Vec) and introduce a novel Gated Recurrent Unit Sentence to Vector (GRU-Sent2Vec) model, a hybrid that combines GRU and Sent2Vec. Both models are more intelligent than TF-IDF. We compare the results of both models with those of TF-IDF. The Doc2Vec model performs best at predicting a response to a similar new incoming email. In our case, the dataset is too small to warrant a deep learning model, so the GRU-Sent2Vec hybrid model cannot produce ideal results, although in our understanding it is a robust method for long-text prediction.

Deep Intelligent Prediction Network: A Novel Deep Learning Based Prediction Model on Spatiotemporal Characteristics and Location Based Services for Big Data Driven Intelligent Transportation System

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1503.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 845-849

Keyword(s):

Big Data ◽

Deep Learning ◽

Recommendation System ◽

Intelligent Transportation System ◽

Intelligent Transportation ◽

Transportation System ◽

Transportation Systems ◽

Location Based Services ◽

Traffic Data ◽

Spatiotemporal Characteristics

The concept of big data for intelligent transportation systems has been employed in traffic management to deal with dynamic traffic environments. Big data analytics helps to cope with the large amount of storage and computing resources required to use mass traffic data effectively. Although these traditional solutions bring unprecedented opportunities to manage transportation data, they are inefficient for building next-generation intelligent transportation systems, as traffic data are exploding in velocity and volume across various characteristics. In this article, a new deep intelligent prediction network is introduced that is hierarchical and operates on spatiotemporal characteristics and location-based services, utilizing the sensor and GPS data of the vehicle in real time. The proposed model employs a deep learning architecture to predict potential road clusters for passengers. It is delivered to passengers as a recommendation system through mobile apps and hardware equipment on the vehicle, incorporating location-based service models to find available parking slots, traffic-free roads, the shortest path to the destination, and other services along the specified path. The underlying traffic data are classified into clusters by extracting a set of features from them. The deep behavioural network processes the traffic data in terms of spatiotemporal characteristics to generate information for traffic forecasting, vehicle detection, autonomous driving, and driving behaviours. In addition, a Markov model is embedded to discover hidden features. The experimental results demonstrate that the proposed approach achieves better results than state-of-the-art approaches on the performance measures of precision, execution time, feasibility, and efficiency.

Pemodelan Topik dengan LDA untuk Temu Kembali Informasi dalam Rekomendasi Tugas Akhir (Topic Modeling with LDA for Information Retrieval in Final Project Recommendation)

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3049 ◽

2021 ◽

Vol 5 (3) ◽

pp. 421-428

Author(s):

Diana Purwitasari ◽

Aida Muflichah ◽

Novrindah Alvi Hasanah ◽

Agus Zainal Arifin

Keyword(s):

Learning Outcomes ◽

Recommendation System ◽

Latent Dirichlet Allocation ◽

Cluster Number ◽

Inverse Document Frequency ◽

Final Project ◽

Document Frequency ◽

Student Graduation ◽

Undergraduate Thesis ◽

Optimal Cluster

The undergraduate thesis, the final project known in Indonesian as Tugas Akhir, is a prerequisite for graduation for every undergraduate student, and successfully finishing the project is one of the learning outcomes. Determining a final project topic that matches the student's ability is important. One strategy for deciding on a topic is to read the literature, but this takes considerable time. There is a need for a recommendation system that helps students determine a topic according to their abilities or subject understanding, based on their academic transcripts. This study focused on a system for final project topic recommendations based on evaluating competencies in the academic transcripts of previously graduated students. The collected data from previous final projects, namely titles and abstracts, were weighted by term occurrences using TF-IDF (term frequency–inverse document frequency) and grouped using K-Means clustering. From each cluster, we prepared candidate topics for recommendation using Latent Dirichlet Allocation (LDA) with Gibbs sampling, focusing on the word distribution of each topic in the cluster. Evaluations were performed to determine the optimal number of clusters and number of topics, followed by a more thorough exploration of the recommendation results. Our experiments showed that the proposed system could recommend final project topic ideas based on student competence as represented in their academic transcripts.
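The topic extraction step can be sketched as a collapsed Gibbs sampler for LDA applied to the tokenized documents of one cluster. This is a generic textbook-style sampler, not the authors' code: the hyperparameters `alpha` and `beta`, the iteration count, and the names `lda_gibbs` and `top_words` are assumptions, and the preceding K-Means step is omitted.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of token lists.

    Returns (vocab, topic_word, doc_topic), where topic_word[k][v] counts
    how often vocabulary item v is assigned to topic k and doc_topic[d][k]
    counts the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                wid = word_id[word]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample from the conditional distribution over topics.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    return vocab, topic_word, doc_topic


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]
```

The most frequent words of each topic, as returned by `top_words`, would play the role of candidate topic descriptions. The output depends on the random seed, and real use would need many more documents and a check of the number of topics, as the abstract notes.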

Network Course Recommendation System Based on Double-Layer Attention Mechanism

Scientific Programming ◽

10.1155/2021/7613511 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Qianyao Zhu

Keyword(s):

Double Layer ◽

Recommendation System ◽

Attention Mechanism ◽

Inverse Document Frequency ◽

Network Teaching ◽

Teaching Platform ◽

Document Frequency ◽

Different Types ◽

Recommendation Accuracy ◽

Selection Of

In view of the lack of accurate course recommendation and selection on network teaching platforms in the new form of higher education, a network course recommendation system based on the double-layer attention mechanism is proposed. First, the collected data are preprocessed, and the student and course data are normalized and classified. Then, the dual attention mechanism is introduced into the parallel neural network recommendation model to improve the model’s ability to mine important features. The TF-IDF (term frequency-inverse document frequency) weighting is improved based on the student score and course category. The recommendation results are classified according to the weight of course categories, so as to construct different types of course groups and complete the recommendation. The experimental results show that the proposed algorithm can effectively improve the model recommendation accuracy compared with other algorithms.

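Sistem Perekomendasi Dosen Pembimbing berdasarkan Relevansi Topik Tugas Akhir menggunakan Metode Okapi BM25 (Supervisor Recommendation System Based on Final Project Topic Relevance Using the Okapi BM25 Method)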

Repositor ◽

10.22219/repositor.v2i9.672 ◽

2020 ◽

Vol 2 (9) ◽

Author(s):

Meilina Agustina ◽

Yufiz Azhar ◽

Nur Hayatin

Keyword(s):

Recommendation System ◽

School Teacher ◽

Primary School Teacher ◽

Education Department ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Teacher Education Department ◽

Primary School Teacher Education ◽

The University

A recommendation system is software that provides recommendations to users about products they can use. Administration in the office of the Primary School Teacher Education department at the University of Muhammadiyah Malang is a problem constantly faced by the administrative staff and part-timers. The manual system currently in use is considered ineffective in terms of time, place, and effort, so assistance in the form of an information system is needed. The design of this information system uses the Okapi BM25 method, a ranking function used by search engines to rank matching documents according to their relevance to a search query, which here is a final project topic. BM25 satisfies the three principles of good weighting: it has inverse document frequency (idf), term frequency (tf), and document length normalization.
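The three components named above (idf, tf and document length normalization) can be seen together in a minimal sketch of the standard Okapi BM25 scoring function. The parameter values `k1=1.5` and `b=0.75` are common defaults rather than values from the paper, the IDF uses the non-negative variant with `+ 1` inside the logarithm, and the function name `bm25_scores` is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every tokenized document against a tokenized query with BM25.

    For each query term the contribution is
        idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len)).
    """
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n
    df = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Inverse document frequency: rarer terms weigh more.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer documents are penalized.
            norm = 1.0 - b + b * len(doc) / avg_len
            # Saturating term frequency.
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores
```

In a supervisor recommendation setting, the query would be the tokenized final project topic and each document a text associated with a lecturer; that mapping is an assumption, since the abstract does not describe the indexed data in detail.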

Analysis of Unsatisfying User Experiences and Unmet Psychological Needs for Virtual Reality Exergames Using Deep Learning Approach

Information ◽

10.3390/info12110486 ◽

2021 ◽

Vol 12 (11) ◽

pp. 486

Author(s):

Xiaoyan Zhang ◽

Qiang Yan ◽

Simin Zhou ◽

Linye Ma ◽

Siran Wang

Keyword(s):

Virtual Reality ◽

Deep Learning ◽

Experimental Studies ◽

Online Reviews ◽

Psychological Needs ◽

Gradient Boosting ◽

Inverse Document Frequency ◽

Document Frequency ◽

Extreme Gradient Boosting ◽

Speed Up

The number of consumers playing virtual reality games is booming. To speed up product iteration, the user experience team needs to collect and analyze unsatisfying experiences in time. In this paper, we aim to detect the unsatisfying experiences hidden in online reviews of virtual reality exergames using a deep learning method and to identify the unmet psychological needs of users based on self-determination theory. Convolutional neural networks for sentence classification (textCNN) are used in this study to classify online reviews with unsatisfying experiences. For comparison, we set eXtreme gradient boosting (XGBoost) with lexical features as the machine learning baseline. Term frequency-inverse document frequency (TF-IDF) is used to extract keywords from every set of classified reviews. The micro-F1 score of the textCNN classifier is 90.00, better than the 82.69 of XGBoost. The top 10 keywords of every set of reviews reflect relevant topics of unmet psychological needs. This paper explores the potential problems causing unsatisfying experiences and unmet psychological needs in virtual reality exergames through text mining and supplements experimental studies of virtual reality exergames.
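For readers unfamiliar with the reported metric, the following sketch shows how a micro-averaged F1 score is computed from true and predicted labels. It pools true positives, false positives and false negatives over all classes before computing F1. The function name and the label values in the usage note are hypothetical, and the result is a fraction in [0, 1], whereas the abstract reports scores on what appears to be a 0 to 100 scale.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all labels, then
    compute F1 = 2*TP / (2*TP + FP + FN)."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

For example, `micro_f1(["a", "b", "a", "c"], ["a", "a", "a", "c"])` evaluates to 0.75. When every review receives exactly one label, micro-F1 coincides with plain accuracy.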

Sentiment Analysis Based on Deep Learning: A Comparative Study

Electronics ◽

10.3390/electronics9030483 ◽

2020 ◽

Vol 9 (3) ◽

pp. 483 ◽

Cited By ~ 12

Author(s):

Nhan Cach Dang ◽

María N. Moreno-García ◽

Fernando De la Prieta

Keyword(s):

Deep Learning ◽

Comparative Study ◽

Sentiment Analysis ◽

Language Processing ◽

Learning Models ◽

Inverse Document Frequency ◽

Document Frequency ◽

Wide Range ◽

Powerful Means ◽

Promising Solution

The study of public opinion can provide us with valuable information. The analysis of sentiment on social networks, such as Twitter or Facebook, has become a powerful means of learning about users’ opinions and has a wide range of applications. However, the efficiency and accuracy of sentiment analysis are being hindered by the challenges encountered in natural language processing (NLP). In recent years, it has been demonstrated that deep learning models are a promising solution to the challenges of NLP. This paper reviews the latest studies that have employed deep learning to solve sentiment analysis problems, such as sentiment polarity. Models using term frequency-inverse document frequency (TF-IDF) and word embedding have been applied to a series of datasets. Finally, a comparative study has been conducted on the experimental results obtained for the different models and input features.

Research on the Changing Trend of Employment-Relevant Terms Based on Internet Big Data Analysis

E3S Web of Conferences ◽

10.1051/e3sconf/202125101050 ◽

2021 ◽

Vol 251 ◽

pp. 01050

Author(s):

Yang Wei

Keyword(s):

Big Data ◽

Data Analysis ◽

Research Result ◽

Big Data Analysis ◽

Inverse Document Frequency ◽

Teaching Administration ◽

Employment Skills ◽

Document Frequency ◽

Changing Trend ◽

Big Data Technology

With publicly available data collected from mainstream information platforms, this study used the term frequency–inverse document frequency (TF-IDF) algorithm to detect 74 popular terms and phrases about employment, analyzed the changes in the ranking of these terms and phrases, and visualized the changing trend in the attention to employment skills from 2017 to 2019. The research result will facilitate the application of big data technology to teaching administration in colleges and provide a guide for college students to plan their study of vocational skills.
