Issue-Based Clustering of Scholarly Articles

Rey-Long Liu; Chih-Kai Hsu

doi:10.3390/app8122591

Applied Sciences ◽

10.3390/app8122591 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2591 ◽

Cited By ~ 1

Author(s):

Rey-Long Liu ◽

Chih-Kai Hsu

Keyword(s):

Scientific Literature ◽

State Of The Art ◽

Similarity Measures ◽

Practical Significance ◽

Research Issues ◽

Domain Experts ◽

Overlapping Clustering ◽

Scholarly Article ◽

Multiple Issues ◽

Multiple Clusters

A scholarly article often discusses multiple research issues. The clustering of scholarly articles based on research issues can facilitate analyses of related articles on specific issues in scientific literature. It is a task of overlapping clustering, as an article may discuss multiple issues, and hence, be clustered into multiple clusters. Clustering is challenging, as it is difficult to identify the research issues with which to cluster the articles. In this paper, we propose the use of the titles of the references cited by the articles to tackle the challenge, based on the hypothesis that such information may indicate the research issues discussed in the article. A technique referred to as ICRT (Issue-based Clustering with Reference Titles) was thus developed. ICRT works as a post-processor for various clustering systems. In experiments on those articles that domain experts have selected to annotate research issues about specific entity associations, ICRT works with various clustering systems that employ state-of-the-art similarity measures for scholarly articles. ICRT successfully improves these systems by identifying clusters of articles with the same research focuses on specific entity associations. The contribution is of technical and practical significance to the exploration of research issues reported in scientific literature (supporting the curation of entity associations found in the literature).
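
The overlapping-clustering idea in this abstract can be illustrated with a minimal sketch. This is not the ICRT algorithm, whose details the abstract does not give. It assumes, for illustration only, that each cluster is summarized by a set of terms and that an article joins every cluster whose terms overlap enough with the terms in its reference titles. The function names, the Jaccard overlap, and the 0.2 threshold are all hypothetical choices.

```python
import re


def title_terms(titles):
    """Collect the lower-cased word set from a list of reference titles."""
    terms = set()
    for title in titles:
        terms.update(re.findall(r"[a-z0-9]+", title.lower()))
    return terms


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_overlapping(reference_titles, cluster_terms, threshold=0.2):
    """Return every cluster similar enough to the article's reference-title terms.

    An article may land in several clusters, which is what makes the
    clustering overlapping. The threshold is an assumed, untuned value.
    """
    terms = title_terms(reference_titles)
    return sorted(
        name
        for name, cterms in cluster_terms.items()
        if jaccard(terms, cterms) >= threshold
    )


# Hypothetical clusters and reference titles, for illustration only.
clusters = {
    "gene-disease": {"gene", "disease", "association"},
    "drug-target": {"drug", "target", "binding"},
}
refs = ["Gene disease association mining", "Drug target binding prediction"]
print(assign_overlapping(refs, clusters))  # ['drug-target', 'gene-disease']
```

An article whose references touch two issues is assigned to both clusters, while one citing only gene-disease work would be assigned to a single cluster.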

Knowledge Transfer for Entity Resolution with Siamese Neural Networks

Journal of Data and Information Quality ◽

10.1145/3410157 ◽

2021 ◽

Vol 13 (1) ◽

pp. 1-25

Author(s):

Michael Loster ◽

Ioannis Koumarelas ◽

Felix Naumann

Keyword(s):

Knowledge Transfer ◽

Similarity Measure ◽

State Of The Art ◽

Similarity Measures ◽

Engineering Process ◽

Domain Experts ◽

Multiple Datasets ◽

Multiple Data ◽

Domain Expertise ◽

F Measure

The integration of multiple data sources is a common problem in a large variety of applications. Traditionally, handcrafted similarity measures are used to discover, merge, and integrate multiple representations of the same entity (duplicates) into a large homogeneous collection of data. Often, these similarity measures do not cope well with the heterogeneity of the underlying dataset. In addition, domain experts are needed to manually design and configure such measures, which is both time-consuming and requires extensive domain expertise. We propose a deep Siamese neural network capable of learning a similarity measure that is tailored to the characteristics of a particular dataset. With the properties of deep learning methods, we are able to eliminate the manual feature engineering process and thus considerably reduce the effort required for model construction. In addition, we show that it is possible to transfer knowledge acquired during the deduplication of one dataset to another, and thus significantly reduce the amount of data required to train a similarity measure. We evaluate our method on multiple datasets and compare our approach to state-of-the-art deduplication methods. Our approach outperforms competitors by up to +26 percent F-measure, depending on task and dataset. In addition, we show that knowledge transfer is not only feasible, but in our experiments led to an improvement in F-measure of up to +4.7 percent.

Learning similarity measures from data

Progress in Artificial Intelligence ◽

10.1007/s13748-019-00201-2 ◽

2019 ◽

Vol 9 (2) ◽

pp. 129-143 ◽

Cited By ~ 4

Author(s):

Bjørn Magnus Mathisen ◽

Agnar Aamodt ◽

Kerstin Bach ◽

Helge Langseth

Keyword(s):

Machine Learning ◽

Similarity Measure ◽

State Of The Art ◽

Similarity Measures ◽

Learning System ◽

Case Based Reasoning ◽

Training Time ◽

Domain Experts ◽

Trained Classifier ◽

Clustering Data

Defining similarity measures is a requirement for some machine learning methods. One such method is case-based reasoning (CBR), where the similarity measure is used to retrieve the stored case or set of cases most similar to the query case. Describing a similarity measure analytically is challenging, even for domain experts working with CBR experts. However, datasets are typically gathered as part of constructing a CBR or machine learning system. These datasets are assumed to contain the features that correctly identify the solution from the problem features; thus, they may also contain the knowledge to construct or learn such a similarity measure. The main motivation for this work is to automate the construction of similarity measures using machine learning. Additionally, we would like to do this while keeping training time as low as possible. Working toward this, our objective is to investigate how to apply machine learning to effectively learn a similarity measure. Such a learned similarity measure could be used for CBR systems, but also for clustering data in semi-supervised learning, or one-shot learning tasks. Recent work toward this goal relies on either very long training times or manually modeling parts of the similarity measure. We created a framework to help us analyze the current methods for learning similarity measures. This analysis resulted in two novel similarity measure designs: the first design uses a pre-trained classifier as the basis for a similarity measure, and the second design uses as little modeling as possible while learning the similarity measure from data and keeping training time low. Both similarity measures were evaluated on 14 different datasets. The evaluation shows that using a classifier as the basis for a similarity measure gives state-of-the-art performance. Finally, the evaluation shows that our fully data-driven similarity measure design outperforms state-of-the-art methods while keeping training time low.
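
One way to picture a similarity measure built on a pre-trained classifier is sketched below. The abstract does not say how the authors derive similarity from the classifier, so this sketch assumes one simple option: two cases are similar when the classifier assigns them similar class-probability vectors. The toy classifier and all names here are hypothetical stand-ins.

```python
def classifier_similarity(predict_proba, x, y):
    """Similarity in [0, 1] from the classifier's class-probability outputs.

    Assumed definition: one minus the total variation distance between the
    two probability vectors. This is an illustrative choice, not the
    paper's design.
    """
    p, q = predict_proba(x), predict_proba(y)
    return 1.0 - 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def toy_predict_proba(case):
    """Hypothetical stand-in for a trained two-class classifier.

    It clamps the first feature to [0, 1] and treats it as the
    probability of the second class.
    """
    score = min(max(case[0], 0.0), 1.0)
    return [1.0 - score, score]


print(classifier_similarity(toy_predict_proba, [0.25], [0.25]))  # 1.0
print(classifier_similarity(toy_predict_proba, [0.25], [0.75]))  # 0.5
```

In a CBR setting, such a function could rank stored cases against a query case, with `toy_predict_proba` replaced by the probability output of a real trained model.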

Improving Bibliographic Coupling with Category-Based Cocitation

Applied Sciences ◽

10.3390/app9235176 ◽

2019 ◽

Vol 9 (23) ◽

pp. 5176

Author(s):

Liu ◽

Hsu

Keyword(s):

Search Engine ◽

Similarity Measure ◽

Scientific Literature ◽

State Of The Art ◽

Bibliographic Coupling ◽

Research Issues ◽

Better Than

Bibliographic coupling (BC) is a similarity measure for scientific articles. It works based on an expectation that two articles that cite a similar set of references may focus on related (or even the same) research issues. For analysis and mapping of scientific literature, BC is an essential measure, and it can also be integrated with different kinds of measures. Further improvement of BC is thus of both practical and technical significance. In this paper, we propose a novel measure that improves BC by tackling its main weakness: two related articles may still cite different references. Category-based cocitation (category-based CC) is proposed to estimate how these different references are related to each other, based on the assumption that two different references may be related if they are cited by articles in the same categories about specific topics. The proposed measure is thus named BCCCC (Bibliographic Coupling with Category-based Cocitation). Performance of BCCCC is evaluated by experimentation and case study. The results show that BCCCC performs significantly better than state-of-the-art variants of BC in identifying highly related articles, which report conclusive results on the same specific topics. An experiment also shows that BCCCC provides helpful information to further improve a biomedical search engine. BCCCC is thus an enhanced version of BC, which is a fundamental measure for retrieval and analysis of scientific literature.
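
The basic bibliographic coupling measure described above can be sketched in a few lines. This covers only plain BC, not the proposed BCCCC. The Jaccard normalization is one common way to scale the raw count and is an assumption here, as are the function names and the placeholder reference identifiers.

```python
def bibliographic_coupling(refs_a, refs_b):
    """Raw coupling strength: number of references cited by both articles."""
    return len(set(refs_a) & set(refs_b))


def normalized_coupling(refs_a, refs_b):
    """Shared references divided by all distinct references (Jaccard).

    The normalization is an assumed choice; returns 0.0 when neither
    article cites anything.
    """
    a, b = set(refs_a), set(refs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Placeholder reference identifiers, for illustration only.
article_x = ["ref1", "ref2", "ref3"]
article_y = ["ref2", "ref3", "ref4"]
print(bibliographic_coupling(article_x, article_y))  # 2
print(normalized_coupling(article_x, article_y))     # 0.5
```

The weakness the abstract mentions is visible here: two related articles that cite entirely different references score 0, however closely those references are related to each other.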

PLANEJAMENTO FAMILIAR COMO ASSUNTO DE MULHER!? PERFIL DE GÊNERO NA PRODUÇÃO CIENTÍFICA NO BRASIL

Revista Interdisciplinar de Estudos em Saúde ◽

10.33362/ries.v8i1.1511 ◽

2019 ◽

Vol 8 (1) ◽

pp. 221-235 ◽

Cited By ~ 1

Author(s):

Daniella De Paula Chiesa ◽

Mário Antônio Sanches ◽

Daiane Priscila Simão-Silva

Keyword(s):

Family Planning ◽

Scientific Literature ◽

State Of The Art ◽

Health Sciences ◽

Scientific Production ◽

The State ◽

Review Of The Literature ◽

The Subject

The study of family planning, in the context of bioethics, opens onto diverse perspectives, among them the recognition of its different actors. In this context, the article aims to identify the gender profile of scientific production on family planning in Brazil between 2000 and 2014, as well as the authors' areas of training and specialization. Methodologies were used that allowed the state of the art of the subject to be mapped through a review of the literature. The results show that the scientific production on family planning in Brazil has a predominantly female profile (71.76%). In 42 (57.53%) of the 73 articles analyzed, the focus of the topic is on women, and publications on the topic are most concentrated in the health sciences. This aspect of the research opens onto a complex reality in which one critically seeks the reasons why research on family planning emphasizes women and is a topic of relevance in the health sciences. Keywords: Scientific Production, Family Planning, Gender.

A Multi-Timescale Bilinear Model for Optimization and Control of HVAC Systems with Consistency

Energies ◽

10.3390/en14020400 ◽

2021 ◽

Vol 14 (2) ◽

pp. 400 ◽

Cited By ~ 1

Author(s):

Zelin Nie ◽

Feng Gao ◽

Chao-Bo Yan

Keyword(s):

Optimization Model ◽

Air Conditioning ◽

State Of The Art ◽

Hvac Systems ◽

Practical Significance ◽

Hvac System ◽

Bilinear Model ◽

Optimization And Control ◽

And Control ◽

Slow Timescale

Reducing the energy consumption of heating, ventilation, and air conditioning (HVAC) systems while ensuring users’ comfort is of both academic and practical significance. However, in state-of-the-art optimization models of the HVAC system, either the thermal dynamic model is simplified as a linear model, or the optimization model is single-timescale, which leads to a heavy computational burden. To balance practicality against computational overhead, this paper proposes a multi-timescale bilinear model of HVAC systems. To guarantee the consistency of the models at different timescales, the fast timescale model is built first with a bilinear form, and the slow timescale model is then induced from the fast one, specifically with a bilinear-like form. After a simplified replacement is made for the bilinear-like part, the problem can be solved by a convexification method. Extensive numerical experiments have been conducted to validate the effectiveness of this model.

Treatment Following Progression in Metastatic Melanoma: the State of the Art from Scientific Literature to Clinical Need

Current Oncology Reports ◽

10.1007/s11912-021-01065-3 ◽

2021 ◽

Vol 23 (7) ◽

Author(s):

F. Serra ◽

S. Barruscotti ◽

T. Dominioni ◽

A. Zuccarini ◽

P. Pedrazzoli ◽

...

Keyword(s):

Metastatic Melanoma ◽

Scientific Literature ◽

State Of The Art ◽

The State

Computational intelligence in product design engineering: review and trends

10.32920/ryerson.14669190.v1 ◽

2021 ◽

Author(s):

Filippo A. Salustri

Keyword(s):

Decision Making ◽

Product Design ◽

Engineering Design ◽

Computational Intelligence ◽

Design Automation ◽

State Of The Art ◽

Design Engineering ◽

Broad Perspective ◽

Research Issues ◽

Product Engineering

Product design engineering is undergoing a transformation from an informal and largely experience-based discipline to a science-based domain. Computational intelligence offers models and algorithms that can contribute greatly to design formalization and automation. This paper surveys computational intelligence concepts and approaches applicable to product design engineering. A taxonomy of the surveyed literature is presented according to the generally recognized areas in both product design engineering and computational intelligence. Some research issues that arise from the broad perspective presented in the paper are signaled but not fully pursued. No survey of such a broad field can be complete; however, the material presented in the paper summarizes state-of-the-art computational intelligence concepts and approaches in product design engineering. Keywords: computational intelligence, engineering design, product engineering, decision making, design automation

Theoretical insights into expression of leadership competencies in the process of management

Problems and Perspectives in Management ◽

10.21511/ppm.15(1-1).2017.09 ◽

2017 ◽

Vol 15 (1) ◽

pp. 220-226 ◽

Cited By ~ 1

Author(s):

Regina Andriukaitienė ◽

Valentyna Voronkova ◽

Olga Kyvliuk ◽

Marina Maksimenyuk ◽

Aita Sakun

Keyword(s):

Organizational Development ◽

Scientific Literature ◽

Leadership Competencies ◽

Practical Significance ◽

Literature Analysis ◽

The Core ◽

Career Opportunities ◽

Analysis And Synthesis ◽

Research Findings ◽

Leadership Research

The relevance of the topic lies in the idea that appropriate leadership competencies, applied in activities that enable followers, can ensure the prospects of organizational development and individual career opportunities. The aim is to review and summarize research findings of leadership science on the expression of competencies in managerial processes, highlighting leadership competencies in the context of general competencies. Methods. To formulate analytical findings describing the concept of leadership and generalizing the stages of development of theories and the expression and impact of leadership competencies, the methods of scientific literature analysis and synthesis, as well as simulation, were used. Results. Drawing on scientists’ insights, the article analyzes the concept of leadership and gives an overview of leadership research by development stage. Scientific novelty. The theme is novel because, although the importance of leadership has recently been discussed more and more, it remains important to analyze the core leadership competencies that would predetermine both the outcomes of decisions in organizations’ managerial processes and positive changes in individual careers as people integrate into the activities of organizations. Practical significance. The need for leadership competencies is related to the issues of good leadership in organizations. To implement ideas of modern leadership in organizations, the leader must have certain characteristics of leadership expression, such as the ability to communicate effectively, respond to the needs of others, and influence the behavior of followers, directing them toward achieving the set goals and implementing the leader’s vision.

Performance Analysis of Pose Invariant Face Recognition Approaches in Unconstrained Environments

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2015010104 ◽

2015 ◽

Vol 5 (1) ◽

pp. 66-81 ◽

Cited By ~ 1

Author(s):

M. Parisa Beham ◽

S. M. Mansoor Roomi ◽

J. Alageshan ◽

V. Kapileshwaran

Keyword(s):

Face Recognition ◽

Nearest Neighbor ◽

State Of The Art ◽

Recognition Algorithm ◽

K Nearest Neighbor ◽

Research Issues ◽

Histograms Of Oriented Gradients ◽

Recognition Phase ◽

The Face ◽

Pose Variation

Face recognition and authentication are two significant and dynamic research issues in computer vision applications. Many factors must be accounted for in face recognition; among them, pose variation is a major challenge that severely affects recognition performance. To improve performance, several methods have been developed to perform face recognition under pose-invariant conditions in constrained and unconstrained environments. In this paper, the authors analyze the performance of popular texture descriptors, viz. Local Binary Pattern, Local Derivative Pattern, and Histograms of Oriented Gradients, for the pose-invariance problem. State-of-the-art preprocessing techniques such as Discrete Cosine Transform, Difference of Gaussian, Multi Scale Retinex, and Gradient face are also applied before feature extraction. In the recognition phase, a K-nearest neighbor classifier is used to accomplish the classification task. To evaluate the efficiency of the pose-invariant face recognition algorithm, three publicly available datasets, viz. UMIST, ORL, and LFW, are used. These datasets have very wide pose variations, and the results show that the state-of-the-art method is efficient only in constrained situations.
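
Of the texture descriptors named above, the Local Binary Pattern is simple enough to sketch directly. The example computes the basic 8-neighbor LBP code for the center pixel of a 3x3 patch. The neighbor ordering (clockwise from the top-left, first neighbor as the most significant bit) and the `>=` comparison are conventions assumed here; the abstract does not state which variant the authors used.

```python
# Neighbor offsets (row, col), clockwise starting at the top-left pixel.
# This ordering is an assumed convention; other orderings give other codes.
NEIGHBOR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, 1), (1, 1), (1, 0),
    (1, -1), (0, -1),
]


def lbp_code(patch):
    """Basic LBP code (0..255) for the center pixel of a 3x3 patch.

    Each neighbor contributes a 1 bit if it is at least as bright as the
    center pixel, otherwise a 0 bit.
    """
    center = patch[1][1]
    code = 0
    for d_row, d_col in NEIGHBOR_OFFSETS:
        bit = 1 if patch[1 + d_row][1 + d_col] >= center else 0
        code = (code << 1) | bit
    return code


# Illustrative grey-level patch, not taken from any dataset.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch))  # 143, i.e. binary 10001111
```

In a full pipeline, codes like this would be computed for every pixel and collected into histograms, which then serve as the feature vectors passed to the K-nearest neighbor classifier.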
