Identifying High-Level Concept Clones in Software Programs Using Method’s Descriptive Documentation

Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 447
Author(s):  
Aditi Gupta ◽  
Rinkaj Goyal

Software clones are code fragments with similar or nearly similar functionality or structure. These clones are introduced into a project either accidentally or deliberately during the software development or maintenance process. The presence of clones poses a significant threat to the maintenance of software systems and sits at the top of the list of code smell types. Clones can be simple (fine-grained) or high-level (coarse-grained), depending on the granularity of code chosen for clone detection. Simple clones are generally viewed at the line/statement level, whereas high-level clones have the granularity of a block, method, class, or file. High-level clones are said to be composed of multiple simple clones. This study aims to detect high-level conceptual code clones (with Java methods as the granularity) in Java-based projects, and the approach is extensible to projects developed in other languages as well. Conceptual code clones are those implementing a similar higher-level abstraction, such as an Abstract Data Type (ADT) list. Based on the assumption that “similar documentation implies similar methods”, the proposed mechanism uses the documentation associated with methods to identify method-level concept clones. Since not all of the documentation contributes to a method’s semantics, we extracted only the description part of each method’s documentation, which yielded two benefits: increased efficiency and a reduced text corpus size. We then used Latent Semantic Indexing (LSI) with different combinations of weighting and similarity measures to identify similar descriptions in the text corpus. To show the efficacy of the proposed approach, we validated it on three open-source Java systems of sufficient size. The findings suggest that the proposed mechanism can detect methods implementing similar high-level concepts with improved recall.
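The “similar documentation implies similar methods” step can be sketched with plain NumPy: build a term-document matrix over method descriptions, reduce it with a truncated SVD (the core of LSI), and compare documents by cosine similarity in the latent space. This is a minimal illustration using raw term counts; the study additionally varies weighting schemes and similarity measures, which this sketch omits, and the example descriptions are made up.

```python
import numpy as np

def lsi_similarity(docs, k=2):
    """LSI sketch: term-count matrix -> truncated SVD -> cosine similarity."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    # Keep only the k largest singular values/vectors (the "latent concepts").
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    D = (np.diag(s[:k]) @ Vt[:k]).T                  # documents in latent space
    D = D / np.linalg.norm(D, axis=1, keepdims=True)  # unit-normalize rows
    return D @ D.T                                    # pairwise cosine similarity

# Hypothetical method descriptions: two clones, one unrelated method.
docs = [
    "insert an element into the sorted list",
    "insert element into sorted list structure",
    "render the main window and draw widgets",
]
sim = lsi_similarity(docs, k=2)
```

Descriptions of the two list-insertion methods end up close in the latent space, while the rendering method does not, which is exactly the signal used to flag concept clones.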

Author(s):  
Weichun Liu ◽  
Xiaoan Tang ◽  
Chenglin Zhao

Recently, deep trackers based on Siamese networks have been enjoying increasing popularity in the tracking community. Generally, these trackers learn a high-level semantic embedding space for feature representation but lose low-level fine-grained details. Moreover, the learned high-level semantic features are not updated during online tracking, which results in tracking drift in the presence of target appearance variation and similar distractors. In this paper, we present a novel end-to-end trainable Convolutional Neural Network (CNN) based on the Siamese network for distractor-aware tracking. It enhances target appearance representation in both the offline training stage and the online tracking stage. In the offline training stage, the network learns both low-level fine-grained details and high-level coarse-grained semantics simultaneously in a multi-task learning framework. The low-level features, with better resolution, are complementary to the semantic features and able to distinguish the foreground target from background distractors. In the online stage, the learned low-level features are fed into a correlation filter layer and updated in an interpolated manner to adaptively encode target appearance variation. The learned high-level features are fed into a cross-correlation layer without online updates. The proposed tracker therefore benefits from both the adaptability of the fine-grained correlation filter and the generalization capability of the semantic embedding. Extensive experiments are conducted on the public OTB100 and UAV123 benchmark datasets. Our tracker achieves state-of-the-art performance while running at a real-time frame rate.
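The matching step at the heart of such Siamese trackers, sliding a target template over a search region and taking the peak of the response map as the new target location, can be sketched as a plain cross-correlation. A real tracker applies this to learned CNN feature maps rather than raw pixels; the small arrays below are made-up examples.

```python
import numpy as np

def xcorr_response(search, template):
    """Dense cross-correlation of a template over a search region.
    The argmax of the returned response map is the best-matching offset."""
    th, tw = template.shape
    out = np.empty((search.shape[0] - th + 1, search.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return out

# Toy example: plant the template inside an empty search region at (2, 3).
template = np.arange(1.0, 7.0).reshape(2, 3)
search = np.zeros((8, 8))
search[2:4, 3:6] = template
resp = xcorr_response(search, template)
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

In production this loop is replaced by a batched convolution on feature tensors, but the geometry of the response map is the same.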


1994 ◽  
Vol 6 (3) ◽  
pp. 365-374 ◽  
Author(s):  
Philip T. Leat ◽  
Jane H. Scarrow

From at least the Early Jurassic to the Miocene, eastward subduction of oceanic crust took place beneath the Antarctic Peninsula. Magmatism associated with the subduction generated a N-S linear belt of volcanic rocks known as the Antarctic Peninsula Volcanic Group (APVG), which erosion has now exposed at about the plutonic/volcanic interface. Large central volcanoes from the APVG are described here for the first time. The structures are situated in north-west Palmer Land within the main Mesozoic magmatic arc. One centre, Zonda Towers, is recognized by the presence of a 160 m thick silicic ignimbrite containing accidental lava blocks up to 25 m in diameter. This megabreccia is interpreted as a caldera-fill deposit formed by landsliding of steep caldera walls during ignimbrite eruption and deposition. A larger centre, Mount Edgell-Wright Spires, is dominated by coarse-grained debris flow deposits and silicic ignimbrites which, with minor lavas and fine-grained tuffs, form a volcanic succession some 1.5 km thick. Basic, intermediate, and silicic sills c. 50 m thick intrude the succession. A central gabbro-granite intrusion is interpreted as a high-level magma chamber of the Mount Edgell volcano.


Author(s):  
WAI-TAK WONG ◽  
FRANK Y. SHIH ◽  
TE-FENG SU

In this paper, we present a novel method of using two-level similarity measures for shape-based image retrieval. We first identify the dominant points of a given shape, and then calculate their geometric moments and the distances between consecutive dominant points. A spectrum representing the normalized geometric moments versus normalized distances is generated, and its area and curve length are computed. We use these two values as similarity features for the indexes in coarse-grained shape retrieval. Furthermore, we use the cross-sectional area and curve length distributions for the indexes in fine-grained shape retrieval. Experimental results show that the proposed method is simple and efficient and achieves an accuracy rate of 95%.
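The coarse-then-fine pipeline can be sketched as follows: a cheap two-number index (area and curve length of the spectrum) prunes the database, and only the survivors are ranked with a more expensive comparison. The spectra here are illustrative stand-ins for the paper's normalized-moment-versus-distance curves, and the fine-stage distance is a simplified pointwise comparison rather than the cross-sectional distributions the paper uses.

```python
import numpy as np

def coarse_features(spectrum):
    """Coarse index: area under the spectrum and its curve length."""
    x, y = spectrum[:, 0], spectrum[:, 1]
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))   # trapezoidal area
    length = np.sum(np.hypot(np.diff(x), np.diff(y)))    # polyline length
    return np.array([area, length])

def retrieve(query, database, keep=2):
    """Two-level retrieval: coarse filtering, then fine-grained ranking."""
    qc = coarse_features(query)
    coarse_d = [np.linalg.norm(coarse_features(s) - qc) for s in database]
    candidates = np.argsort(coarse_d)[:keep]             # cheap pruning stage
    fine_d = {i: np.linalg.norm(database[i][:, 1] - query[:, 1])
              for i in candidates}                       # expensive stage
    return sorted(fine_d, key=fine_d.get)

# Toy spectra sampled on a shared grid; db[0] is identical to the query.
x = np.linspace(0.0, 1.0, 10)
query = np.stack([x, np.sin(x)], axis=1)
db = [np.stack([x, np.sin(x)], axis=1),
      np.stack([x, np.cos(x)], axis=1),
      np.stack([x, x ** 2], axis=1)]
ranking = retrieve(query, db, keep=2)
```

The design point is that the fine-grained comparison only ever runs on the few candidates the coarse index lets through.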


2012 ◽  
Vol 2012 ◽  
pp. 1-15 ◽  
Author(s):  
Ilia Lebedev ◽  
Christopher Fletcher ◽  
Shaoyi Cheng ◽  
James Martin ◽  
Austin Doupnik ◽  
...  

We present a highly productive approach to hardware design based on a many-core microarchitectural template used to implement compute-bound applications expressed in a high-level data-parallel language such as OpenCL. The template is customized on a per-application basis via a range of high-level parameters such as the interconnect topology or processing element architecture. The key benefits of this approach are that it (i) allows programmers to express parallelism through an API defined in a high-level programming language, (ii) supports coarse-grained multithreading and fine-grained threading while permitting bit-level resource control, and (iii) reduces the effort required to repurpose the system for different algorithms or different applications. We compare template-driven design to both full-custom and programmable approaches by studying implementations of a compute-bound data-parallel Bayesian graph inference algorithm across several candidate platforms. Specifically, we examine a range of template-based implementations on both FPGA and ASIC platforms and compare each against full custom designs. Throughout this study, we use a general-purpose graphics processing unit (GPGPU) implementation as a performance and area baseline. We show that our approach, similar in productivity to programmable approaches such as GPGPU applications, yields implementations with performance approaching that of full-custom designs on both FPGA and ASIC platforms.


Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6204
Author(s):  
Qinghong Liu ◽  
Yong Qin ◽  
Zhengyu Xie ◽  
Zhiwei Cao ◽  
Limin Jia

Trains run in semi-open environments, and the surrounding environment plays an important role in the safety of train operation. Weather is one of the factors that affect the environment surrounding railways. Under haze conditions, railway monitoring and staff vision can be blurred, threatening railway safety. This paper tackles image dehazing for railways. Its contributions to railway video image dehazing are as follows: (1) we propose RID-Net (Railway Image Dehazing Network), an end-to-end, residual-block-based haze removal method consisting of two subnetworks, a fine-grained network and a coarse-grained network, that directly generates a clean image from a hazy input. (2) A combined loss function (per-pixel loss plus perceptual loss) is proposed to capture both low-level and high-level features and thereby generate high-quality restored images. (3) We use full-reference criteria (PSNR and SSIM), object detection, running time, and visual inspection to evaluate the proposed dehazing method. Experimental results on a synthesized railway dataset, a benchmark indoor dataset, and a real-world dataset demonstrate that our method outperforms state-of-the-art methods.
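A combined loss of this kind can be sketched in a few lines: a per-pixel L2 term plus a perceptual term computed in a feature space. Perceptual losses typically use a pretrained network (e.g. VGG) as the feature extractor; here a fixed gradient filter stands in for it, and the weighting `lam` is an assumed hyperparameter, not a value from the paper.

```python
import numpy as np

def per_pixel_loss(pred, target):
    """L2 distance in pixel space (the fidelity term)."""
    return np.mean((pred - target) ** 2)

def perceptual_loss(pred, target, feat):
    """L2 distance in a feature space; `feat` stands in for the
    pretrained-network features a real perceptual loss would use."""
    return np.mean((feat(pred) - feat(target)) ** 2)

def combined_loss(pred, target, feat, lam=0.1):
    """Weighted sum of per-pixel and perceptual terms."""
    return per_pixel_loss(pred, target) + lam * perceptual_loss(pred, target, feat)

# Stand-in "feature extractor": horizontal gradients (edge responses).
edges = lambda img: np.diff(img, axis=1)

img = np.random.rand(4, 5)
clean_loss = combined_loss(img, img, edges)        # identical images
noisy_loss = combined_loss(img + 0.1, img, edges)  # perturbed prediction
```

In training, gradients of both terms flow back into the network, so the model is pushed to match the target both pixel-wise and in feature space.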


2019 ◽  
Vol 19 (5-6) ◽  
pp. 874-890
Author(s):  
MARÍA ALPUENTE ◽  
SANTIAGO ESCOBAR ◽  
JULIA SAPIÑA ◽  
DEMIS BALLIS

Concurrent functional languages that are endowed with symbolic reasoning capabilities such as Maude offer a high-level, elegant, and efficient approach to programming and analyzing complex, highly nondeterministic software systems. Maude’s symbolic capabilities are based on equational unification and narrowing in rewrite theories, and provide Maude with advanced logic programming capabilities such as unification modulo user-definable equational theories and symbolic reachability analysis in rewrite theories. Intricate computing problems may be effectively and naturally solved in Maude thanks to the synergy of these recently developed symbolic capabilities and classical Maude features, such as: (i) rich type structures with sorts (types), subsorts, and overloading; (ii) equational rewriting modulo various combinations of axioms such as associativity, commutativity, and identity; and (iii) classical reachability analysis in rewrite theories. However, the combination of all of these features may hinder the understanding of Maude symbolic computations for non-experienced developers. The purpose of this article is to describe how programming and analysis of Maude rewrite theories can be made easier by providing a sophisticated graphical tool called Narval that supports the fine-grained inspection of Maude symbolic computations.


Author(s):  
Atchara Mahaweerawat ◽  
Peraphon Sophatsathit ◽  
Chidchanok Lursinsap ◽  
Petr Musilek ◽  
...  

To remain competitive in the dynamic world of software development, organizations must optimize the use of their limited resources to deliver quality products on time and within budget. This requires preventing the introduction of faults and quickly discovering and repairing residual faults. In this paper, a new model for predicting and identifying faults in object-oriented software systems is introduced. In particular, faults due to the use of inheritance and polymorphism are considered, as they account for a significant portion of faults in object-oriented systems. The proposed MASP model acts as a fault metric selector that gathers relevant filtering metrics suitable for specific fault types, employing coarse-grained and fine-grained metric selection algorithms. A fault predictor is subsequently established to identify the fault type of each fault classification. It is concluded that the proposed model yields high discrimination accuracy between faulty and fault-free classes.


2020 ◽  
Vol 34 (05) ◽  
pp. 8952-8959
Author(s):  
Yawei Sun ◽  
Lingling Zhang ◽  
Gong Cheng ◽  
Yuzhong Qu

Semantic parsing transforms a natural language question into a formal query over a knowledge base. Many existing methods rely on syntactic parsing, such as dependency parsing. However, the accuracy of producing such expressive formalisms is not satisfactory on long, complex questions. In this paper, we propose a novel skeleton grammar to represent the high-level structure of a complex question. This dedicated coarse-grained formalism, combined with a BERT-based parsing algorithm, helps to improve the accuracy of the downstream fine-grained semantic parsing. In addition, to align the structure of a question with the structure of a knowledge base, our multi-strategy method combines sentence-level and word-level semantics. Our approach shows promising performance on several datasets.


Author(s):  
Wang Zheng-fang ◽  
Z.F. Wang

The main purpose of this study is to evaluate the chloride SCC (stress corrosion cracking) resistance of a duplex stainless steel, 00Cr18Ni5Mo3Si2 (18-5Mo), and its welded coarse-grained zone (CGZ). 18-5Mo is a dual-phase (A + F) stainless steel with a yield strength of 512 N/mm². The secondary phase (A phase) accounts for 30-35% of the total, with fine-grained and homogeneously distributed A and F phases (Fig. 1). After the material is subjected to a specific welding thermal cycle, i.e. Tmax = 1350 °C and t8/5 = 20 s, the microstructure may change from a fine-grained to a coarse-grained morphology, and from a homogeneous distribution of the A phase to a concentration of the A phase (Fig. 2). Meanwhile, the proportion of the A phase is reduced from 35% to 5-10%. For this reason the region is known as the welded coarse-grained zone (CGZ). Because the microstructure of the base metal differs from that of the welded CGZ, their chloride SCC resistance also differs. Test procedure: constant load tensile tests (CLTT) were performed to record the Esce-t curve, by which corrosion crack growth can be described; tf, the time to fracture, can also be recorded, and is taken as an electrochemical and mechanical measure for evaluating SCC resistance. Test environment: a boiling 42% MgCl2 solution at 143 °C was used. In addition, microanalyses were conducted with light microscopy (LM), SEM, TEM, and Auger electron spectroscopy (AES) to reveal the correlation between the CLTT results and the microanalysis data.


Author(s):  
Zhuliang Yao ◽  
Shijie Cao ◽  
Wencong Xiao ◽  
Chen Zhang ◽  
Lanshun Nie

In trained deep neural networks, unstructured pruning can remove redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference on general-purpose hardware by adopting coarse-grained sparsity to prune or regularize consecutive weights for efficient computation, but this often sacrifices model accuracy. In this paper, we propose a novel fine-grained sparsity approach, Balanced Sparsity, to achieve high model accuracy efficiently on commercial hardware. Our approach adapts to the high-parallelism property of GPUs, showing strong potential for sparsity in the wide deployment of deep learning services. Experimental results show that Balanced Sparsity achieves up to 3.1x practical speedup for model inference on GPU, while retaining the same high model accuracy as fine-grained sparsity.
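A balanced fine-grained pruning step of this kind can be sketched as follows: each weight row is split into equal-size blocks, and every block keeps the same number of largest-magnitude weights, so all GPU threads process equally loaded blocks. The block size and number of kept weights below are illustrative, not the paper's settings.

```python
import numpy as np

def balanced_prune(W, block_size=4, keep=2):
    """Balanced-sparsity-style pruning sketch: within each fixed-size block
    of each row, zero all but the `keep` largest-magnitude weights, so every
    block ends up with an identical amount of work."""
    W = W.copy()
    rows, cols = W.shape
    assert cols % block_size == 0, "rows must split into whole blocks"
    for r in range(rows):
        for b in range(0, cols, block_size):
            block = W[r, b:b + block_size]
            # Indices of the (block_size - keep) smallest-magnitude entries.
            drop = np.argsort(np.abs(block))[:block_size - keep]
            block[drop] = 0.0
    return W

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
P = balanced_prune(W, block_size=4, keep=2)
```

Because the per-block nonzero count is uniform, the sparse matrix maps cleanly onto fixed-width GPU thread groups, which is where the practical speedup over irregular unstructured sparsity comes from.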

