An efficient new multi-language clone detection approach from large source code

Author(s): Saif Ur Rehman, Kamran Khan, Simon Fong, Robert Biuk-Aghai
Author(s): Evan Moritz, Mario Linares-Vasquez, Denys Poshyvanyk, Mark Grechanik, Collin McMillan, ...
Author(s): Sara McCaslin, Kent Lawrence

Closed-form solutions, as opposed to numerically integrated solutions, can now be obtained for many problems in engineering. In the area of finite element analysis, researchers have demonstrated the efficiency of closed-form solutions compared to numerical integration for elements such as straight-sided triangular [1] and tetrahedral elements [2, 3]. With higher-order elements, however, the length of the resulting expressions is excessive. When these expressions are implemented in finite element applications as source code, very large source code files can be generated, leading to line-length and line-continuation limit issues with the compiler. This paper discusses a simple algorithm for the reduction of large source code files in which duplicate terms are replaced through the use of an adaptive dictionary. The importance of this algorithm lies in its ability to produce manageable source code files that can improve efficiency in the element generation step of higher-order finite element analysis. The algorithm is applied to Fortran files developed for the implementation of closed-form element stiffness and error estimator expressions for straight-sided tetrahedral finite elements through the fourth order. Reductions in individual source code file size by as much as 83% are demonstrated.
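The paper applies its adaptive-dictionary reduction to generated Fortran files; the abstract gives no implementation details. As a minimal illustration of the general idea (repeated subexpressions replaced by short dictionary names, with definitions emitted first), here is a hypothetical Python sketch; the function name, thresholds, and the choice of parenthesized groups as candidate terms are all assumptions, not the authors' algorithm:

```python
import re
from collections import Counter

def reduce_expression(expr, min_count=2, min_len=8):
    """Replace duplicate terms in a generated expression with short
    dictionary variables (t1, t2, ...), returning the definitions
    plus the reduced expression.  Hypothetical sketch only."""
    # Treat simple parenthesized groups as candidate duplicate terms.
    terms = re.findall(r"\([^()]*\)", expr)
    counts = Counter(terms)
    dictionary = {}
    for term, n in counts.items():
        # Only replace terms that recur and are long enough to pay off.
        if n >= min_count and len(term) >= min_len:
            name = f"t{len(dictionary) + 1}"
            dictionary[term] = name
            expr = expr.replace(term, name)
    defs = [f"{name} = {term}" for term, name in dictionary.items()]
    return defs, expr

defs, reduced = reduce_expression(
    "(x1 - x2 + y3)*(a + b) + (x1 - x2 + y3)*(c + d) + (x1 - x2 + y3)"
)
print(defs)     # ['t1 = (x1 - x2 + y3)']
print(reduced)  # t1*(a + b) + t1*(c + d) + t1
```

Emitting each shared term once and reusing a short name is what shrinks the generated file; a real implementation for closed-form stiffness expressions would also have to respect Fortran declaration syntax and evaluate dictionary entries in dependency order.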


2016, Vol. 2, e49
Author(s): Stefan Wagner, Asim Abdulkhaleq, Ivan Bogicevic, Jan-Peter Ostberg, Jasmin Ramadani

Background. Today, redundancy in source code, so-called "clones" caused by copy & paste, can be found reliably using clone detection tools. Redundancy can also arise independently, however, without copy & paste. At present, it is not clear how only functionally similar clones (FSCs) differ from clones created by copy & paste. Our aim is to understand and categorise the syntactical differences in FSCs that distinguish them from copy & paste clones in a way that helps clone detection research.

Methods. We conducted an experiment using known functionally similar programs in Java and C from coding contests. We analysed syntactic similarity with traditional detection tools and explored whether concolic clone detection can go beyond syntax. We ran all tools on 2,800 programs and manually categorised the differences in a random sample of 70 program pairs.

Results. We found no FSCs where complete files were syntactically similar. We could detect syntactic similarity in a part of the files in fewer than 16% of the program pairs. Concolic detection found one of the FSCs. The differences between program pairs fell into the categories algorithm, data structure, OO design, I/O and libraries. We selected 58 pairs for an openly accessible benchmark representing these categories.

Discussion. The majority of differences between functionally similar clones are beyond the capabilities of current clone detection approaches. Our benchmark can nevertheless help to drive further clone detection research.
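The abstract's core finding is that FSCs share little syntax, which is exactly what token-based (syntactic) detectors measure. To make that concrete, here is a hypothetical Python sketch of a type-2-style comparison: identifiers and literals are normalized to placeholder classes, then token sequences are compared. The normalization scheme and the factorial examples are illustrative assumptions, not the study's tooling:

```python
import difflib
import io
import keyword
import tokenize

def normalized_tokens(src):
    """Tokenize Python source, keeping keywords and operators but
    mapping identifiers and literals to placeholder classes, a common
    type-2 clone normalization (assumed here for illustration)."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type == tokenize.NAME:
            out.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        elif tok.type == tokenize.OP:
            out.append(tok.string)
    return out

def similarity(a, b):
    """Token-sequence similarity in [0, 1] after normalization."""
    return difflib.SequenceMatcher(
        None, normalized_tokens(a), normalized_tokens(b)).ratio()

iterative = """
def fact(n):
    r = 1
    for i in range(2, n + 1):
        r *= i
    return r
"""
# Copy & paste clone: same algorithm, renamed identifiers.
renamed = """
def product(m):
    acc = 1
    for k in range(2, m + 1):
        acc *= k
    return acc
"""
# Functionally similar clone: same function, different algorithm.
recursive = """
def fact(n):
    return 1 if n < 2 else n * fact(n - 1)
"""
print(similarity(iterative, renamed))    # 1.0: renaming is invisible
print(similarity(iterative, recursive))  # well below 1.0
```

The renamed copy scores a perfect match because normalization erases identifier choices, while the recursive variant scores much lower despite computing the same function. This mirrors why the study's traditional tools found syntactic similarity in under 16% of FSC pairs.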

