Program refactoring in the presence of preprocessor directives

Mapping Intimacies ◽

10.35537/10915/4167 ◽

2005 ◽

Author(s):

◽

Alejandra Garrido

Keyword(s):

Program Analysis ◽

Single Unit ◽

Source Code ◽

C Language ◽

C Programs ◽

Novel Approach ◽

C Preprocessor ◽

Single Configuration

The C preprocessor is heavily used in C programs because it provides useful and even necessary additions to the C language. Since preprocessor directives are not part of C, they are removed before parsing and program analysis take place, during the phase called preprocessing. In the context of refactoring, it is inappropriate to remove preprocessor directives: if changes are applied on the preprocessed version of a program, it may not be possible to recover the un-preprocessed version. This means that after refactoring, all the source code would be contained in a single unit, targeted to a single configuration and without preprocessor macros. This thesis describes a novel approach to preserve preprocessor directives during parsing and program analysis, and integrate them in the program representations. Furthermore, it illustrates how the program representations are used during refactor ing and how transformations preserve preprocessor directives. Additionally, the semantics of the C preprocessor are formally specified, and the results of implementing this approach in a refactoring tool for C, CRefactory, are presented.

Download Full-text

Defining the undefinedness of C11 : practical semantics-based program analysis

10.32469/10355/63596 ◽

2017 ◽

Author(s):

◽

Chris Hathhorn

Keyword(s):

Program Analysis ◽

Third Party ◽

Test Suite ◽

University Of Missouri ◽

C Language ◽

C Programs ◽

Semantic Treatment ◽

Practical Analysis ◽

Core Language ◽

The University

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] This thesis extends the work of Ellison and Ros,u [13, 12] but focuses on the "negative" semantics of the C11 language--the semantics required to not just give meaning to correct programs, but also to reject undefined programs. We investigate undefined behavior in C and discuss the techniques and special considerations needed to formally specify it. Using these techniques, we have modified and extended a semantics of C into one that captures undefined behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end more than tripling the size of the original semantics. From our semantics, we automatically extract kcc, a tool for checking realworld C programs for undefined behavior and other common programmer mistakes. Previous versions of this tool were used primarily for testing the correctness of the semantics, but we have improved it into a tool for doing practical analysis of real C programs. It beats many similar tools in its ability to catch a broad range of undesirable behaviors. We demonstrate this with comparisons based on our own test suite in addition to third-party benchmarks. Our checker is capable of detecting examples of all 77 categories of core language undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of undefined behavior in C, and thus of the C language itself.

Download Full-text

Shape Neutral Analysis of Graph-based Data-structures

Theory and Practice of Logic Programming ◽

10.1017/s147106841800025x ◽

2018 ◽

Vol 18 (3-4) ◽

pp. 470-483 ◽

Cited By ~ 1

Author(s):

GREGORY J. DUCK ◽

JOXAN JAFFAR ◽

ROLAND H. C. YAP

Keyword(s):

Data Structure ◽

Data Structures ◽

Program Analysis ◽

Constraint Handling ◽

C Programs ◽

Wide Range ◽

Target Program ◽

Target Data ◽

Structure Graph ◽

Structure Properties

AbstractMalformed data-structures can lead to runtime errors such as arbitrary memory access or corruption. Despite this, reasoning over data-structure properties for low-level heap manipulating programs remains challenging. In this paper we present a constraint-based program analysis that checks data-structure integrity, w.r.t. given target data-structure properties, as the heap is manipulated by the program. Our approach is to automatically generate a solver for properties using the type definitions from the target program. The generated solver is implemented using a Constraint Handling Rules (CHR) extension of built-in heap, integer and equality solvers. A key property of our program analysis is that the target data-structure properties are shape neutral, i.e., the analysis does not check for properties relating to a given data-structure graph shape, such as doubly-linked-lists versus trees. Nevertheless, the analysis can detect errors in a wide range of data-structure manipulating programs, including those that use lists, trees, DAGs, graphs, etc. We present an implementation that uses the Satisfiability Modulo Constraint Handling Rules (SMCHR) system. Experimental results show that our approach works well for real-world C programs.

Download Full-text

Information processing for Similar Source code using LSH Algorithm

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/06500 ◽

2021 ◽

Vol 23 (07) ◽

pp. 23-34

Author(s):

Mrs. Vani Dave ◽

◽

Mr Sanjeev Kumar shukla ◽

Keyword(s):

Information Processing ◽

Execution Time ◽

Source Code ◽

Locality Sensitive Hashing ◽

Source File ◽

C Language ◽

Fast Search

In this study, we propose a method to quickly search for similar source files for a given source file as a method to examine the origin of reused code. By outputting not only the same contents but also similar contents, it corresponds to the source file that has been changed during reuse. In addition, locality-sensitive hashing is used to search from a large number of source files, enabling fast search. By this method, it is possible to know the origin of the reused code. A case study was conducted on a library that is being reused written in C language. Some of the changes were unique to the project, and some were no longer consistent with the source files. As a result, it was possible to detect the source files that were reused from among the 200 projects with 92% accuracy. In addition, when we measured the execution time of the search using 4 files, the search was completed within 1 second for each file.

Download Full-text

A Novel Approach for Measurement of Source Code Similarity

2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO) ◽

10.1109/icrito48877.2020.9197972 ◽

2020 ◽

Author(s):

Mayank Agrawal ◽

Vinod Jain ◽

Atul Kumar Uttam

Keyword(s):

Source Code ◽

Novel Approach

Download Full-text

Deadlocks Detection in Multithreaded Applications Based on Source Code Analysis

Applied Sciences ◽

10.3390/app10020532 ◽

2020 ◽

Vol 10 (2) ◽

pp. 532 ◽

Cited By ~ 1

Author(s):

Damian Giebas ◽

Rafał Wojszczyk

Keyword(s):

Petri Nets ◽

Source Code ◽

Source Code Analysis ◽

C Language ◽

Code Analysis ◽

Advantages And Disadvantages ◽

Code Model

This paper extends multithreaded application source code model and shows how to using it to detect deadlocks in C language applications. Four known deadlock scenarios from literature can be detected using our model. For every scenario we created theorems and proofs whose fulfillment guarantees the occurrence of deadlocks in multithreaded applications. Paper also contains comparison of multithreaded application source code model and Petri nets and describe advantages and disadvantages both of them.

Download Full-text

A Novel Approach to Change Management in Requirements Engineering Context

International Journal of Agent Technologies and Systems ◽

10.4018/ijats.2015070102 ◽

2015 ◽

Vol 7 (3) ◽

pp. 18-44 ◽

Cited By ~ 4

Author(s):

Soumia Bendakir ◽

Nacereddine Zarour ◽

Pierre Jean Charrel

Keyword(s):

Change Management ◽

Requirements Engineering ◽

Source Code ◽

User Requirements ◽

Requirements Management ◽

Multi Agent System ◽

Novel Approach ◽

Requirements Change ◽

Multi Agent ◽

The Cost

Requirements change management (RCM) is actually an inevitable task that might be considered in system development's life cycle, since user requirements are continuously evolving (some are added, others are modified or deleted). A large majority of studies have examined the issue of change, while most of them focused on the design and source code, requirements were often forgotten, even though, the cost of fixing the defect and introduced error due to the requirements is less higher compared to the cost of error in design and implementation. For this purpose, this work focuses on change issues in the requirements engineering (RE) context, which contains the complete initial specification. Properties such as adaptability, perception, and cooperation of the multi-agent system (MAS) allow managing changing requirements in a controlled manner. The main objective of this work is to develop an agent-oriented approach which will be effective in the requirements management to be adapted to changes in their environments.

Download Full-text

Data mining source code to facilitate program comprehension: experiments on clustering data retrieved from C++programs

Proceedings. 12th IEEE International Workshop on Program Comprehension, 2004. ◽

10.1109/wpc.2004.1311063 ◽

2004 ◽

Cited By ~ 6

Author(s):

Y. Kanellopoulos ◽

C. Tjortjis

Keyword(s):

Data Mining ◽

Source Code ◽

Program Comprehension ◽

C Programs ◽

Clustering Data

Download Full-text

CODE-CHANGE IMPACT ANALYSIS USING COUNTERFACTUALS: THEORY AND IMPLEMENTATION

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194013500460 ◽

2013 ◽

Vol 23 (10) ◽

pp. 1459-1486

Author(s):

MANUEL PERALTA ◽

SUPRATIK MUKHOPADHYAY

Keyword(s):

Static Analysis ◽

Theorem Proving ◽

Program Analysis ◽

Impact Analysis ◽

Source Code ◽

Analysis Framework ◽

Change Impact Analysis ◽

Code Change ◽

Change Impact ◽

Automated Tool

This article shows a novel program analysis framework based on Lewis' theory of counterfactuals. Using this framework we are capable of performing change-impact static analysis on a program's source code. In other words, we are able to prove the properties induced by changes to a given program before applying these changes. Our contribution is two-fold; we show how to use Lewis' logic of counterfactuals to prove that proposed changes to a program preserve its correctness. We report the development of an automated tool based on resolution and theorem proving for performing code change-impact analysis.

Download Full-text

Directed Reachability for Infinite-State Systems

Tools and Algorithms for the Construction and Analysis of Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-030-72013-1_1 ◽

2021 ◽

pp. 3-23

Author(s):

Michael Blondin ◽

Christoph Haase ◽

Philip Offtermatt

Keyword(s):

Petri Nets ◽

Program Analysis ◽

Infinite Graphs ◽

Reachability Problem ◽

Graph Exploration ◽

Concurrent Program ◽

Domain Specific ◽

Novel Approach ◽

Distance Oracles ◽

Analysis And Synthesis

AbstractNumerous tasks in program analysis and synthesis reduce to deciding reachability in possibly infinite graphs such as those induced by Petri nets. However, the Petri net reachability problem has recently been shown to require non-elementary time, which raises questions about the practical applicability of Petri nets as target models. In this paper, we introduce a novel approach for efficiently semi-deciding the reachability problem for Petri nets in practice. Our key insight is that computationally lightweight over-approximations of Petri nets can be used as distance oracles in classical graph exploration algorithms such as $$\mathsf {A}^{*}$$ A ∗ and greedy best-first search. We provide and evaluate a prototype implementation of our approach that outperforms existing state-of-the-art tools, sometimes by orders of magnitude, and which is also competitive with domain-specific tools on benchmarks coming from program synthesis and concurrent program analysis.

Download Full-text

End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/214 ◽

2017 ◽

Cited By ~ 4

Author(s):

Min-je Choi ◽

Sehun Jeong ◽

Hakjoo Oh ◽

Jaegul Choo

Keyword(s):

Programming Languages ◽

Program Analysis ◽

Question Answering ◽

Source Code ◽

Source Codes ◽

Program Language ◽

Proposed Model ◽

Challenging Tasks ◽

Data Driven Approach ◽

End To End

Detecting buffer overruns from a source code is one of the most common and yet challenging tasks in program analysis. Current approaches based on rigid rules and handcrafted features are limited in terms of flexible applicability and robustness due to diverse bug patterns and characteristics existing in sophisticated real-world software programs. In this paper, we propose a novel, data-driven approach that is completely end-to-end without requiring any hand-crafted features, thus free from any program language-specific structural limitations. In particular, our approach leverages a recently proposed neural network model called memory networks that have shown the state-of-the-art performances mainly in question-answering tasks. Our experimental results using source code samples demonstrate that our proposed model is capable of accurately detecting different types of buffer overruns. We also present in-depth analyses on how a memory network can learn to understand the semantics in programming languages solely from raw source codes, such as tracing variables of interest, identifying numerical values, and performing their quantitative comparisons.

Download Full-text