An Empirical Study of a Hybrid Code Clone Detection Approach on Java Byte Code

Aritra Ghosh, Young Lee


Code clones increase the complexity of a system and therefore its
software maintenance costs. Code clone detection techniques have been
proposed and evaluated based on metric values and runtime evaluations.
However, existing methods detect many false positive clones. In this
paper, we propose a hybrid approach that combines a Program Dependence
Graph-based technique with a metric-based technique to improve the
precision of clone detection. We conduct a case study on two open source
Java projects, Eclipse-ant and Eclipse-JDT core, to show the
effectiveness of our tool. The hybrid technique is then compared with an
existing clone detection tool, CloneDR. The results show that our tool
performs better than CloneDR in terms of precision, recall, false
positives, and false negatives.


Code Clone; Byte code; Metrics; False positive; False negative; Precision; Recall
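
The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that clones are reported as unordered pairs of method identifiers and that a manually verified reference set of true clone pairs is available; the function name `evaluate` and the sample identifiers are hypothetical and are not taken from the paper or its tool.

```python
def evaluate(detected, reference):
    """Compare detected clone pairs against a reference set of true clone pairs.

    Both arguments are iterables of 2-tuples of identifiers (hypothetical format).
    Pairs are treated as unordered, so ("a", "b") equals ("b", "a").
    """
    detected = {frozenset(pair) for pair in detected}
    reference = {frozenset(pair) for pair in reference}

    true_positives = len(detected & reference)    # reported and really clones
    false_positives = len(detected - reference)   # reported but not clones
    false_negatives = len(reference - detected)   # real clones that were missed

    # Guard against division by zero when either set is empty.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0

    return {
        "tp": true_positives,
        "fp": false_positives,
        "fn": false_negatives,
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Illustrative data only; not results from the study.
    reference_pairs = [("A.foo", "B.foo"), ("A.bar", "C.bar"), ("D.baz", "E.baz")]
    detected_pairs = [("B.foo", "A.foo"), ("A.bar", "C.bar"), ("A.foo", "C.qux")]
    print(evaluate(detected_pairs, reference_pairs))
```

Under these definitions, fewer false positives raise precision and fewer false negatives raise recall, which is the sense in which two detectors can be compared on all four measures.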





