Defect Dependency based Heuristic Approaches to Improve Software Quality in Large Scale Integrated Software Products

Author: Sai Anirudh Karre
Date: 2016-11-29
Report no: IIIT/TH/2016/73
Advisor: Raghu Reddy

Abstract

Software quality is an important economic factor for every successful software product. Maintaining quality after every version release is challenging for large-scale software products, especially integrated software products. An integrated software product combines two or more sub-products to achieve a similar business goal. These sub-products are interconnected and share common packages of programs to form one application. Unlike non-integrated software products, integrated software products are complex in design. Improving overall product quality requires a detailed exploration of how a defect spreads across the entire integrated product suite, along with causal analysis. Because of security constraints or complicated workflows, quality engineers may not be able to access and analyze the source code of all sub-products together in a large-scale integrated software product. Investigating defect dependency in such a product is therefore difficult. In this thesis, we present our efforts to explore various ways to estimate the spread, or defect dependency, of a defect in a large-scale integrated software product. Current approaches are mostly non-generic and are developed for a specific product design with domain constraints and predefined conditions. Very few approaches were found to be adaptable to large software products, and those were neither evaluated nor validated using a real-time defect dataset of a large-scale integrated software product. This motivated us to present a few approaches of our own. Our approaches are heuristic by design, so as to achieve a generalized way to study defect dependency in a software product of any domain and size.
These approaches help product owners of an integrated software product to prioritize defects based on how widely they spread across the large software product, and help quality teams to evaluate defect dependency so that they can address dependent defects using the artifacts accessible to them. As part of our initial research, we examined defect dependency calculation using the Generalized Dependency Degree method, which is a measure of the dependency of one element on another. This method was primarily inspired by the dependency degree approach from Rough Set Theory. We formulated this method into an approach by representing all recorded defects of each sub-product in a large-scale integrated software product as one set. We calculated the Generalized Dependency Degree between the defect sets of the sub-products so as to estimate the defect dependency of one sub-product on another. We implemented this approach on a real-time industrial defect dataset of a large-scale integrated Human Resource Management product and captured significant results across various version releases. This is a simple and generic approach to implement on large software products. However, the level of abstraction of the defect dependency calculation was limited to defects recorded at the sub-product level. This prompted us to explore new ways to study defect dependency at a deeper level of abstraction in a large software product. Correlation analysis in statistics is an efficient way to understand the independence or dependency of elements in a given relation. We explored various ways of applying correlation analysis to a defect dataset and formulated a correlation-based approach to study the dependency between defects recorded in a large software product.
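The dependency-degree idea mentioned above can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so the code below assumes one commonly cited generalization of the rough-set dependency degree: for every record, take the fraction of its condition class that also falls in its decision class, then average over all records. The record layout (one row per test cycle, with hypothetical `payroll` and `benefits` flags marking whether a defect was logged in that sub-product) is an illustration only and is not taken from the thesis dataset.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Map each row index to the set of row indices sharing its values on attrs."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return {i: members for members in groups.values() for i in members}


def generalized_dependency_degree(rows, cond_attrs, dec_attrs):
    """Average, over all rows, of |C(x) & D(x)| / |C(x)|.

    C(x) and D(x) are the equivalence classes of row x under the condition
    and decision attributes. This is an assumed formulation, not necessarily
    the one used in the thesis.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    c_cls = equivalence_classes(rows, cond_attrs)
    d_cls = equivalence_classes(rows, dec_attrs)
    total = sum(len(c_cls[i] & d_cls[i]) / len(c_cls[i]) for i in range(len(rows)))
    return total / len(rows)


# Hypothetical data: 1 means a defect was logged in that sub-product in a cycle.
cycles = [
    {"payroll": 1, "benefits": 1},
    {"payroll": 1, "benefits": 1},
    {"payroll": 0, "benefits": 0},
    {"payroll": 0, "benefits": 1},
]

benefits_on_payroll = generalized_dependency_degree(cycles, ["payroll"], ["benefits"])
payroll_on_benefits = generalized_dependency_degree(cycles, ["benefits"], ["payroll"])
print(benefits_on_payroll, payroll_on_benefits)
```

The measure is asymmetric, so the dependency of one sub-product's defects on another's generally differs from the reverse direction, which is what makes it usable for ranking sub-product pairs.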
For a better level of generalization, we grouped the defects recorded per feature in a large software product and performed defect correlation analysis to understand the defect dependency among the defect sets of the available software features. We evaluated this approach against an open-source (NetBeans IDE) defect dataset and captured noteworthy observations. We were able to record a list of the most defective features, which are unstable when deployed together in a large software product. These results help product managers to prioritize defects recorded in defective features and to focus on building stable software. However, with the methods proposed above, product managers still may not be able to isolate the exact cause and effect of a single defect, because it is difficult to estimate the dependency of a single defect over the entire large software product. This stimulated us to explore defect dependency at the lowest level of abstraction in a large software product. To fill this gap, we composed another approach that uses a model framework as the basis for evaluating defect dependency. We implemented this approach using the i* modelling framework on a few real-time software projects and recorded notable observations. The approaches composed as part of this thesis are targeted at large-scale integrated software products. From the beginning of our research, we focused on formulating approaches that are simple and easy to implement. In the later part of this thesis, we compare the proposed approaches against various parameters to help software practitioners choose the most suitable approach for their business need. Finally, we discuss the scope of industrial adoption and possible future work.
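The feature-level correlation analysis described above can be sketched as follows. The abstract does not say which correlation coefficient the thesis uses, so this example assumes Pearson correlation over per-release defect counts. The feature names, the counts, and the `threshold` parameter are hypothetical and are not drawn from the NetBeans dataset.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_feature_pairs(counts, threshold=0.8):
    """Return (feature_a, feature_b, r) for pairs with |r| at or above threshold."""
    pairs = []
    for a, b in combinations(sorted(counts), 2):
        r = pearson(counts[a], counts[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical defect counts per feature across four releases.
defects_per_release = {
    "editor": [10, 20, 30, 40],
    "debugger": [12, 22, 29, 41],
    "profiler": [5, 3, 6, 4],
}

pairs = correlated_feature_pairs(defects_per_release)
print(pairs)
```

Pairs that clear the threshold are candidates for features whose defects tend to rise and fall together. Correlation alone does not establish which feature causes defects in the other.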

Full thesis: pdf

Centre for Software Engineering Research Lab

IIIT Hyderabad Publications
