Program: November 28 Industry Day
Industry Program Committee:
- P. Santhanam (IBM Research, USA), Chair
- Bill Everett (SPRE, USA)
- George WY Leung (ITSD, Hong Kong)
- Ying-Hua Min (Academy of Sciences, China)
- Linda Rosenberg (NASA, USA)
- Noel Samaan (Motorola, USA)
- Min Xie (National University of Singapore)
Track 1

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Industrial Practice Talks (1R): Software Reliability Practice
Session Chair: Allen Nikora, Jet Propulsion Laboratory, USA
- Managing Reliability Development & Growth in a Widely Distributed, Safety-Critical System
  Samuel Keene, Consulting Engineer; Jon Peterson and Meng-Lai Yin, Raytheon System Company, USA
- Software Reliability - Theory vs. Practice in the Development Environment
  Patrick Carnes, Air Force Operational Test and Evaluation Center (AFOTEC), USA
- Applying Reliability Prediction Models among Different Software Systems and Identifying Commonly Effective Metrics to Predict Fault-Prone Modules
  Shin-Ichi Sata, NTT Data Corporation, Japan; Akito Monden and Ken-ichi Matsumoto, Nara Institute of Science and Technology, Japan
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Industrial Practice Talks (2R): Testing Techniques
Session Chair: Brendan Murphy, Microsoft Research, UK
- Software Testing Challenges in Telecom Domain
  Debashis Mitra and Pankaj Bhatt, Tata Consultancy Services, India
- Testing for Software Security in Hostile Environments
  Herbert Thompson and Dr. James A. Whittaker, Florida Institute of Technology, Melbourne Beach, FL, USA
- Optimization of Test Case Selection for Regression
  Noel Samaan, Motorola Inc., Schaumburg, IL, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Industrial Practice Talks (3R): Development Techniques
Session Chair: Karama Kanoun, LAAS-CNRS, France
- Development and Verification of Windows 2000 and Windows XP
  Brendan Murphy, Microsoft Research, UK
Track 2

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Industrial Practice Talks (1S): Analysis and Modeling
Session Chair: Jiannong Cao, Hong Kong Polytechnic University, Hong Kong
- Defect Density Prediction Model For Software Application Maintenance
  Ananya Ray and Mahua Seth, Cognizant Technology Solutions India Pvt. Ltd., Calcutta, India
- Source Code Analysis as a Cost Effective Practice to Improve the High Reliability of Cisco Products
  Christopher Pham, Jun Xu, and Jialin Zhou, Cisco Systems, Inc., USA
- A DSS Model for Tool Evaluation
  Rakesh Agarwal and Ashiss Kumar Dash, Infosys Technologies Limited, India
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Industrial Practice Talks (2S): Dependability & Reliability Measurement
Session Chair: Samuel Keene, Seagate Technology, USA
- An Application of the Discrete Function Theory and the Software Control Flow to Dependability Assessment of Embedded Digital Systems
  Jong Gyun Choi and Poong Hyun Seong, Korea Advanced Institute of Science and Technology, Korea
- Measures to Increase the Dependability of Information Systems in the IT Age
  Katsuyuki Yasuda, Kenji Tokunaga, Mariko Shimizu, and Shigeru Yamada, Hitachi Ltd., Japan
- A Practical Software Fault Measurement and Estimation Framework
  Allen Nikora, California Institute of Technology, Pasadena, USA; John Munson, University of Idaho, Moscow, ID, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Industrial Practice Talks (3S): Website & Software Reliability
Session Chair: William Everett, SPRE Inc, USA
- Boosting Website Reliability by Mining Patterns from User Transactions
  Biplav Srivastava, IBM India Research Laboratory, India
Track 3

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Fast Abstract Session 1: Systems
E-commerce:
- e-Business Reliability with Web Performance Management
  Robert B. Wen, AP Technology Corporation, USA
- Development of a Kernel Thread Web Accelerator
  Jonggyu Park, Hanna Lim, and HagBae Kim, Yonsei University, Korea
- Signature Verification System using Pen Pressure for Internet and E-Commerce Application
  Tham Heng Keit, R. Palaniappan, P. Raveendran, and Fumiaki Takeda, University of Malaya, Malaysia
Real Systems:
- Specification-based Detection of Telecom Service Degradations
  A. M. da Silva Filho and J. T. Saito, State University of Maringa, Brazil
- Oracle's Technologies for High Availability
  Sasidhar Pendyala, Oracle S/W India Ltd., India
- High Performance Computing Software Infrastructure For Cervical Cancer Detection and Monitoring
  Savitri Bevinakoppa, Royal Melbourne Institute of Technology, Australia; Kailash Narayan, Peter MacCallum Institute, Australia
- The House Keeping System of Automated Evaluation of Students' Programming Reports
  Hikofumi Suzuki, Nagano National College of Technology, Japan; Katsumi Wasaki, Shinshu University, Japan; Tatsuo Nakazawa, Nagano National College of Technology, Japan; Yasunari Shidama, Shinshu University, Japan
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Panel 1: Everything You Wanted to Know About Software Reliability Engineering But Didn't Know Who to Ask
- John D. Musa, Independent Consultant, Morristown, NJ, USA
- William W. Everett, SPRE, Inc., Albuquerque, NM, USA
- Karama Kanoun, CNRS, France
- Roy Ko, Hong Kong Productivity Council, Hong Kong
- Norman F. Schneidewind, Naval Postgraduate School, Monterey, CA, USA
- Mladen A. Vouk, North Carolina State University, NC, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Panel 2: ODC for Process Management and Cost Control
- Ram Chillarege, Chillarege Corp., Peekskill, NY, USA
- Albert Liu, Motorola, China
- Michael Lyu, Chinese University of Hong Kong, Hong Kong
- Peter Santhanam, IBM Research, New York, USA
Details:
Boosting Website Reliability by Mining Patterns
from User Transactions
A web site can be considered a GUI application whose performance, reliability, and security are as crucial as its correctness. However, software testing in practice is a tradeoff between budget, time, and quality, and it stops whenever the allocated time or budget is exhausted. In this work, we propose a web site testing methodology in which user transactional information is mined from a site-development viewpoint to learn the test characteristics of a web site and of the software running behind it. Mining transactional information for useful patterns has received wide attention in data mining in recent years. However, attention until now has focused only on learning characteristics about the users whose activities created the transactions in the first place. These may be the visitors to the website and/or the suppliers of the products and services on the site. By using experimental and ongoing transactions to improve testing of a web site, we can provide increased reliability at no additional product development time, thereby maintaining aggressive time-to-market schedules. Our approach has been employed in a startup company, and the initial lessons suggest that monitoring ongoing transactions helps identify potential problems very early. Moreover, both the trends observed during data cleaning and the learned patterns are useful.
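As a rough illustration of the idea (the abstract does not specify the mining technique, so the data and method below are assumptions), the sketch counts frequent navigation transitions in transaction logs and uses them to prioritize which paths regression tests should exercise first:

```python
# A minimal sketch (hypothetical, not the paper's system): mining frequent page
# sequences from user-transaction logs and using them to prioritize which
# navigation paths regression tests should exercise first.
from collections import Counter

# Hypothetical transaction log: each entry is the ordered list of pages visited.
transactions = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "search", "product", "cart"],
    ["home", "product", "cart", "checkout"],
    ["home", "search", "search", "product"],
]

# Count consecutive page pairs (length-2 navigation patterns).
pair_counts = Counter()
for pages in transactions:
    pair_counts.update(zip(pages, pages[1:]))

# The most frequent transitions are the paths to cover first in testing.
for (src, dst), n in pair_counts.most_common(5):
    print(f"test priority: {src} -> {dst} (seen {n} times)")
```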
Testing for Software Security in Hostile
Environments
When building secure software systems, functionality and security are often in
contention as development goals. Increased functionality leads to decreased security; conversely, achieving true security would require the software to be completely free of contact with its environment. Such extreme measures would render the system unusable, yet mission-critical systems must be integrated into complex networks and be accessible to many. Such systems typically undergo rigorous testing, but with limited resources and the ubiquitous force of market pressure only a small subset of the effectively infinite set of test cases can be executed. Thus, a significant portion of the code is left
under-exercised and, in large development efforts, not exercised at all.
Arguably the most neglected code paths during the testing process are
error-handling routines. Tests that involve disk errors and network problems
are often only cursorily explored. These extreme conditions that are possible
in the real world are sometimes neglected during the testing phase due to the
difficulty in simulating a hostile environment. It is during these periods
that the software is at its most vulnerable and where carefully conceived
security measures break down. If such situations are ignored and the other test cases pass, what we are left with is a dangerous illusion of security. Servers do run out of disk space, network connectivity is sometimes intermittent, and file permissions can be improperly set. Such conditions cannot be ignored as
part of an overall testing strategy. What is needed then is to integrate such
failures into our test cases and to become aware of their impact on the
security and integrity of the product itself and the user's data. In this
presentation we show how security procedures break down in hostile
environments. We then go on to show measures we have taken to simulate such
failures by intercepting and controlling return values from system calls made
by the application. For example, we can selectively deny write access to the
hard drive or simulate a strained or intermittent network connection, all
through controlling responses from system calls. This provides a much needed,
easy to implement method that creates a turbulent environment to execute
selected tests. These methods can reveal potentially severe security bugs that
would otherwise escape testing and surface in the field.
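The interception approach lends itself to a simple illustration. The sketch below is not the authors' tool; it assumes a Python test harness and uses mock patching to force disk-full and network-timeout failures, analogous to controlling system-call return values:

```python
# A minimal sketch of simulating a hostile environment in Python tests by
# forcing failure results from low-level I/O calls. All names here are
# illustrative assumptions, not the authors' implementation.
import builtins
import errno
import socket
import unittest
from unittest import mock


def save_report(path, data):
    """Application code under test: writes data to disk."""
    with open(path, "w") as f:
        f.write(data)


class HostileEnvironmentTests(unittest.TestCase):
    def test_disk_full_propagates(self):
        # Make every open() fail as if the disk were full.
        disk_full = OSError(errno.ENOSPC, "No space left on device")
        with mock.patch.object(builtins, "open", side_effect=disk_full):
            with self.assertRaises(OSError):
                save_report("/tmp/report.txt", "payload")

    def test_intermittent_network(self):
        # Make socket sends time out, as on a strained connection.
        with mock.patch.object(socket.socket, "send",
                               side_effect=socket.timeout("send timed out")):
            s = socket.socket()
            with self.assertRaises(socket.timeout):
                s.send(b"ping")


if __name__ == "__main__":
    unittest.main()
```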
Defect Density Prediction Model For Software
Application Maintenance
With growth in demand for zero defects, predicting reliability of software
products is gaining importance. Software reliability models are used to
estimate the reliability or the number of latent defects in a software
product. Most of the reliability models available in the literature for estimating software reliability are based on the complete software development lifecycle. However, there are large software systems that are maintained by the developer organization itself or by a third-party vendor. Such systems have been in production for a considerable period of time, and the details of their development life cycle stages are usually not known to the organizations responsible for maintaining them. In such a scenario, the reliability models available for the complete development life cycle cannot be applied. Predicting the defects in the maintenance done for such systems is nevertheless of primary importance, both to provide confidence to the system owner and for resource planning. In this paper, a Non-Homogeneous Poisson Process (NHPP) model has been applied to develop a prediction model for the defect density of application system maintenance projects. A comparative study was done on the numerous models available; based on the assumptions and constraints of the models and the maintenance scenario, the NHPP model was chosen to predict defect density. Goel and Okumoto applied the Non-Homogeneous Poisson Process model to estimate software reliability using testing defects; the model has been adapted here to estimate software reliability using maintenance defects. The model was established using defect data from one maintenance project and validated with data from another maintenance project at Cognizant Technology Solutions, Calcutta. The parameters of the model were estimated using a SAS program employing a nonlinear regression approach based on the DUD algorithm (Ralston and Jennrich, Technometrics, 1978, Vol. 20). The Kolmogorov-Smirnov test was applied to test the goodness of fit. Predictions based on the estimated model were compared with the actuals observed for a few subsequent months for both projects, and the differences were within a statistically acceptable range. The model was accepted by both projects and has been used for resource planning and for assuring the customer of the quality of the maintenance work.
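The Goel-Okumoto mean value function underlying the model is m(t) = a(1 - e^(-bt)). As a hedged illustration (the authors used SAS with the DUD algorithm; the sketch below substitutes SciPy nonlinear least squares and invented monthly data):

```python
# A minimal sketch of fitting the Goel-Okumoto NHPP mean value function
# m(t) = a * (1 - exp(-b * t)) to monthly cumulative maintenance-defect counts.
# Data values are made up for illustration; this is not the paper's SAS/DUD code.
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    """Expected cumulative number of defects by time t."""
    return a * (1.0 - np.exp(-b * t))

# Hypothetical data: months in production and cumulative defects observed.
months = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
cum_defects = np.array([12, 21, 28, 33, 37, 40, 42, 43], dtype=float)

# Nonlinear least-squares estimate of a (total expected defects) and b (rate).
(a_hat, b_hat), _ = curve_fit(goel_okumoto, months, cum_defects, p0=[50.0, 0.3])

# Predict cumulative defects for a few subsequent months and compare to actuals.
future = np.array([9, 10, 11, 12], dtype=float)
print("a =", round(a_hat, 1), "b =", round(b_hat, 3))
print("predicted:", np.round(goel_okumoto(future, a_hat, b_hat), 1))
```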
Managing Reliability Development & Growth in a
Widely Distributed, Safety-Critical System
Predicting software reliability is important and especially challenging when
the software is widely distributed for a safety-critical system. Typically,
safety-critical systems have built-in redundancies that result in high
reliability for the hardware, leaving the software reliability as the limiting
factor. In particular, the widely distributed system considered here uses
multiple copies of identical software, with inputs that differ to varying
degrees between different copies. How to predict the software reliability for
such a system is presented in this paper. The paper details the evolution of a software reliability prediction, from initial predictions made with the Keene model before development started, through Rayleigh model predictions used during software development, to CASRE predictions made using data collected after the final acceptance test. Comparisons are made at each prediction point to determine which models most accurately predict the actual performance.
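As one hedged illustration of the kind of model mentioned (the paper's exact parameterization is not given), here is a sketch of a cumulative Rayleigh defect-discovery profile with assumed parameters:

```python
# A minimal sketch (assumption, not the paper's exact models) of the Rayleigh
# defect-discovery profile often used during development: the cumulative number
# of defects found by time t is K * (1 - exp(-t^2 / (2 * c^2))).
import math

def rayleigh_cumulative(t, K, c):
    """Cumulative defects expected by time t (K = total, c = peak location)."""
    return K * (1.0 - math.exp(-(t * t) / (2.0 * c * c)))

# Hypothetical parameters: 200 total latent defects, discovery peaking near month 4.
K, c = 200.0, 4.0
for month in range(1, 13):
    found = rayleigh_cumulative(month, K, c)
    print(f"month {month:2d}: ~{found:5.1f} cumulative defects expected")
```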
A DSS Model for Tool Evaluation
There are many tools and technologies in a data warehousing project. It is
a challenging task to select or recommend a single tool over others to provide decision support for an organization. This paper proposes a model-based approach for evaluating and choosing a DSS tool. The model prescribes a three-step approach for selecting the most suitable DSS tool. First, the key parameters need to be identified for usability. For a DSS tool, the key evaluation parameters can be broadly categorized into the following four areas: User Interface, System Requirements, Response Factors, and Development Environment. Next, weights are assigned to each of the parameters identified. Finally, the key areas of evaluation for each parameter are identified; this should be done by meeting the users and the IS personnel for the DSS solution. The model provides a step-by-step process for requirement gathering, area identification, quantifying user/information-system preferences, and prioritizing the parameters to be taken into consideration when evaluating tools. With minor modifications, the approach can be extended to software tool evaluation of any kind. Limitations of the proposed model are: (a) the requirement analysis and evaluation scales have to be the same; (b) it may be difficult to deploy the model if there are too many parameters to consider and they cannot be grouped into a smaller number of categories; (c) if there are business considerations beyond what can be succinctly quantified in the evaluation, the model may not work; and (d) if two tools score very close to each other, a blind numerical comparison may not be the best approach.
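A hedged sketch of the weighted evaluation the three steps imply (the areas are taken from the abstract; all weights and scores are invented for illustration):

```python
# A minimal sketch (illustrative, not the paper's model): a weighted-sum
# evaluation of candidate DSS tools over the four parameter areas named above.
AREAS = ["User Interface", "System Requirements", "Response Factors",
         "Development Environment"]

# Step 2: weights assigned per area (here chosen to sum to 1.0).
weights = {"User Interface": 0.30, "System Requirements": 0.20,
           "Response Factors": 0.30, "Development Environment": 0.20}

# Step 3: scores (1-10) gathered from users and IS personnel for each tool.
scores = {
    "Tool A": {"User Interface": 8, "System Requirements": 6,
               "Response Factors": 7, "Development Environment": 5},
    "Tool B": {"User Interface": 6, "System Requirements": 9,
               "Response Factors": 8, "Development Environment": 7},
}

def weighted_score(tool_scores):
    """Combine per-area scores into one comparable number."""
    return sum(weights[a] * tool_scores[a] for a in AREAS)

for tool, s in scores.items():
    print(f"{tool}: {weighted_score(s):.2f}")
```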
Optimization of Test Case Selection for Regression
When one considers software systems for a technology that is evolving fast, with time to market at very low cost an almost mandatory requirement, the selection of test cases for regression from a system (black-box) test viewpoint is not easy in a real-world environment. Even with the best processes in place and the best intentions of every contributing organization, the academic view of total control over what and where change in any part of the software has taken (or will take) place is a myth. The situation is even more complex because black-box teams do not have source code with which to analyze the change with acceptable uncertainties. Regression testing typically constitutes a significant portion of certifying software for large complex systems. For new products that reuse a large number of components from previous releases, even though those components were thoroughly tested in previous releases, feature interaction and the impact of new features on existing features remain a major concern of test release managers when new features are added. Consequently, it is not unusual to have half or more of the test activity effort devoted to repeating test cases, either at the start of the test cycle or during bug-fixes. Researchers have addressed the issue of minimizing the number of test cases selected for regression based on analyzing source code, the number and nature of changes in requirements, lines of code and, in some cases, formal methods of representing the change in software. In most cases, the typical applications are either small software systems or systems that have not reached high maturity, and the change in the software specification is well contained (typically at a relatively high cost). The success of any science and engineering discipline is attributed to measurement. Measurands quoted without an associated measurement uncertainty are used with a lower confidence level, leading in some cases to the inability to reproduce the scenario that led to a particular measurement. Software engineering is a good example, since it lags behind other engineering disciplines: there are no primary or reference standards agreed upon by national bodies to certify software systems in a manner similar to that used when physical systems are certified based on whether or not a system meets a prescribed standard or set of standards. To this end, testing large complex software systems remains by and large a black art. From a system (black-box) standpoint, the assemblage of software components is not well defined; the interaction between such components is not clarified and is typically exercised with inadequate analysis of the most critical paths. There is more emphasis on the static, rather than dynamic, behavior of software components. The latter is the hardest, as there are very few (or no) tools available to indicate how well the software is performing compared to what it is supposed to be doing. In this paper, the author proposes a methodology based on the behavioral correlation of three elements to extract knowledge leading to decisions as to which set of test cases is the most critical at any point in the test cycle. The three elements are: (a) the user profile characterized by customer-reported defects, (b) the analysis of past test results (by applying reliability engineering analysis models) together with monitoring test case behavior over past test releases (or cycles), and (c) the simulation of the dynamic behavior of executable software objects driven by past test scenarios. The correlation of knowledge extracted from the three elements is implemented using "true" and fuzzy logic and an artificial neural network. Initial results indicate a reduction in cycle time of up to 30% in the effort to characterize customer profiles and to plan regression testing for forthcoming releases. The methodology has already been recognized as providing traceability to a parametric representation of past test activities, with measurements based on "behavioral" analysis rather than only on "counting" existing defects and predicting remaining ones.
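As a hedged illustration (the author's implementation uses fuzzy logic and a neural network; the sketch below substitutes a simple weighted score over the same three elements, with invented data):

```python
# A minimal sketch (invented data and weights, not the author's fuzzy/neural
# implementation) of correlating the three elements named above into a single
# criticality score used to rank regression test cases.
def criticality(customer_defect_hits, past_failure_rate, dynamic_coverage,
                weights=(0.5, 0.3, 0.2)):
    """customer_defect_hits: how often the test's area appears in customer-reported
    defects (normalized 0-1); past_failure_rate: fraction of past cycles in which
    the test failed; dynamic_coverage: fraction of critical dynamic paths exercised."""
    w1, w2, w3 = weights
    return w1 * customer_defect_hits + w2 * past_failure_rate + w3 * dynamic_coverage

test_cases = {
    "TC-101": (0.9, 0.4, 0.6),   # hypothetical signal values per test case
    "TC-205": (0.2, 0.1, 0.9),
    "TC-330": (0.7, 0.6, 0.3),
}

ranked = sorted(test_cases, key=lambda t: criticality(*test_cases[t]), reverse=True)
print("regression order:", ranked)
```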
Software Reliability - Theory vs. Practice in the Operational Environment
AFOTEC has been an advocate of employing software reliability as one of the
methodologies to determine the operational effectiveness and suitability of
software intensive systems. AFOTEC has employed software reliability and
software maturity as a measure of the readiness for operational test. Software
reliability growth models, regression analysis, relative complexity analysis,
statistical process control and Markov modeling are the analytical techniques
currently employed to assess software reliability during development testing.
Ideally, software reliability can then be used to predict reliability during
operational testing. Unfortunately, there are several reasons for not
achieving this ideal state. The operational profiles usually differ during
these two test periods. Also, unlike hardware, the failure rate of software may not be stable at the conclusion of development testing. This is especially true for complex systems often deployed while still in an evolving state of development. Additionally, operating at or near memory capacity, inadequate operator training, hardware/software interface incompatibilities, and environmental factors all lead to performance instabilities. Hence, the approach currently used to determine software reliability during operational testing is to record and validate all software failures and use the cumulative
operational test time to determine the average failure rate. Current practice
is to add the hardware and software failure rates to obtain system
reliability. Continuing research is needed on how best to determine system reliability. AFOTEC has initiated research in several areas on how to
better characterize software failures during operational testing. One of these
efforts involves problem identification and analysis in which the module(s)
and system functions are analyzed to determine the degree of mission impact.
Test resources should be focused on elements critical to mission success.
Results to date indicate that in order to quantify the impact, the software
must be instrumented to derive a usability factor which will lead to a more
accurate value of software reliability for a given mission profile. The
drawback to this is that the instrumentation code also becomes a potential
source for software failures. Knowledge of the test data distribution will
provide insight as to how to best combine hardware and software failure data
to derive a system failure rate. Improved data collection practices will
identify the relevant variables affecting system reliability. Principal component analysis can then be used to establish an orthogonal classification of operational impacts. Multivariate analysis would determine which of the impacts/variables have the biggest effect on system reliability. Bayesian techniques could be used to combine operational metrics with previous development test data to provide improved insight into system reliability.
AFOTEC is currently in discussion with other government and academic
institutions to initiate joint research efforts.
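The current practice described above reduces to a simple calculation; the sketch below uses hypothetical numbers:

```python
# A minimal sketch (hypothetical numbers) of the practice described above:
# the software failure rate during operational test is the validated failure
# count divided by cumulative operational test time, and the system failure
# rate is currently taken as the sum of the hardware and software rates.
validated_sw_failures = 14          # validated software failures observed
cumulative_test_hours = 3_500.0     # cumulative operational test time (hours)
hw_failure_rate = 0.0012            # hardware failures per hour (from HW analysis)

sw_failure_rate = validated_sw_failures / cumulative_test_hours
system_failure_rate = hw_failure_rate + sw_failure_rate
mtbf_hours = 1.0 / system_failure_rate

print(f"software failure rate: {sw_failure_rate:.5f} per hour")
print(f"system failure rate:   {system_failure_rate:.5f} per hour "
      f"(MTBF ~ {mtbf_hours:.0f} hours)")
```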
Software Testing Challenges in Telecom Domain
This paper describes the challenges faced in developing solutions to industrial problems in the networking and telecommunications domain. These problems are technically challenging, especially since the tester needs to take into consideration the temporal behavior of the solution (e.g., a protocol). The solutions may also include the integration of third-party products. In this paper we take two instances of such problems. The first discusses the design and development of a simulation tool to facilitate the testing of a software solution for a network management problem involving mobile base stations. The second focuses on the design and development of a test framework to automate regression testing and thereby save testing time and effort. Telecom equipment manufacturers currently face the problem of having to adhere to aggressive product release schedules and shorten the "time to market", euphemistically called "Internet time"; often multiple releases of the same product are scheduled in the same calendar year. In this scenario, the Testing and Quality Assurance functions of the organization are severely challenged. They have to complete the testing on an ever-decreasing budget. Additionally, the developers often send code with new fixes for testing very close to the planned finish date. This stretches the testing ability of the team, which then very often resorts to experience-driven testing without completing all (manual) regression tests on the new load. A typical testing facility will have testing tools from various third-party vendors to test base product functionality, performance, and load behavior. These tools seldom talk to each other, yet each is required to test a specific sub-function within an end-to-end function within a test case. In an unautomated environment, each tool has to be used manually, and the initial setup for testing also has to be performed manually. To address these costly issues, a test automation architecture was used in which the tools are integrated into a common framework. This architecture also helps in organizing, managing, and executing the test cases remotely from a centralized location. Specific implementations of such architectures have demonstrated a significant saving of testing effort. The GSM Abis interface is located between the BSC (base station controller) and the BTS (base transceiver station). It includes layer 3 procedures related to Radio Link Layer Management (RLM), Dedicated Channel Management (DCM), Common Channel Management (CCM), and Transceiver Management (TRX), as well as Operation and Maintenance (O&M). While there are many tools available to monitor the protocol message flow between the BSC and the BTS, and some of these also provide BTS simulation capabilities, they are very expensive and generic in nature. In such situations, the use of a tool that can simulate the O&M plane of a particular flavor of the Abis implementation is appropriate. This tool allows the base station management software under test to communicate with it as if it were communicating with a real BTS.
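As a hedged illustration of the common-framework idea (the interfaces and tool names below are invented, not TCS's implementation):

```python
# A minimal sketch of integrating heterogeneous third-party test tools behind a
# common adapter interface so test cases can be organized and executed from one
# place. All class and step names are hypothetical.
from abc import ABC, abstractmethod

class TestTool(ABC):
    """Common interface for tools that otherwise do not talk to each other."""
    @abstractmethod
    def setup(self) -> None: ...
    @abstractmethod
    def run(self, step: str) -> bool: ...

class LoadToolAdapter(TestTool):
    def setup(self) -> None:
        print("configuring load generator")
    def run(self, step: str) -> bool:
        print(f"load tool executing: {step}")
        return True

class ProtocolSimAdapter(TestTool):
    def setup(self) -> None:
        print("starting BTS/O&M-plane simulator")
    def run(self, step: str) -> bool:
        print(f"simulator executing: {step}")
        return True

def execute_test_case(steps):
    """Each step names the tool that covers its sub-function in the end-to-end test."""
    tools = {"load": LoadToolAdapter(), "sim": ProtocolSimAdapter()}
    for tool in tools.values():
        tool.setup()
    return all(tools[name].run(action) for name, action in steps)

print("passed:", execute_test_case([("sim", "bring up Abis O&M link"),
                                    ("load", "generate busy-hour traffic")]))
```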
Source Code Analysis as a Cost Effective Practice
to Improve the High Reliability of Cisco Products
For a huge code base exceeding a dozen million lines, practicing source code analysis in the early stages of the development cycle has been proven to save many millions of dollars. Early analysis allows defects to be discovered and addressed before they are ever exposed at customer sites, saving the company the tremendous effort and expense of correcting them in the field. Most importantly, it helps to improve customer satisfaction and increase MTBF figures. The paper covers the types of defects discovered by third-party and in-house source code analysis tools, along with the implementation and deployment challenges. It presents Cisco's integration architecture, which gains the most productive results from all of the tools while allowing a large number of developers to share the same centralized resources and effort. Cisco's scalable integration also overcomes the high noise ratio and other complex technical issues arising from the source code analysis tools and practices. These savings contributed to a 99.999% RAS (Reliability, Availability, Serviceability) Executive Team Award for improving the high reliability of Cisco products. The paper also challenges further academic research to fulfill the needs of the globalizing industry.
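The abstract does not describe the integration mechanics; as one hedged illustration of reducing the noise ratio, here is a sketch that merges and deduplicates findings from multiple analyzers:

```python
# A minimal sketch (illustrative only, not Cisco's architecture) of one way to cut
# the noise ratio when aggregating findings from several source code analysis
# tools: merge reports, deduplicate by (file, line, defect class), and suppress
# classes already triaged as noise. All names and data are invented.
from collections import namedtuple

Finding = namedtuple("Finding", "tool file line defect_class message")

SUPPRESSED_CLASSES = {"style", "unused-include"}   # classes triaged as noise earlier

def merge_findings(reports):
    """reports: iterable of per-tool finding lists; returns deduplicated findings."""
    seen, merged = set(), []
    for report in reports:
        for f in report:
            key = (f.file, f.line, f.defect_class)
            if f.defect_class in SUPPRESSED_CLASSES or key in seen:
                continue
            seen.add(key)
            merged.append(f)
    return merged

tool_a = [Finding("toolA", "ip/route.c", 120, "null-deref", "p may be NULL")]
tool_b = [Finding("toolB", "ip/route.c", 120, "null-deref", "possible NULL deref"),
          Finding("toolB", "ip/route.c", 44, "style", "brace placement")]

for f in merge_findings([tool_a, tool_b]):
    print(f.file, f.line, f.defect_class, "-", f.message)
```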
CMM Level of an organization does not rise higher than the level of its process improvement activity
The QA process becomes more important for products used in networks, but there is often not enough time for it because it is hard-pressed by other bottlenecked processes. Process, in the first place, exists to maintain the quality of products. We started a software process improvement (SPI) activity, and the lesson learned is that the CMM level of an organization does not rise higher than the maturity level of its process improvement activity. Using this model, it is therefore necessary to plan first for improving the level of the SPI activity itself.
Applying Reliability Prediction Models among Different Software Systems and Identifying Commonly Effective Metrics to Predict Fault-Prone Modules
Many studies have been done to predict fault-prone modules. The typical approach is to construct prediction models based on the complexity metrics and failure frequencies of each module, using analysis methods such as linear regression, discriminant analysis, logistic regression, classification trees, neural networks, and so on. Most of these results showed that such prediction models could predict fault-prone modules with high accuracy. However, these studies varied in analysis method, target software, and measured metrics. Furthermore, most of the results were validated on only a single software system, and it has not been practically investigated whether a prediction model based on one software system can also predict fault-prone modules accurately in other software systems. In this study, we evaluated the applicability of prediction models between two software systems. They differ in function, implementation language, size, and data collection time. For each module in each software system, we collected 15 complexity metrics and constructed prediction models using linear discriminant analysis. Then we predicted the fault-prone modules of one software system by applying the prediction model built from the data of the other software system to its metric values, and vice versa. The accuracy of the prediction results (the number of modules predicted correctly divided by the total number of modules) was very low (26 percent on average), because most modules were predicted as fault-prone. A prediction model typically tends to be specialized to the data collected from its software system, so the prediction performance may be less accurate when it is applied to another software system; we assume this is the reason for the low accuracy. Next, we focused on the metrics effective for predicting fault-prone modules in every prediction model, and we identified two metrics (lines of code and maximum nesting level) as commonly effective in all prediction models. Then we constructed prediction models using only these two metrics and evaluated the prediction performance. In this case, the prediction performance improved dramatically (the accuracy was 78 percent on average). We believe this result is very useful for predicting fault-prone modules in newly developed software by constructing prediction models based on those metrics. By conducting similar studies using more metrics, we will be able to identify still more metrics that are commonly effective for predicting fault-proneness.
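A hedged sketch of the shape of this cross-system experiment (synthetic data, scikit-learn's linear discriminant analysis, and only the two commonly effective metrics named above):

```python
# A minimal sketch (assumed data, not the paper's) of cross-system fault-prone
# module prediction with linear discriminant analysis, using only two metrics:
# lines of code (LOC) and maximum nesting level.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def make_system(n, loc_scale):
    """Hypothetical module data: [LOC, max nesting], label 1 = fault-prone."""
    loc = rng.integers(20, loc_scale, size=n)
    nest = rng.integers(1, 8, size=n)
    y = ((loc > loc_scale * 0.6) | (nest > 5)).astype(int)
    return np.column_stack([loc, nest]), y

X_a, y_a = make_system(200, 400)   # "system A" used to train the model
X_b, y_b = make_system(150, 800)   # "system B" from a different project

model = LinearDiscriminantAnalysis().fit(X_a, y_a)
accuracy = model.score(X_b, y_b)   # fraction of system-B modules predicted correctly
print(f"cross-system accuracy: {accuracy:.2f}")
```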
An Application of the Discrete Function Theory and
the Software Control Flow to Dependability Assessment of Embedded Digital
Systems
This article describes a combinatorial model for estimating the reliability of an embedded digital system by means of discrete function theory and software control flow. The model includes a coverage model for the fault processing mechanisms implemented in digital systems and also considers the interaction between hardware (H/W) and software (S/W). The fault processing mechanisms make it difficult for many types of components in a digital system to be treated as binary-state (good or bad). Discrete function theory provides a complete analysis of multi-state systems, as which digital systems can be regarded. Through the adaptation of software control flow to discrete function theory, the H/W-S/W interaction is also considered in estimating the reliability of the digital system. Using this model, we predict the reliability of one board controller in a digital system, the Interposing Logic System (ILS), which is installed in YGN nuclear power units 3 and 4. Since the proposed model is a general combinatorial model, simplifying it yields the conventional model that treats the system as binary-state.
Moreover, when the information on the coverage factor of fault tolerance
mechanisms implemented in systems is obtained through fault injection
experiments, this model can cover the detailed interaction of system
components.
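As a hedged, much-simplified illustration of how a fault-processing coverage factor enters such a reliability calculation (this is the classic duplex coverage formula, not the paper's full multi-state model):

```python
# A minimal illustrative sketch of how an imperfect fault-processing coverage
# factor enters a reliability calculation for a duplex (two-copy) component.
def duplex_reliability(r, c):
    """r: reliability of one copy; c: probability a copy's fault is covered
    (detected and handled) so the system keeps running on the other copy."""
    both_good = r * r
    one_failed_covered = 2.0 * r * (1.0 - r) * c
    return both_good + one_failed_covered

for c in (1.0, 0.95, 0.8):
    print(f"coverage {c:.2f}: system reliability = {duplex_reliability(0.99, c):.5f}")
```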
Measures to Increase the Dependability of
Information Systems in the IT Age
This paper deals with several important quality issues for large industrial systems that involve network systems and include "black box" components delivered by other companies. It details Hitachi's strategy for obtaining highly reliable and dependable systems. The goal of the SST (System Simulation Testing) approach is to stress the system in order to test its performance and its fault tolerance under load fluctuations and device failures. Hitachi has a dedicated in-house SST Center to implement SST and is developing various tools that generate heavy loads and various failures. The Information System Quality Assurance Department operates these facilities and accumulates and applies SST test technology know-how. The Systems Engineering (SE) Department, which is in charge of assisting customers, performs the individual SSTs. One example is SST for large-scale network systems. Thanks to the above measures, Hitachi's information systems have been highly regarded for their reliability and dependability for over a quarter of a century since the introduction of SST.