Program: November 28 Industry Day
Industry Program Committee:
- P. Santhanam (IBM Research, USA), Chair
- Bill Everett (SPRE, USA)
- George WY Leung (ITSD, Hong Kong)
- Ying-Hua Min (Academy of Sciences, China)
- Linda Rosenberg (NASA, USA)
- Noel Samaan (Motorola, USA)
- Min Xie (National University of Singapore)
Track 1

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Industrial Practice Talks (1R): Software Reliability Practice
Session Chair: Allen Nikora, Jet Propulsion Laboratory, USA
- Managing Reliability Development & Growth in a Widely Distributed, Safety-Critical System
  Samuel Keene, Consulting Engineer; Jon Peterson and Meng-Lai Yin, Raytheon System Company, USA
- Software Reliability - Theory vs. Practice in the Development Environment
  Patrick Carnes, Air Force Operational Test and Evaluation Center (AFOTEC), USA
- Applying Reliability Prediction Models among Different Software Systems and Identifying Commonly Effective Metrics to Predict Fault-Prone Modules
  Shin-Ichi Sata, NTT Data Corporation, Japan; Akito Monden and Ken-ichi Matsumoto, Nara Institute of Science and Technology, Japan
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Industrial Practice Talks (2R): Testing Techniques
Session Chair: Brendan Murphy, Microsoft Research, UK
- Software Testing Challenges in Telecom Domain
  Debashis Mitra and Pankaj Bhatt, Tata Consultancy Services, India
- Testing for Software Security in Hostile Environments
  Herbert Thompson and Dr. James A. Whittaker, Florida Institute of Technology, Melbourne Beach, FL, USA
- Optimization of Test Case Selection for Regression
  Noel Samaan, Motorola Inc., Schaumburg, IL, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Industrial Practice Talks (3R): Development Techniques
Session Chair: Karama Kanoun, LAAS-CNRS, France
- Development and Verification of Windows 2000 and Windows XP
  Brendan Murphy, Microsoft Research, UK
Track 2

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Industrial Practice Talks (1S): Analysis and Modeling
Session Chair: Jiannong Cao, Hong Kong Polytechnic University, Hong Kong
- Defect Density Prediction Model For Software Application Maintenance
  Ananya Ray and Mahua Seth, Cognizant Technology Solutions India Pvt. Ltd., Calcutta, India
- Source Code Analysis as a Cost Effective Practice to Improve the High Reliability of Cisco Products
  Christopher Pham, Jun Xu, and Jialin Zhou, Cisco Systems, Inc., USA
- A DSS Model for Tool Evaluation
  Rakesh Agarwal and Ashiss Kumar Dash, Infosys Technologies Limited, India
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Industrial Practice Talks (2S): Dependability & Reliability Measurement
Session Chair: Samuel Keene, Seagate Technology, USA
- An Application of the Discrete Function Theory and the Software Control Flow to Dependability Assessment of Embedded Digital Systems
  Jong Gyun Choi and Poong Hyun Seong, Korea Advanced Institute of Science and Technology, Korea
- Measures to Increase the Dependability of Information Systems in the IT Age
  Katsuyuki Yasuda, Kenji Tokunaga, Mariko Shimizu, and Shigeru Yamada, Hitachi Ltd., Japan
- A Practical Software Fault Measurement and Estimation Framework
  Allen Nikora, California Institute of Technology, Pasadena, USA; John Munson, University of Idaho, Moscow, ID, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Industrial Practice Talks (3S): Website & Software Reliability
Session Chair: William Everett, SPRE Inc, USA
- Boosting Website Reliability by Mining Patterns from User Transactions
  Biplav Srivastava, IBM India Research Laboratory, India
Track 3

08:30 - 09:00 | Registration
09:00 - 09:30 | Opening Remarks
09:30 - 10:15 | Keynote 1
10:15 - 10:30 | Short Break
10:30 - 12:00 | Fast Abstract Session 1: Systems
E-commerce:
- e-Business Reliability with Web Performance Management
  Robert B. Wen, AP Technology Corporation, USA
- Development of a Kernel Thread Web Accelerator
  Jonggyu Park, Hanna Lim, and HagBae Kim, Yonsei University, Korea
- Signature Verification System using Pen Pressure for Internet and E-Commerce Application
  Tham Heng Keit, R. Palaniappan, P. Raveendran, and Fumiaki Takeda, University of Malaya, Malaysia
Real Systems:
- Specification-based Detection of Telecom Service Degradations
  A. M. da Silva Filho and J. T. Saito, State University of Maringa, Brazil
- Oracle's Technologies for High Availability
  Sasidhar Pendyala, Oracle S/W India Ltd., India
- High Performance Computing Software Infrastructure For Cervical Cancer Detection and Monitoring
  Savitri Bevinakoppa, Royal Melbourne Institute of Technology, Australia; Kailash Narayan, Peter MacCallum Institute, Australia
- The House Keeping System of Automated Evaluation of Students' Programming Reports
  Hikofumi Suzuki, Nagano National College of Technology, Japan; Katsumi Wasaki, Shinshu University, Japan; Tatsuo Nakazawa, Nagano National College of Technology, Japan; Yasunari Shidama, Shinshu University, Japan
12:00 - 13:00 | Lunch
13:00 - 13:45 | Keynote 2
13:45 - 14:00 | Short Break
14:00 - 15:30 | Panel 1: Everything You Wanted to Know About Software Reliability Engineering But Didn't Know Who to Ask
- John D. Musa, Independent Consultant, Morristown, NJ, USA
- William W. Everett, SPRE, Inc., Albuquerque, NM, USA
- Karama Kanoun, CNRS, France
- Roy Ko, Hong Kong Productivity Council, Hong Kong
- Norman F. Schneidewind, Naval Postgraduate School, Monterey, CA, USA
- Mladen A. Vouk, North Carolina State University, NC, USA
15:30 - 16:00 | Break
16:00 - 17:30 | Panel 2: ODC for Process Management and Cost Control
- Ram Chillarege, Chillarege Corp., Peekskill, NY, USA
- Albert Liu, Motorola, China
- Michael Lyu, Chinese University of Hong Kong, Hong Kong
- Peter Santhanam, IBM Research, New York, USA
Details:
Boosting Website Reliability by Mining Patterns
from User Transactions
A web site can be considered a GUI application whose performance, reliability, and security are as crucial as its correctness. However, software testing in practice is a tradeoff between budget, time, and quality, and it stops whenever the allocated time or budget is exhausted. In this work, we propose a web site testing methodology in which user transactional information is mined from a site-development viewpoint to learn the test characteristics of a web site and of the software running behind it. Mining transactional information for useful patterns has received wide attention in data mining in recent years. However, attention until now has focused only on learning characteristics about the users whose activities created the transactions in the first place. These may be the visitors to the website and/or the suppliers of the products and services on the site. By using experimental and ongoing transactions to improve testing of a web site, we can provide increased reliability at no additional product development time, thereby maintaining aggressive time-to-market schedules. Our approach has been employed in a startup company, and the initial lessons suggest that monitoring ongoing transactions helps identify potential problems very early. Moreover, both the trends observed during data cleaning and the learned patterns are useful.
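As a rough illustration of the idea (the abstract does not specify the mining technique, so the data and method below are assumptions), the sketch counts frequent navigation transitions in transaction logs and uses them to prioritize which paths regression tests should exercise first:

```python
# A minimal sketch (hypothetical, not the paper's system): mining frequent page
# sequences from user-transaction logs and using them to prioritize which
# navigation paths regression tests should exercise first.
from collections import Counter

# Hypothetical transaction log: each entry is the ordered list of pages visited.
transactions = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "search", "product", "cart"],
    ["home", "product", "cart", "checkout"],
    ["home", "search", "search", "product"],
]

# Count consecutive page pairs (length-2 navigation patterns).
pair_counts = Counter()
for pages in transactions:
    pair_counts.update(zip(pages, pages[1:]))

# The most frequent transitions are the paths to cover first in testing.
for (src, dst), n in pair_counts.most_common(5):
    print(f"test priority: {src} -> {dst} (seen {n} times)")
```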
Testing for Software Security in Hostile
Environments
When building secure software systems, functionality and security are often in
contention as development goals. Increased functionality leads to decreased security; conversely, achieving true security would require the software to be completely free of contact with its environment. Such extreme measures would render the system unusable, yet mission-critical systems must be integrated into complex networks and be accessible to many. Such systems typically undergo rigorous testing, but with limited resources and the ubiquitous force of market pressure only a small subset of the effectively infinite set of test cases can be executed. Thus, a significant portion of the code is left
under-exercised and, in large development efforts, not exercised at all.
Arguably the most neglected code paths during the testing process are
error-handling routines. Tests that involve disk errors and network problems
are often only cursorily explored. These extreme conditions that are possible
in the real world are sometimes neglected during the testing phase due to the
difficulty in simulating a hostile environment. It is during these periods
that the software is at its most vulnerable and where carefully conceived
security measures break down. If such situations are ignored and the other test cases pass, what we are left with is a dangerous illusion of security. Servers do run out of disk space, network connectivity is sometimes intermittent, and file permissions can be improperly set. Such conditions cannot be ignored as
part of an overall testing strategy. What is needed then is to integrate such
failures into our test cases and to become aware of their impact on the
security and integrity of the product itself and the user's data. In this
presentation we show how security procedures break down in hostile
environments. We then go on to show measures we have taken to simulate such
failures by intercepting and controlling return values from system calls made
by the application. For example, we can selectively deny write access to the
hard drive or simulate a strained or intermittent network connection, all
through controlling responses from system calls. This provides a much needed,
easy to implement method that creates a turbulent environment to execute
selected tests. These methods can reveal potentially severe security bugs that
would otherwise escape testing and surface in the field.
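The interception approach lends itself to a simple illustration. The sketch below is not the authors' tool; it assumes a Python test harness and uses mock patching to force disk-full and network-timeout failures, analogous to controlling system-call return values:

```python
# A minimal sketch of simulating a hostile environment in Python tests by
# forcing failure results from low-level I/O calls. All names here are
# illustrative assumptions, not the authors' implementation.
import builtins
import errno
import socket
import unittest
from unittest import mock


def save_report(path, data):
    """Application code under test: writes data to disk."""
    with open(path, "w") as f:
        f.write(data)


class HostileEnvironmentTests(unittest.TestCase):
    def test_disk_full_propagates(self):
        # Make every open() fail as if the disk were full.
        disk_full = OSError(errno.ENOSPC, "No space left on device")
        with mock.patch.object(builtins, "open", side_effect=disk_full):
            with self.assertRaises(OSError):
                save_report("/tmp/report.txt", "payload")

    def test_intermittent_network(self):
        # Make socket sends time out, as on a strained connection.
        with mock.patch.object(socket.socket, "send",
                               side_effect=socket.timeout("send timed out")):
            s = socket.socket()
            with self.assertRaises(socket.timeout):
                s.send(b"ping")


if __name__ == "__main__":
    unittest.main()
```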
Defect Density Prediction Model For Software
Application Maintenance
With growth in demand for zero defects, predicting reliability of software
products is gaining importance. Software reliability models are used to
estimate the reliability or the number of latent defects in a software
product. Most of the reliability models available in the literature for estimating software reliability are based on the complete software development lifecycle. However, there are large software systems that are maintained by the developer organization itself or by a third-party vendor. Such systems have been in production for a considerable period of time, and the details of their development life cycle stages are usually not known to the organizations responsible for maintaining them. In such a scenario, the reliability models available for the complete development life cycle cannot be applied. Predicting the defects in the maintenance done for such systems is nevertheless of primary importance, both to provide confidence to the system owner and for resource planning. In this paper, a Non-Homogeneous Poisson Process (NHPP) model has been applied to develop a prediction model for the defect density of application system maintenance projects. A comparative study was done on the numerous models available; based on the assumptions and constraints of the models and the maintenance scenario, the NHPP model was chosen to predict defect density. Goel and Okumoto applied the Non-Homogeneous Poisson Process model to estimate software reliability using testing defects; the model has been adapted here to estimate software reliability using maintenance defects. The model was established using defect data from one maintenance project and validated with data from another maintenance project at Cognizant Technology Solutions, Calcutta. The parameters of the model were estimated using a SAS program employing a nonlinear regression approach based on the DUD algorithm (Ralston and Jennrich, Technometrics, 1978, Vol. 20). The Kolmogorov-Smirnov test was applied to test the goodness of fit. Predictions based on the estimated model were compared with the actuals observed for a few subsequent months for both projects, and the differences were within a statistically acceptable range. The model was accepted by both projects and has been used for resource planning and for assuring the customer of the quality of the maintenance work.
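The Goel-Okumoto mean value function underlying the model is m(t) = a(1 - e^(-bt)). As a hedged illustration (the authors used SAS with the DUD algorithm; the sketch below substitutes SciPy nonlinear least squares and invented monthly data):

```python
# A minimal sketch of fitting the Goel-Okumoto NHPP mean value function
# m(t) = a * (1 - exp(-b * t)) to monthly cumulative maintenance-defect counts.
# Data values are made up for illustration; this is not the paper's SAS/DUD code.
import numpy as np
from scipy.optimize import curve_fit

def goel_okumoto(t, a, b):
    """Expected cumulative number of defects by time t."""
    return a * (1.0 - np.exp(-b * t))

# Hypothetical data: months in production and cumulative defects observed.
months = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
cum_defects = np.array([12, 21, 28, 33, 37, 40, 42, 43], dtype=float)

# Nonlinear least-squares estimate of a (total expected defects) and b (rate).
(a_hat, b_hat), _ = curve_fit(goel_okumoto, months, cum_defects, p0=[50.0, 0.3])

# Predict cumulative defects for a few subsequent months and compare to actuals.
future = np.array([9, 10, 11, 12], dtype=float)
print("a =", round(a_hat, 1), "b =", round(b_hat, 3))
print("predicted:", np.round(goel_okumoto(future, a_hat, b_hat), 1))
```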
Managing Reliability Development & Growth in a
Widely Distributed, Safety-Critical System
Predicting software reliability is important and especially challenging when
the software is widely distributed for a safety-critical system. Typically,
safety-critical systems have built-in redundancies that result in high
reliability for the hardware, leaving the software reliability as the limiting
factor. In particular, the widely distributed system considered here uses
multiple copies of identical software, with inputs that differ to varying
degrees between different copies. How to predict the software reliability for
such a system is presented in this paper. The paper details the evolution of a software reliability prediction, from initial predictions made with the Keene model before development started, through Rayleigh model predictions used during software development, to CASRE predictions made using data collected after the final acceptance test. Comparisons are made at each prediction point to determine which models most accurately predict the actual performance.
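As one hedged illustration of the kind of model mentioned (the paper's exact parameterization is not given), here is a sketch of a cumulative Rayleigh defect-discovery profile with assumed parameters:

```python
# A minimal sketch (assumption, not the paper's exact models) of the Rayleigh
# defect-discovery profile often used during development: the cumulative number
# of defects found by time t is K * (1 - exp(-t^2 / (2 * c^2))).
import math

def rayleigh_cumulative(t, K, c):
    """Cumulative defects expected by time t (K = total, c = peak location)."""
    return K * (1.0 - math.exp(-(t * t) / (2.0 * c * c)))

# Hypothetical parameters: 200 total latent defects, discovery peaking near month 4.
K, c = 200.0, 4.0
for month in range(1, 13):
    found = rayleigh_cumulative(month, K, c)
    print(f"month {month:2d}: ~{found:5.1f} cumulative defects expected")
```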
A DSS Model for Tool Evaluation
There are many tools and technologies in a data warehousing project. It is
a challenging task to select or recommend a single tool over others to provide decision support for an organization. This paper proposes a model-based approach for evaluating and choosing a DSS tool. The model prescribes a three-step approach for selecting the most suitable DSS tool. First, the key parameters need to be identified for usability. For a DSS tool, the key evaluation parameters can be broadly categorized into the following four areas: User Interface, System Requirements, Response Factors, and Development Environment. Next, weights are assigned to each of the parameters identified. Finally, the key areas of evaluation for each parameter are identified; this should be done by meeting the users and the IS personnel for the DSS solution. The model provides a step-by-step process for requirement gathering, area identification, quantifying user/information-system preferences, and prioritizing the parameters to be taken into consideration when evaluating tools. With minor modifications, the approach can be extended to software tool evaluation of any kind. Limitations of the proposed model are: (a) the requirement analysis and evaluation scales have to be the same; (b) it may be difficult to deploy the model if there are too many parameters to consider and they cannot be grouped into a smaller number of categories; (c) if there are business considerations beyond what can be succinctly quantified in the evaluation, the model may not work; and (d) if two tools score very close to each other, a blind numerical comparison may not be the best approach.
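A hedged sketch of the weighted evaluation the three steps imply (the areas are taken from the abstract; all weights and scores are invented for illustration):

```python
# A minimal sketch (illustrative, not the paper's model): a weighted-sum
# evaluation of candidate DSS tools over the four parameter areas named above.
AREAS = ["User Interface", "System Requirements", "Response Factors",
         "Development Environment"]

# Step 2: weights assigned per area (here chosen to sum to 1.0).
weights = {"User Interface": 0.30, "System Requirements": 0.20,
           "Response Factors": 0.30, "Development Environment": 0.20}

# Step 3: scores (1-10) gathered from users and IS personnel for each tool.
scores = {
    "Tool A": {"User Interface": 8, "System Requirements": 6,
               "Response Factors": 7, "Development Environment": 5},
    "Tool B": {"User Interface": 6, "System Requirements": 9,
               "Response Factors": 8, "Development Environment": 7},
}

def weighted_score(tool_scores):
    """Combine per-area scores into one comparable number."""
    return sum(weights[a] * tool_scores[a] for a in AREAS)

for tool, s in scores.items():
    print(f"{tool}: {weighted_score(s):.2f}")
```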
Optimization of Test Case Selection for Regression
When one considers software systems for a technology that is evolving fast, with time to market at very low cost an almost mandatory requirement, the selection of test cases for regression from a system (black-box) test viewpoint is not easy in a real-world environment. Even with the best processes in place and the best intentions of every contributing organization, the academic view of total control over what and where change in any part of the software has taken (or will take) place is a myth. The situation is even more complex because black-box teams do not have source code with which to analyze the change with acceptable uncertainties. Regression testing typically constitutes a significant portion of certifying software for large complex systems. For new products that reuse a large number of components from previous releases, even though those components were thoroughly tested in previous releases, feature interaction and the impact of new features on existing features remain a major concern of test release managers when new features are added. Consequently, it is not unusual to have half or more of the test activity effort devoted to repeating test cases, either at the start of the test cycle or during bug-fixes. Researchers have addressed the issue of minimizing the number of test cases selected for regression based on analyzing source code, the number and nature of changes in requirements, lines of code and, in some cases, formal methods of representing the change in software. In most cases, the typical applications are either small software systems or systems that have not reached high maturity, and the change in the software specification is well contained (typically at a relatively high cost). The success of any science and engineering discipline is attributed to measurement. Measurands quoted without an associated measurement uncertainty are used with a lower confidence level, leading in some cases to the inability to reproduce the scenario that led to a particular measurement. Software engineering is a good example, since it lags behind other engineering disciplines: there are no primary or reference standards agreed upon by national bodies to certify software systems in a manner similar to that used when physical systems are certified based on whether or not a system meets a prescribed standard or set of standards. To this end, testing large complex software systems remains by and large a black art. From a system (black-box) standpoint, the assemblage of software components is not well defined; the interaction between such components is not clarified and is typically exercised with inadequate analysis of the most critical paths. There is more emphasis on the static, rather than dynamic, behavior of software components. The latter is the hardest, as there are very few (or no) tools available to indicate how well the software is performing compared to what it is supposed to be doing. In this paper, the author proposes a methodology based on the behavioral correlation of three elements to extract knowledge leading to decisions as to which set of test cases is the most critical at any point in the test cycle. The three elements are: (a) the user profile characterized by customer-reported defects, (b) the analysis of past test results (by applying reliability engineering analysis models) together with monitoring test case behavior over past test releases (or cycles), and (c) the simulation of the dynamic behavior of executable software objects driven by past test scenarios. The correlation of knowledge extracted from the three elements is implemented using "true" and fuzzy logic and an artificial neural network. Initial results indicate a reduction in cycle time of up to 30% in the effort to characterize customer profiles and to plan regression testing for forthcoming releases. The methodology has already been recognized as providing traceability to a parametric representation of past test activities, with measurements based on "behavioral" analysis rather than only on "counting" existing defects and predicting remaining ones.
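As a hedged illustration (the author's implementation uses fuzzy logic and a neural network; the sketch below substitutes a simple weighted score over the same three elements, with invented data):

```python
# A minimal sketch (invented data and weights, not the author's fuzzy/neural
# implementation) of correlating the three elements named above into a single
# criticality score used to rank regression test cases.
def criticality(customer_defect_hits, past_failure_rate, dynamic_coverage,
                weights=(0.5, 0.3, 0.2)):
    """customer_defect_hits: how often the test's area appears in customer-reported
    defects (normalized 0-1); past_failure_rate: fraction of past cycles in which
    the test failed; dynamic_coverage: fraction of critical dynamic paths exercised."""
    w1, w2, w3 = weights
    return w1 * customer_defect_hits + w2 * past_failure_rate + w3 * dynamic_coverage

test_cases = {
    "TC-101": (0.9, 0.4, 0.6),   # hypothetical signal values per test case
    "TC-205": (0.2, 0.1, 0.9),
    "TC-330": (0.7, 0.6, 0.3),
}

ranked = sorted(test_cases, key=lambda t: criticality(*test_cases[t]), reverse=True)
print("regression order:", ranked)
```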
Software Reliability - Theory vs. Practice in the Operational Environment
AFOTEC has been an advocate of employing software reliability as one of the
methodologies to determine the operational effectiveness and suitability of
software intensive systems. AFOTEC has employed software reliability and
software maturity as a measure of the readiness for operational test. Software
reliability growth models, regression analysis, relative complexity analysis,
statistical process control and Markov modeling are the analytical techniques
currently employed to assess software reliability during development testing.
Ideally, software reliability can then be used to predict reliability during
operational testing. Unfortunately, there are several reasons for not
achieving this ideal state. The operational profiles usually differ during
these two test periods. Also, unlike hardware, the failure rate of software may not be stable at the conclusion of development testing. This is especially true for complex systems often deployed while still in an evolving state of development. Additionally, operating at or near memory capacity, inadequate operator training, hardware/software interface incompatibilities, and environmental factors all lead to performance instabilities. Hence, the approach currently used to determine software reliability during operational testing is to record and validate all software failures and use the cumulative
operational test time to determine the average failure rate. Current practice
is to add the hardware and software failure rates to obtain system
reliability. Continuing research is needed on how best to determine system reliability. AFOTEC has initiated research in several areas on how to
better characterize software failures during operational testing. One of these
efforts involves problem identification and analysis in which the module(s)
and system functions are analyzed to determine the degree of mission impact.
Test resources should be focused on elements critical to mission success.
Results to date indicate that in order to quantify the impact, the software
must be instrumented to derive a usability factor which will lead to a more
accurate value of software reliability for a given mission profile. The
drawback to this is that the instrumentation code also becomes a potential
source for software failures. Knowledge of the test data distribution will
provide insight as to how to best combine hardware and software failure data
to derive a system failure rate. Improved data collection practices will
identify the relevant variables affecting system reliability. Principal component analysis can then be used to establish an orthogonal classification of operational impacts. Multivariate analysis would determine which of the impacts/variables have the biggest effect on system reliability. Bayesian techniques could be used to combine operational metrics with previous development test data to provide improved insight into system reliability.
AFOTEC is currently in discussion with other government and academic
institutions to initiate joint research efforts.
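The current practice described above reduces to a simple calculation; the sketch below uses hypothetical numbers:

```python
# A minimal sketch (hypothetical numbers) of the practice described above:
# the software failure rate during operational test is the validated failure
# count divided by cumulative operational test time, and the system failure
# rate is currently taken as the sum of the hardware and software rates.
validated_sw_failures = 14          # validated software failures observed
cumulative_test_hours = 3_500.0     # cumulative operational test time (hours)
hw_failure_rate = 0.0012            # hardware failures per hour (from HW analysis)

sw_failure_rate = validated_sw_failures / cumulative_test_hours
system_failure_rate = hw_failure_rate + sw_failure_rate
mtbf_hours = 1.0 / system_failure_rate

print(f"software failure rate: {sw_failure_rate:.5f} per hour")
print(f"system failure rate:   {system_failure_rate:.5f} per hour "
      f"(MTBF ~ {mtbf_hours:.0f} hours)")
```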
Software Testing Challenges in Telecom Domain
This paper describes the challenges faced in developing solutions to industrial problems in the networking and telecommunications domain. These problems are technically challenging, especially since the tester needs to take into consideration the temporal behavior of the solution (e.g., a protocol). The solutions may also include the integration of third-party products. In this paper we take two instances of such problems. The first discusses the design and development of a simulation tool to facilitate the testing of a software solution for a network management problem involving mobile base stations. The second focuses on the design and development of a test framework to automate regression testing and thereby save testing time and effort. Telecom equipment manufacturers currently face the problem of having to adhere to aggressive product release schedules and shorten the "time to market", euphemistically called "Internet time"; often multiple releases of the same product are scheduled in the same calendar year. In this scenario, the Testing and Quality Assurance functions of the organization are severely challenged. They have to complete the testing on an ever-decreasing budget. Additionally, the developers often send code with new fixes for testing very close to the planned finish date. This stretches the testing ability of the team, which then very often resorts to experience-driven testing without completing all (manual) regression tests on the new load. A typical testing facility will have testing tools from various third-party vendors to test base product functionality, performance, and load behavior. These tools seldom talk to each other, yet each is required to test a specific sub-function within an end-to-end function within a test case. In an unautomated environment, each tool has to be used manually, and the initial setup for testing also has to be performed manually. To address these costly issues, a test automation architecture was used in which the tools are integrated into a common framework. This architecture also helps in organizing, managing, and executing the test cases remotely from a centralized location. Specific implementations of such architectures have demonstrated a significant saving of testing effort. The GSM Abis interface is located between the BSC (base station controller) and the BTS (base transceiver station). It includes layer 3 procedures related to Radio Link Layer Management (RLM), Dedicated Channel Management (DCM), Common Channel Management (CCM), and Transceiver Management (TRX), as well as Operation and Maintenance (O&M). While there are many tools available to monitor the protocol message flow between the BSC and the BTS, and some of these also provide BTS simulation capabilities, they are very expensive and generic in nature. In such situations, the use of a tool that can simulate the O&M plane of a particular flavor of the Abis implementation is appropriate. This tool allows the base station management software under test to communicate with it as if it were communicating with a real BTS.
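As a hedged illustration of the common-framework idea (the interfaces and tool names below are invented, not TCS's implementation):

```python
# A minimal sketch of integrating heterogeneous third-party test tools behind a
# common adapter interface so test cases can be organized and executed from one
# place. All class and step names are hypothetical.
from abc import ABC, abstractmethod

class TestTool(ABC):
    """Common interface for tools that otherwise do not talk to each other."""
    @abstractmethod
    def setup(self) -> None: ...
    @abstractmethod
    def run(self, step: str) -> bool: ...

class LoadToolAdapter(TestTool):
    def setup(self) -> None:
        print("configuring load generator")
    def run(self, step: str) -> bool:
        print(f"load tool executing: {step}")
        return True

class ProtocolSimAdapter(TestTool):
    def setup(self) -> None:
        print("starting BTS/O&M-plane simulator")
    def run(self, step: str) -> bool:
        print(f"simulator executing: {step}")
        return True

def execute_test_case(steps):
    """Each step names the tool that covers its sub-function in the end-to-end test."""
    tools = {"load": LoadToolAdapter(), "sim": ProtocolSimAdapter()}
    for tool in tools.values():
        tool.setup()
    return all(tools[name].run(action) for name, action in steps)

print("passed:", execute_test_case([("sim", "bring up Abis O&M link"),
                                    ("load", "generate busy-hour traffic")]))
```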
Source Code Analysis as a Cost Effective Practice
to Improve the High Reliability of Cisco Products
For a huge code base exceeding a dozen million lines, practicing source code analysis in the early stages of the development cycle has been proven to save many millions of dollars. Early analysis allows defects to be discovered and addressed before they are ever exposed at customer sites, saving the company the tremendous effort and expense of correcting them in the field. Most importantly, it helps to improve customer satisfaction and increase MTBF figures. The paper covers the types of defects discovered by third-party and in-house source code analysis tools, along with the implementation and deployment challenges. It presents Cisco's integration architecture, which gains the most productive results from all of the tools while allowing a large number of developers to share the same centralized resources and effort. Cisco's scalable integration also overcomes the high noise ratio and other complex technical issues arising from the source code analysis tools and practices. These savings contributed to a 99.999% RAS (Reliability, Availability, Serviceability) Executive Team Award for improving the high reliability of Cisco products. The paper also challenges further academic research to fulfill the needs of the globalizing industry.
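The abstract does not describe the integration mechanics; as one hedged illustration of reducing the noise ratio, here is a sketch that merges and deduplicates findings from multiple analyzers:

```python
# A minimal sketch (illustrative only, not Cisco's architecture) of one way to cut
# the noise ratio when aggregating findings from several source code analysis
# tools: merge reports, deduplicate by (file, line, defect class), and suppress
# classes already triaged as noise. All names and data are invented.
from collections import namedtuple

Finding = namedtuple("Finding", "tool file line defect_class message")

SUPPRESSED_CLASSES = {"style", "unused-include"}   # classes triaged as noise earlier

def merge_findings(reports):
    """reports: iterable of per-tool finding lists; returns deduplicated findings."""
    seen, merged = set(), []
    for report in reports:
        for f in report:
            key = (f.file, f.line, f.defect_class)
            if f.defect_class in SUPPRESSED_CLASSES or key in seen:
                continue
            seen.add(key)
            merged.append(f)
    return merged

tool_a = [Finding("toolA", "ip/route.c", 120, "null-deref", "p may be NULL")]
tool_b = [Finding("toolB", "ip/route.c", 120, "null-deref", "possible NULL deref"),
          Finding("toolB", "ip/route.c", 44, "style", "brace placement")]

for f in merge_findings([tool_a, tool_b]):
    print(f.file, f.line, f.defect_class, "-", f.message)
```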
CMM Level of an organization does not rise higher than the level of its process improvement activity
The QA process becomes more important for products used in networks, but there is often not enough time for it because it is hard-pressed by other bottlenecked processes. Process, in the first place, exists to maintain the quality of products. We started a software process improvement (SPI) activity, and the lesson learned is that the CMM level of an organization does not rise higher than the maturity level of its process improvement activity. Using this model, it is therefore necessary to plan first for improving the level of the SPI activity itself.
Applying Reliability Prediction Models among Different Software Systems and Identifying Commonly Effective Metrics to Predict Fault-Prone Modules
Many studies have been done to predict fault-prone modules. The typical approach is to construct prediction models based on the complexity metrics and failure frequencies of each module, using analysis methods such as linear regression, discriminant analysis, logistic regression, classification trees, neural networks, and so on. Most of these results showed that such prediction models could predict fault-prone modules with high accuracy. However, these studies varied in analysis method, target software, and measured metrics. Furthermore, most of the results were validated on only a single software system, and it has not been practically investigated whether a prediction model based on one software system can also predict fault-prone modules accurately in other software systems. In this study, we evaluated the applicability of prediction models between two software systems. They differ in function, implementation language, size, and data collection time. For each module in each software system, we collected 15 complexity metrics and constructed prediction models using linear discriminant analysis. Then we predicted the fault-prone modules of one software system by applying the prediction model built from the data of the other software system to its metric values, and vice versa. The accuracy of the prediction results (the number of modules predicted correctly divided by the total number of modules) was very low (26 percent on average), because most modules were predicted as fault-prone. A prediction model typically tends to be specialized to the data collected from its software system, so the prediction performance may be less accurate when it is applied to another software system; we assume this is the reason for the low accuracy. Next, we focused on the metrics effective for predicting fault-prone modules in every prediction model, and we identified two metrics (lines of code and maximum nesting level) as commonly effective in all prediction models. Then we constructed prediction models using only these two metrics and evaluated the prediction performance. In this case, the prediction performance improved dramatically (the accuracy was 78 percent on average). We believe this result is very useful for predicting fault-prone modules in newly developed software by constructing prediction models based on those metrics. By conducting similar studies using more metrics, we will be able to identify still more metrics that are commonly effective for predicting fault-proneness.
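A hedged sketch of the shape of this cross-system experiment (synthetic data, scikit-learn's linear discriminant analysis, and only the two commonly effective metrics named above):

```python
# A minimal sketch (assumed data, not the paper's) of cross-system fault-prone
# module prediction with linear discriminant analysis, using only two metrics:
# lines of code (LOC) and maximum nesting level.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def make_system(n, loc_scale):
    """Hypothetical module data: [LOC, max nesting], label 1 = fault-prone."""
    loc = rng.integers(20, loc_scale, size=n)
    nest = rng.integers(1, 8, size=n)
    y = ((loc > loc_scale * 0.6) | (nest > 5)).astype(int)
    return np.column_stack([loc, nest]), y

X_a, y_a = make_system(200, 400)   # "system A" used to train the model
X_b, y_b = make_system(150, 800)   # "system B" from a different project

model = LinearDiscriminantAnalysis().fit(X_a, y_a)
accuracy = model.score(X_b, y_b)   # fraction of system-B modules predicted correctly
print(f"cross-system accuracy: {accuracy:.2f}")
```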
An Application of the Discrete Function Theory and
the Software Control Flow to Dependability Assessment of Embedded Digital
Systems
This article describes a combinatorial model for estimating the reliability of an embedded digital system by means of discrete function theory and software control flow. The model includes a coverage model for the fault processing mechanisms implemented in digital systems and also considers the interaction between hardware (H/W) and software (S/W). The fault processing mechanisms make it difficult for many types of components in a digital system to be treated as binary-state (good or bad). Discrete function theory provides a complete analysis of multi-state systems, as which digital systems can be regarded. Through the adaptation of software control flow to discrete function theory, the H/W-S/W interaction is also considered in estimating the reliability of the digital system. Using this model, we predict the reliability of one board controller in a digital system, the Interposing Logic System (ILS), which is installed in YGN nuclear power units 3 and 4. Since the proposed model is a general combinatorial model, simplifying it yields the conventional model that treats the system as binary-state.
Moreover, when the information on the coverage factor of fault tolerance
mechanisms implemented in systems is obtained through fault injection
experiments, this model can cover the detailed interaction of system
components.
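As a hedged, much-simplified illustration of how a fault-processing coverage factor enters such a reliability calculation (this is the classic duplex coverage formula, not the paper's full multi-state model):

```python
# A minimal illustrative sketch of how an imperfect fault-processing coverage
# factor enters a reliability calculation for a duplex (two-copy) component.
def duplex_reliability(r, c):
    """r: reliability of one copy; c: probability a copy's fault is covered
    (detected and handled) so the system keeps running on the other copy."""
    both_good = r * r
    one_failed_covered = 2.0 * r * (1.0 - r) * c
    return both_good + one_failed_covered

for c in (1.0, 0.95, 0.8):
    print(f"coverage {c:.2f}: system reliability = {duplex_reliability(0.99, c):.5f}")
```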
Measures to Increase the Dependability of
Information Systems in the IT Age
This paper deals with several important quality issues for large industrial systems that involve network systems and include "black box" components delivered by other companies. It details Hitachi's strategy for obtaining highly reliable and dependable systems. The goal of the SST (System Simulation Testing) approach is to stress the system in order to test its performance and its fault tolerance under load fluctuations and device failures. Hitachi has a dedicated in-house SST Center to implement SST and is developing various tools that generate heavy loads and various failures. The Information System Quality Assurance Department operates these facilities and accumulates and applies SST test technology know-how. The Systems Engineering (SE) Department, which is in charge of assisting customers, performs the individual SSTs. One example is SST for large-scale network systems. Thanks to the above measures, Hitachi's information systems have been highly regarded for their reliability and dependability for over a quarter of a century since the introduction of SST.