U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000


Skip to content
Facebook iconYouTube iconTwitter iconFlickr iconLinkedInInstagram

Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

Report
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-04-080
Date: September 2004

Software Reliability: A Federal Highway Administration Preliminary Handbook

PDF Version (697 KB)

PDF files can be viewed with the Acrobat® Reader®

Chapter 6: Wrapping

This chapter:

  • Explains how verification code within a program can make the program self-testing.
  • Documents how to use executable specifications to implement wrapping for highway software.
  • Discusses a sample highway application.
  • Outlines what is gained from wrapping.
  • Lists the limitations of wrapping.

The following simple examples provide some context for the framework.

Introduction

A wrapping for a piece of software is additional information that the software can use to answer questions about itself. The concept of wrapping was originally developed in National Aeronautics and Space Administration- (NASA) sponsored work on large systems.(6) An intended application of wrapping was to allow software to describe its function to make it easier to call the right function in a large software project.

Wrapping also appears in Java in the form of the "instanceof" relation and the reflect package. These facilities allow objects to report on their contents and class membership.

Applied to software correctness, wrapping is a technique that uses a modified version of a piece of software to compute its own correctness. To wrap a program P used in system S with a requirement R on P:

  • R is translated (using the techniques for translating formulas) into an equivalent representation R' in the programming language of P.
  • R' is inserted at the end of P, right before P returns its value(s). This insertion is done in such a way that the calling context of P can examine the result of R. The modified version of P will be called P'.
  • S is modified to:
    • Test the value of R'.
    • Discard the values from P' if R' is FALSE.

Wrapping has several different V&V applications:

  • Wrapping a software module with a logical formula specification.
  • Wrapping a software module with necessary but not sufficient conditions for correctness that the output must satisfy. This is useful when sufficient conditions for correctness are not available, e.g., with software that predicts values that have not yet been observed.
  • Wrapping inputs to all programs to ensure that the inputs are in the domain for which the program is intended.

Several examples follow.(3)

The Matrix Inverse

The matrix inverse is a troublesome computation for matrices that are close to being singular. Except by using the theory of the application domain, there is no completely reliable way, to distinguish between:

  • A theoretically nonsingular matrix that is nearly singular.
  • A theoretically singular matrix that appears computationally to be nonsingular because of accumulated roundoff errors in the row vectors.

Particular inverse functions work well for some matrices but break down when the input matrix becomes too ill-conditioned for the function. As an example, researchers have observed that several standard math packages computed an inverse for the following singular matrix:

1 2 3 4 5 6 7 8 9

Problems with the matrix inverse can be due either to numerical roundoff errors or to an underlying problem in the computer code. One of the math packages that has a problem with this matrix is Numerical Recipes in C.(7) The code segment that seems to cause the problem is:

Click for full description

This is the only point in the matrix inverse function that tries to detect singular matrices. It apparently fails because it is possible for both >0 and == 0 to fail for very small positive numbers (something that cannot happen in abstract mathematics of the real numbers). For reasons like this, it is considered good practice in writing numerical software to avoid strict equality tests with 0 and instead to write a test like:

if (big< EPSILON)

Where EPSILON is some number chosen to catch very small numbers that are more likely due to roundoff errors than legitimate results of numerical computations.

However, no choice of EPSILON truly can separate actual and false inverses all the time. So instead:

  • EPSILON is passed in as a parameter.
  • The definition of the inverse, i.e., A*inv(A) = inv(A)*A = I is used to test that the computed inverse is actually an inverse to within EPSILON.
Click for full description

Where:

  • Matrices are represented by pointers to the first (0 index) double (real number).
  • Inverse, multiply, identity, and distance have the purposes suggested by their names.
  • Outputs are represented by pointers to arrays that hold the computed information (pointers to pointers to doubles in C).

Note that wrapped_inverse:

  • Sets a pointer (out) to point to the possible inverse.
  • Returns a truth value (as an integer) that indicates whether the identity computed by trying the possible inverse is close to the true inverse.

As a result, the caller of wrapped_inverse always knows whether the computed inverse really works as an inverse.

Wrapping the Integer Factorial

As another example of wrapping, wrap the factorial in C. This function is shown below:

Click for full description

The problem is that:

  • The range of C integers is only 32KBytes
  • C does not warn of integer overflows.

A run of this program illustrates the problem:

Enter an integer: 8 asking for fact(7) asking for fact(6) ... j=6, fact(j) = 720 j=7, fact(j) = 5040 j=8, fact(j) = -25216

To solve this program, create a wrapped version of the factorial:

Click for full description

This function has the same arguments and returns the same values as fact. However, it uses a floating point version of the factorial, fact, as a comparison to check for integer overflow. It returns the Boolean result of whether overflow occurred to the calling context of wrapped_fact using the call-by reference to flag, written here as C. The function ffact can be encoded by syntactic transformations on fact, e.g., changing int parameters to float. When the wrapped factorial is run, the error is detected:

Inside wrapped_fact, arg = 8, *flag = 0

The double computation of fact, once integer and once real, can be improved in the following version of wrapped_fact. This version detects a discrepancy between integer and float computations and aborts at the earliest possible step.



Click for full description
Click for full description

Although the last definition of wrapped_fact is more efficient, it is more perilous. The type cast to float is needed. Otherwise,

j * temp_fn_result

is integer multiplication in C, and the overflow occurs in both the value of fn_result and float_fn_result. The more efficient wrapping requires a more detailed knowledge of C arithmetic.

The final code segment illustrates how passed-back correctness information can be used to control the calling context of a wrapped function. The following example driver tries an increasing set of factorials determined by the user. However, it quits as soon as an error as reported by the Boolean error flag occurs in any factorial.

Click for full description

In turn, this function returns a success value that can control its calling context:

Click for full description

The complete output of the above source code is listed in appendix A.

Limitations of Wrapping

Wrapping, at least as shown in the examples, where wrapping code is placed directly in the source code, has the following limitations:

  • The source code is necessary to wrap.
  • An error in the wrapping code may lead to a false positive, in which an error is flagged but none exists in the wrapped computation.
  • An underlying problem, such as an error in a procedure called both by the main computation and its wrapping, can cause a false negative: An error exists but is not caught by the wrapping.
  • Large, complex specifications, which are typical for engineering applications, are difficult to code as wrappings.

Wrapping with Executable Specifications

To overcome the wrapping limitations caused by writing wrapping code directly in the source, write the specifications in a more declarative but computer-executable form, as is done in expert systems. This:

  • Reduces coding errors in the wrapping code.
  • Provides a practical way to use large complex engineering specifications in a wrapping.

Then call the wrapping code external to the wrapped program, and apply the wrapping to files of inputs and outputs of the target program. This allows wrapping to be used even if source code is not available.

The particular form for the specification knowledge base that will be used is a set of decision trees. This knowledge representation and the software for using it are discussed below.

Knowledge Representation of Specifications

Decision trees are useful in formalizing specifications, because:

  • Decision trees break a problem into successively smaller subproblems. This allows the domain expert to use a divide-and-conquer strategy in writing and checking specifications.
  • Decision trees mirror many informal specifications, e.g., large parts of the specifications for the Interactive Highway Safety Design Model (IHSDM) policy module.
  • Tools can be written to automatically check the consistency and completeness of the decision tree.
  • A decision tree interpreter can apply a decision tree of specifications as a wrapping to enforce the specifications at runtime.

When decision trees are used to check the results of a computer program, the decision tree returns a Boolean value: true if the decision tree is satisfied, and false if not. The inputs to and outputs from the program are used as values of variables referred to in the decision tree. The rules for computing the Boolean value of the decision tree are listed in the section below.

Decision Tree Nodes for Representing Specifications

  • Or: Is satisfied if at least one of its subnodes is satisfied. The subnodes of an "or" node can be "or,", "and,", "ifthenelse," "test,", or "action" nodes.
  • And: Is satisfied if all its subnodes are satisfied. The subnodes of an "and" node can be "or,", "and,", "ifthenelse,", "test,", or "action nodes".
  • IfThenElse: Is satisfied if one of its subnodes is satisfied. The subnodes of an "ifthenelse" node must be an "ifthen" or an "else" node. The "else" node must appear last, and the nodes must be tried in order.
  • IfThen: Is satisfied if both its "if" and "then" subnodes are satisfied; otherwise the node fails. The "ifthen" node has only one "if" subnode and only one "then" subnode.
  • Else: Has a single subnode of any type, and is satisfied when its subnode is satisfied.
  • If: Is logically like an "and" node. It is satisfied if all its subnodes are satisfied. The "if" node is not necessary logically, but is convenient in formalizing informal specifications. The subnodes of an "if" node can be "or", "and", "ifthenelse", or "test" nodes.
  • Then: Is logically like an "and" node. It is satisfied if all its subnodes are satisfied. The "then" is not necessary logically, but is convenient in formalizing informal specifications. The subnodes of a "then" node can be "or", "and", "ifthenelse", "test", or "action" nodes.
  • Test: Is satisfied if the Boolean expression enclosed within text brackets is true. "Test" nodes are C-like Boolean expressions containing constants and values of decision tree variables. "Test" nodes have no subnodes.
  • Action: Is satisfied if the action presented in the node is executed successfully. Actions include C-like assignment statements that set decision tree variables or write output to a file. Action nodes have no decision tree subnodes.

An example decision tree for part of the IHSDM design speed module appears later in this chapter.

Deriving Useful Specifications

Good specifications are important in creating reliable software; but specifications are notoriously incomplete and vague. SpecChek helps create more complete, precise specifications because it lets users incrementally develop and test specifications. Use SpecChek with the following plan to create precise, complete, and enforceable specifications for software projects:

  1. Collect items to include in the specifications: The specifications may be organized into a single document (e.g., the Highway Capacity Manual), but if not, collect individual specification items from experts, existing projects, and specification documents.
  2. Conduct a safety analysis of the project using the software to determine how software errors might compromise safety (see Safeware, System Safety and Computers for details).(2) Then add specific items to the software specifications to prevent these errors.
  3. Organize the specifications into a version 0 decision tree. Do not worry at this point about converting the tree into SpecChek's XML notation.
  4. Encode the version 0 tree into XML. See syntax.htm linked from the SpecChek homepage for information about how to do this.(8)
  5. Gather input and output parameter data from successful previous projects, as well as good and bad designs.
  6. Run the data from these real and imaginary projects through SpecChek and examine the results.
  7. When SpecChek approves a bad design or fails a good one, use the SpecChek report to find possible errors in your evolving decision tree of specifications. Fix any errors, and if the decision tree is changed, repeat steps 5 and 6.

Translating Informal Specifications into Decision Trees

Translate informal specifications by applying the following set of transformational rules to a partially translated specification, until the informal specification is completely translated into a decision tree. To represent the partially finished translation, the above decision tree specification will be modified to allow each node to have a special text subnode.

The contents of the text node will be the part of the informal specification that is relevant to a decision tree node, and have not yet been translated into explicit decision tree nodes. A node with a text subnode will be considered to be satisfied if the satisfaction properties stated above are satisfied and the contents of the text node are satisfied.

With these conventions, the starting point for translating an informal specification is the decision tree:

<and> <text> [put the entire informal specification here] </text> <and>

Partial Translation Transformations

The following transformations of decision tree nodes with text carry out the translation of informal specifications into a decision tree without text nodes step by step. The set of rules below is a first attempt, and additional rules are expected to be found from translating informal specifications.

Different Topics

A decision tree from an informal specification addressing different topics is the and of the decision trees of those topics. Given a node of the form,

Click for full description

transform to

Click for full description

Sequential Branching

Given a node of the form,

transform to

And Translation

If a text presents a set of requirements R1,..,RN that must all be satisfied, transform

Or Translation

If a text presents a set of requirements R1,..,RN that is satisfied if any are satisfied, transform

<and><text>text(R1...RN) </text> </and>

into

Atomic Formula Translation

If text can be translated into an atomic formula A, transform

<then><text>text </text> </then>

into

<then><action>A </action> </then>

A similar transformation holds for tests.

Empty Text Node Deletion

Transform

<node> subnode1,...,subnodeN <text> </text> </node>

into

<node> subnode1,...,subnodeN </node>

Logical Equivalence Simplification

A tree can be transformed if its logical value is never changed. An example is double and elimination.

Transform

<and><and> [whatever] </and>into <and> [whatever] </and>

SpecChekTM

SpecChek is a computer program that interprets decision trees of the type described above, developed as part of this handbook project. When a specification is written as a decision tree, SpecChek can be used to determine whether a program meets its specifications.

An executable version of SpecChek and complete user documentation is available on the SpecChek homepage.(8)

How SpecChekTM Is Used

The following diagram shows how SpecChek is used in checking software against its specifications.

Legend: DT = "decision tree"
Figure 3: Model of SpecChek Method

Preparations for Checking Software

Before running SpecChek:

  • Develop specifications for the planned software.
  • Translate the specifications into a SpecChek decision tree.

Checking Software

To setup a SpecChek run:

  • Capture both the inputs and outputs of the program being tested, and send these to SpecChek.
  • Run the inputs and outputs through the SpecChek decision tree.
  • Analyze SpecChek results (see drawing below).
Click to view alternate text

Legend: DT = "decision tree"
Figure 4: Checking Software with SpecChek

Interpreting SpecChek Results

Case 1: Interpreting Specification Satisfaction

If SpecChek reports that the specifications are satisfied, there are two possibilities:

  • The specifications are satisfied.
  • The specifications are not satisfied and the SpecChek result is a false positive.

Usually, a SpecChek report of satisfaction reports the real thing. The independence of the tested program and SpecChek decrease the probability of a false positive. As specifications become more exacting, a false positive becomes less likely.

Case 2: Interpreting Specification Failure

If SpecChek reports that the specifications fail, there may be an error in:

  • The program being tested.
  • The decision tree encoding the specifications.
  • The SpecChek software.
  • The supporting infrastructure (e.g., the hardware, operating system, Java libraries).

Finding the Cause of Failure

To find the cause of specification failure with such a diverse set of possible errors, use SpecChek's reasoning report to see why SpecChek thinks the specifications fail. The error can be located by checking the logic in this report.

As illustrated in the example below, the error is usually in the program being tested. When developing or using a new decision tree, the error is sometimes in the decision tree or in the specifications put into the tree.

Although it has not been reported yet, it is possible that the error is in SpecChek or in the infrastructure for SpecChek. However, while most programs hide their errors inside the computer, SpecChek provides all the information users need to either verify SpecChek's correctness or pinpoint its error.

The Importance of the Reasoning Report

The reasoning report of why a program passed or failed the specifications is a key to checking the performance of deployed software. Because of the low probability of false positives, reports of specification satisfaction need not be checked thoroughly. For reports of specification failure, a report of the reasoning behind failure provides a practical way for engineers or other domain experts to determine whether and why the target program actually failed. Because the reasoning report allows the identification of false negatives due to errors in SpecChek, having a readable reasoning report makes it possible to use SpecChek even though the application may contain bugs. This is important, because it is not possible with today's technology to guarantee that any piece of software as large as SpecChek is free of bugs, or that a particular run is free of errors caused by the underlying computer system. (Of course, if there are too many false positives, SpecChek is not useful. However, experience to date has not revealed any false positives from the current version.)

More about the Reasoning Report

This section discusses how to interpret SpecChek's reasoning report in greater detail.

Interpreting a Report of Satisfaction

If SpecChek reports that the decision tree of specifications was satisfied, there are two possibilities:

  • The report of satisfaction is a false positive, i.e., there is an undetected failure of the supplied input and output data to meet the specifications.
  • The program being tested actually satisfied the specifications.

Satisfying engineering decision trees usually involves satisfying equations up to a small error tolerance and satisfying numerical inequalities that specify acceptable ranges for output variables. To estimate accurately the probability that such specifications were satisfied while both SpecChek and the target program contained errors generally requires problem-specific reasoning. However, the general comments below show that this probability is usually very small for engineering applications.

Most engineering specifications contain formulas that are applied to inputs. The chance that two erroneous calculating agents (SpecChek and the target program) would randomly produce equal results (within tolerance) out of all the possible real number answers is very small, provided that the computations resemble Bernoulli trials (the SpecChek and target program calculate their results independently).

Complete independence of the SpecChek and target program calculations generally are not achieved. For example, both may be run on the same computer, so errors in the arithmetic chip could produce outcancelling errors in each computation.

However, the following implementation features make SpecChek's calculations significantly independent of those in the target program:

  • SpecChek does not use any target program subroutines; it uses software only in the Java runtime environment, XML libraries, and the SpecChek decision tree interpreter (a compiled Java program).
  • If the target program is not in Java, SpecChek and the target program are independent at the compiler and runtime library level. If the target program is in Java, there is a reasonable chance that the two programs are compiled on different versions of the Java environment.

In addition, SpecChek contains various features that were designed to minimize errors to decrease the probability of a false positive. These are discussed in more detail below.

Because the probability of a false positive is small, a report of specification satisfaction reasonably can be interpreted as actual specification satisfaction.

Other Sources of False Positives

The above discussion focused on false positives that are due to outcancelling errors in the target program in SpecChek. However, there is also the possibility that a specification failure was not found because the XML decision tree did not correctly encode the underlying engineering specification.

There is no guaranteed way to eliminate this source of error. However, it can be minimized by carefully writing and reviewing the specifications. Because the XML specifications are relatively brief and readable by engineers, this checking and review is feasible. If carried out by multiple independent reviewers, it is likely to catch most errors. The reliability of a decision tree also is increased by basing the decision tree on established engineering specifications. These specifications generally have been subjected to intensive reviews, so that errors in the decision tree probably occur from translating the natural-language specifications into XML rather than from errors in the specifications themselves.

For a discussion of estimating errors in knowledge bases, see Verification, Validation, and Evaluation of Expert Systems, an FHWA Handbook.(5)

Interpreting a Report of Failure

If SpecChek reports that the decision tree of specifications was not satisfied, there are two possibilities:

  • The report of satisfaction is a false negative (there is actually nothing wrong with the program being tested).
  • The program being tested actually fails the specifications.

As discussed below, care has been taken to reduce the probability of an error in SpecChek. However, it is assumed that SpecChek, like almost all software, contains some bugs. When specifications are not satisfied, it must be determined whether the error is in SpecChek or in the program being tested.

The way to make this determination is to manually check the reasoning in the SpecChek report. This report describes why SpecChek thinks the decision tree is not satisfied; it is usually one page or less-small enough to be reviewed manually.

If the reasoning in the SpecChek report is sound, the error is in the target program. If the reasoning in the SpecChek report is unsound, the error is in SpecChek.

However, sometimes the specifications were not satisfied because the target program was correct and there was an error in the specification decision tree. Examining the chain of reasoning in the SpecChek report should help locate errors in the decision tree.

SpecChek's Own Reliability

SpecChek, like almost all software, likely contains bugs. This section discusses:

  • How SpecChek bugs affect the results obtained from SpecChek.
  • How the design of SpecChek reduces the likelihood of bugs.

How SpecChek Bugs Affect Users

It does not matter for a particular specification satisfaction report whether SpecChek contains bugs in general; what matters is whether one of those bugs has made that particular report incorrect. This can be determined by manually checking the SpecChek report for the reasoning that led to its conclusion about specification satisfaction. If the reasoning in the SpecChek report is sound based on the decision tree of specifications, the conclusion is valid, even if the SpecChek software contained bugs, and even if some of these bugs were activated in the SpecChek computation. Because SpecChek reports its reasoning, the significance of SpecChek bugs is reduced, and any such bugs are as easy to find as possible.

How the Design of SpecChek Reduces the Likelihood of Bugs

  • SpecChek uses a compiler and software libraries that are extensively used in other applications, namely:
    • The Sun Microsystems Java 2TM compiler and runtime libraries: These are used in the very large number of Java application programs. SpecChek uses a relatively modest subset of Java and avoids such potentially error-prone areas as the control of threads (concurrent processes).
    • Standard XML libraries (currently the Sun Microsystems XML implementation): XML increasingly is used as a standard for sending database information over the Internet, creating a vast application pool to detect errors in the underlying libraries.
  • The SpecChek decision tree interpreter is a relatively small, well-organized program that can be read and manually checked by SpecChek subscribers who wish to verify the correctness of the code themselves.
  • The software developer is currently carrying out an informal correctness proof of the decision tree interpreter.

Using SpecChek in the Software Life Cycle

During the software life cycle, SpecChek can be used to promote software reliability:

  • During specification development, to test specifications.
  • During design, as a rapid prototyping tool (SpecChek has extensions not used in specification checking that turn the software into an expert system shell).
  • As a testing tool.
  • During deployment and maintenance to detect software failures.

Advantages of SpecChek for Wrapping

  • SpecChek puts the professional in control of checking computed results.
  • SpecChek provides an independent check of computed results, because:
    • It uses its own software, not subroutines of the program being tested, to perform the necessary computations for checking output of the target program.
    • The professional can write and self-check the specifications being applied to the target program.
  • SpecChek can be called from within a target program to provide real-time checking of all computed results, or SpecChek can be used on input and output files to wrap black-box programs.
  • SpecChek runs on all platforms because it is written in Java.

An Example

The following decision tree encodes part of the specifications for the design consistency module of IHSDM. Note that the decision tree consists of logic built over equations and inequalities that relate variables such as V85, familiar to design engineers working in this area.

Click for full description


Click for full description


Click for full description

SpecChek was run on files of input and output data like the following file of input data. (Supplemental software was written to extract this data from a Computer-Aided Design (CAD) file.)

Click for full description


Click for full description

When SpecChek was run on inputs and outputs to which some noise had been added, it generated the following report of failure:

Click for full description

Previous | Table of Contents | Next

Federal Highway Administration | 1200 New Jersey Avenue, SE | Washington, DC 20590 | 202-366-4000
Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101