SUMMARY REPORT

This summary report is an archived publication and may contain dated technical, contact, and link information

Verification, Validation and Evaluation of Expert Systems, Volume I, A FHWA Handbook

A draft Verification, Validation and Evaluation of Expert Systems, Volume I, A FHWA Handbook, has been developed. The purpose of this communication is not to present an official document, but to share a work in progress and to solicit advice. It is the intention of the Development Team to produce a quality product that will truly be of value to those developing and testing expert systems. Thus I encourage you to critically review this draft handbook and provide any suggestions for improvement. We want to hear the bad news as well as the good so that we can improve this handbook.

This draft handbook discusses how verification, validation and evaluation (VV&E) should be incorporated into the expert system life cycle, shows how to partition knowledge bases with or without expert domain knowledge, presents knowledge models, presents methods of validating domain (the experts') knowledge, and discusses management issues related to expert systems development and testing. Mathematical proofs for partitioning and consistency and visualization of concepts are also presented.

I would have considered the draft handbook to be of little use if we were unable to do a pen and paper analysis using its procedures on a real-world expert system of reasonable size and complexity (with computer support for matrix manipulation, solving differential equations, etc.). The expert system PAMEX: Expert System for Maintenance Management of Flexible Pavements, with 327 rules, 20 input variables and 59 qualifiers, was selected for the in-house pen and paper analysis. The results of this analysis provided insights into PAMEX and identified errors in programming that the developers and users were unaware of.

This draft handbook will also be field tested in a number of States using operational expert systems as test cases. At the end of this testing (probably in about one year), the handbook will be updated based on the results of the testing.

It will be impossible to respond directly to every comment, but be assured that every comment will be reviewed and given the consideration it deserves. Thank you in advance for your assistance.

Sincerely yours,

James A. Wentworth

Chief, Advanced Research Team

Office of Safety and Traffic Operations

Research and Development

Federal Highway Administration

E-Mail: tfhrc.webmaster@dot.gov

Verification, Validation & Evaluation of Expert Systems, Volume 1
A FHWA Handbook (1st Edition, December 1995)

EXECUTIVE SUMMARY

This handbook (MS Word 6.0 document downloadable as a pkzip file) has been prepared for the Federal Highway Administration to cover the subject of verification, validation, and evaluation of expert systems. The difficulty of performing verification, validation, and evaluation (VV&E) on expert systems is one of the major factors slowing the development and acceptance of expert systems in the transportation community. There is little agreement among experts on how to accomplish the VV&E of expert systems. The complexity and uncertainty related to these tasks have led to a situation where most expert systems are not adequately tested. In some cases testing is ignored until late in the development cycle, always with predictably disastrous results.

This guide discusses how VV&E should be incorporated into the expert system lifecycle, shows how to partition knowledge bases with and without expert domain knowledge, presents knowledge models, presents methods for validating the experts' underlying knowledge, and presents management issues related to expert systems development and testing. Mathematical proofs for partitioning, consistency, and completeness, and visualizations of concepts, are also presented. The relevant information for this handbook came from the research efforts of a three-entity task force represented by James A. Wentworth of the Federal Highway Administration, Rodger Knaus of MiTech, and Hamid Aougab of RAM.

1. Introduction

Many of the new technologies for road building engineering do not achieve the levels of reliability and standardization required by the civil engineering profession. Particularly within this category are many expert systems designed for the transportation industry that have proven to be a major disappointment, due partly to the lack of verification, validation, and evaluation standards. The goals of expert systems are usually more ambitious than those of conventional or algorithmic programs. They frequently perform not only as problem solvers but also as intelligent assistants and training aids. Expert systems have great potential for capturing the knowledge and experience of current senior professionals (many of whom are approaching retirement age) and making it available to others in the form of training aids or technical support tools. Applications include design, operations, inspection, maintenance, training, and many others.

In traditional software engineering, testing [verification, validation, and evaluation (VV&E)] is claimed to be an integral part of the design and development process. However, in the field of expert systems, there is little consensus on what testing is necessary or how to perform it. Furthermore, many of the procedures that have been developed are so poorly documented that it is difficult, if not impossible, for anyone other than the originator to reproduce them. Also, many procedures used for VV&E were designed to be specific to the particular domain in which they were introduced. The complexity and uncertainty related to these tasks have led to a situation where most expert systems are not adequately tested.

Prompted by this lack of consensus among experts and by the inadequacy of existing procedures and tools, the Federal Highway Administration (FHWA) developed this guideline for expert system verification, validation, and evaluation. The guideline is needed because knowledge engineers today do not often design and carry out rigorous test plans for expert systems.

2. Basic Definitions

This guide covers verification, validation, and evaluation of expert systems. An expert system is a computer program that includes a representation of the experience, knowledge, and reasoning processes of an expert.

Verification of an expert system is the task of determining that the system is built according to its specifications. Validation is the process of determining that the system actually fulfills the purpose for which it was intended. Evaluation reflects the acceptance of the system by the end users and its performance in the field. In other words the VV&E elements of the expert system are designed to:

a) Verify to show the system is built right.

b) Validate to show the right system was built.

c) Evaluate to show the usefulness of the system.

Once the system is validated, the next step is to verify it. This involves completeness and consistency checks and examining for technical correctness using techniques such as are described in this handbook. The final step is evaluation. For the serviceability program, this means giving the system to engineers to use in computing the coefficient. Although the system is known to produce the correct result, it could fail the evaluation because it is too cumbersome to use, requires data that is not readily available, does not really save any effort, does something that can be estimated accurately enough without a computer, solves a problem rarely needed in practice, or produces a result not universally accepted because different people define the coefficient in different ways.

3. Need for V&V

It is very important to verify and validate expert systems as well as all other software. When software is part of a machine or structure that can cause death or serious injury, V&V is especially critical. In fact, there have already been failures of expert systems and other software that have resulted in disasters.

Expert systems use computational techniques that involve making guesses, just as human experts do. Like human experts, the expert system will be wrong some of the time, even if the expert system contains no errors. The knowledge on which the expert system is based, even if it is the best available, does not completely predict what will happen. For this reason alone, it is important for the human expert to validate that the advice being given by the expert system is sound. This is especially critical when the expert system will be used by persons with less expertise than the expert, who cannot themselves judge the accuracy of the advice from the expert system.

In addition to the mistakes an expert system will make because the available knowledge is not sufficient for prediction in every case, expert systems contain only a limited amount of knowledge concentrated in carefully defined knowledge areas. Today's expert systems have no common sense knowledge. They only "know" exactly what has been input into their knowledge bases. There is no underlying truth or fact structure to which they can turn in cases of ambiguity. If the expert system does not realize its mistake, and it is being used by a person with limited expertise, there is nobody to detect the error. Therefore, where the expert system is going to be used by someone without expertise, and the decisions made have the potential for harm if made badly, the very best effort at verification and validation is required.

4. Problems in Implementing Verification, Validation, and Evaluation for Expert Systems

One of the impediments to a successful V&V effort for expert systems is the nature of expert systems themselves. Expert systems are often employed for working with incomplete or uncertain information or "ill structured" situations. Since expert system specifications often do not provide precise criteria against which to test, there is a problem in verifying, validating, and evaluating expert systems according to the definitions. Some vagueness in the specifications for expert systems is unavoidable; if there are precise enough specifications for a system, it may be more effective to design the system using conventional programming languages.

Another problem in VV&E for expert systems is that expert system languages are unstructured, in order to accommodate the relatively unstructured applications. However, rigid structure in implementing code is a key technique in methods for writing verifiable code, such as the Cleanroom approach.

Selected for the FHWA Handbook are those techniques which seem the most straightforward, precise and powerful in practice. Included are particular variations of partitioning, incidence matrices, and the use of metaknowledge (i.e., knowledge models).

This handbook will provide guidance on planning and decision making early in an expert systems project. This concept applies not only to new developments, but also to improved decision making at any stage from development through implementation. This includes planning the verification, validation, and evaluation of an already developed system. The advice given here should aid in developing a clear problem definition and thorough system requirements, reflecting realism from both technical and organizational viewpoints. Risk identification information is also provided.

This handbook will discuss how VV&E should be incorporated into the expert system lifecycle. Although some ideas may be used for revising and/or reengineering existing systems, the aim is to design new systems and to ensure that enough VV&E operations are done during the lifecycle so that these systems are verifiable. This includes decisions that should be made during system specification, and verification/validation during stepwise development of an expert system.

An overview of the basic method for formal proofs is provided. The method proves the correctness of small systems by non-recursive means, partitions larger systems into smaller systems, and ensures that the component systems are proved to possess the relations required by the partitioning theorems. Moreover, the basic method for formal proof will ensure that the components agree among themselves. In addition, this handbook will cover selected techniques for partitioning large expert systems when expert knowledge is unavailable.

Generally, it is best to partition a knowledge base using expert knowledge. This results in a knowledge base that reflects the expert's conception of the knowledge domain. This in turn facilitates communication with the expert, and later maintenance of the knowledge base. However, sometimes it is not possible to obtain expert insight into a knowledge base. In this case functions and incidence matrices can be extracted from the knowledge base, and the information contained therein used to partition the knowledge base.
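The idea of extracting dependency information from a knowledge base can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the rule base, the variable names, and the choice of grouping rules by connected components of the variable dependency graph are hypothetical and are not taken from the handbook, whose own partitioning procedure may differ.

```python
# Hypothetical rule base (not from the handbook). Each rule is a pair:
# (variables used in the IF part, variable set by the THEN part).
RULES = [
    ({"cracking", "rutting"}, "surface_condition"),
    ({"traffic_volume"}, "load_class"),
    ({"surface_condition", "load_class"}, "treatment"),
    ({"budget"}, "funding_level"),
    ({"funding_level"}, "schedule"),
]


def immediate_dependencies(rules):
    """Map each concluded variable to the variables it depends on directly."""
    deps = {}
    for premises, conclusion in rules:
        deps.setdefault(conclusion, set()).update(premises)
    return deps


def incidence_matrix(deps):
    """Return (variables, M) where M[i][j] == 1 if variable i depends directly on j."""
    variables = sorted(set(deps) | {p for ps in deps.values() for p in ps})
    index = {v: i for i, v in enumerate(variables)}
    matrix = [[0] * len(variables) for _ in variables]
    for conclusion, premises in deps.items():
        for premise in premises:
            matrix[index[conclusion]][index[premise]] = 1
    return variables, matrix


def transitive_closure(matrix):
    """Warshall's algorithm: direct plus indirect dependencies."""
    n = len(matrix)
    closure = [row[:] for row in matrix]
    for k in range(n):
        for i in range(n):
            if closure[i][k]:
                for j in range(n):
                    if closure[k][j]:
                        closure[i][j] = 1
    return closure


def partition_rules(rules):
    """Group rule indices so that rules sharing any variable land in one block."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for premises, conclusion in rules:
        for premise in premises:
            parent[find(premise)] = find(conclusion)

    blocks = {}
    for i, (_, conclusion) in enumerate(rules):
        blocks.setdefault(find(conclusion), []).append(i)
    return sorted(blocks.values())


if __name__ == "__main__":
    print(partition_rules(RULES))  # [[0, 1, 2], [3, 4]]
```

In this toy rule base the pavement rules and the funding rules share no variables, so they fall into two blocks that could be examined separately. The transitive closure shows, for example, that `treatment` depends indirectly on `cracking` even though no single rule links them.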

Knowledge models are high level templates for expert knowledge. These templates express the high level structure of the expert knowledge. Examples of knowledge models are decision trees, flowcharts and state diagrams. By organizing the knowledge, a knowledge model helps with VV&E by suggesting strategies for proofs and partitions; in addition, some knowledge models have mathematical properties that help establish completeness, consistency or specification satisfaction.
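As a hedged illustration of how a knowledge model's structure can support a completeness argument, the following Python sketch represents a decision tree and reports any combination of input values that reaches no conclusion. The tree, the variable domains, and the function names are hypothetical and are not drawn from the handbook or from PAMEX.

```python
# Hypothetical variable domains and decision tree (illustrative only).
# A node is either a conclusion (str) or a pair (variable, {value: subtree}).
DOMAINS = {"cracking": ["low", "high"], "traffic": ["light", "heavy"]}

TREE = ("cracking", {
    "low": "routine maintenance",
    "high": ("traffic", {
        "light": "overlay",
        "heavy": "reconstruct",
    }),
})


def missing_branches(node, domains, path=()):
    """Return the paths, as tuples of (variable, value), that lead to no conclusion."""
    if isinstance(node, str):  # a leaf: this path is covered
        return []
    variable, branches = node
    gaps = []
    for value in domains[variable]:
        step = path + ((variable, value),)
        if value not in branches:
            gaps.append(step)
        else:
            gaps.extend(missing_branches(branches[value], domains, step))
    return gaps


def classify(node, case):
    """Follow the single path selected by the case and return its conclusion."""
    while not isinstance(node, str):
        variable, branches = node
        node = branches[case[variable]]
    return node


if __name__ == "__main__":
    print(missing_branches(TREE, DOMAINS))  # [] means every case reaches a leaf
    print(classify(TREE, {"cracking": "high", "traffic": "light"}))  # overlay
```

When `missing_branches` returns an empty list, every case reaches a leaf, and because each case follows exactly one path it receives exactly one conclusion. The check only covers the structure of the tree; whether the conclusions are correct still has to be validated with the experts.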

Small expert systems are those for which direct proof of completeness, consistency and specification satisfaction is feasible without partitioning the knowledge base. This handbook discusses techniques for these proofs.
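For a knowledge base small enough to enumerate, completeness and consistency can be checked directly by trying every combination of input values. The Python sketch below assumes a hypothetical rule format and hypothetical pavement rules; it uses exhaustive enumeration as a stand-in and is not the handbook's formal proof technique.

```python
from itertools import product

# Hypothetical input domains and rules (illustrative only).
# Each rule is (conditions that must all hold, conclusion).
DOMAINS = {"cracking": ["low", "high"], "traffic": ["light", "heavy"]}

RULES = [
    ({"cracking": "low"}, "routine maintenance"),
    ({"cracking": "high", "traffic": "light"}, "overlay"),
    ({"cracking": "high", "traffic": "heavy"}, "reconstruct"),
]


def fired_conclusions(rules, case):
    """Return the set of conclusions of all rules whose conditions match the case."""
    return {
        conclusion
        for conditions, conclusion in rules
        if all(case[var] == value for var, value in conditions.items())
    }


def check(rules, domains):
    """Enumerate every input case; report uncovered and conflicting cases."""
    names = sorted(domains)
    uncovered, conflicting = [], []
    for values in product(*(domains[name] for name in names)):
        case = dict(zip(names, values))
        conclusions = fired_conclusions(rules, case)
        if not conclusions:
            uncovered.append(case)      # completeness failure
        elif len(conclusions) > 1:
            conflicting.append(case)    # consistency failure
    return uncovered, conflicting


if __name__ == "__main__":
    print(check(RULES, DOMAINS))  # ([], []) means complete and consistent
```

The number of cases grows as the product of the domain sizes, which is one reason larger knowledge bases need to be partitioned before this kind of direct check is practical.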

Finally, evaluation, which includes field testing, addresses the question "Is the system valuable?" This is reflected by the acceptance of the system by its end users and the performance of the system in application. This handbook addresses this issue and provides some general guidelines to help in the distribution and maintenance of expert systems.

5. Intended Audiences for the Handbook

The following table describes the intended audiences for the handbook, and the parts of the handbook that will be most useful to these audiences:

Audience	Task to be Performed	Part of Handbook
Managers	Manage expert system project	Introduction
Knowledge Engineers	Build new expert systems	Techniques; VV&E on New Systems
Knowledge Engineers	Perform VV&E on existing systems	Techniques; VV&E on Existing Systems
Highway Engineers	Ensure that a correct new expert system is built	VV&E on New Systems
Highway Engineers	Ensure that an existing expert system has been validated	VV&E on Existing Systems
Software Researchers	Critique and extend VV&E methods	Techniques; VV&E on Existing Systems; VV&E on New Systems

Table of Contents

1. Introduction

Basic Definitions

Need for V&V

Problems in Implementing Verification, Validation, and Evaluation for Expert Systems

Intended Audiences for the Handbook

2. Verification and Validation: Past Practices

3. Planning and Management

Introduction

Identify the Need for an Expert System

The Development Team

The T / E Team

4. Developing a Verifiable System

Introduction

Specification

The Importance of Specifications

The General Form of Specifications

Defining Specifications

Gather Informal Descriptions of Specifications

Obtain Expert Certification of the Specifications

Validating Informal Descriptions of Specifications

Validating the Translation of Informal Descriptions

Validation of Formalized Requirements

Step-Wise Refinement Development

Design

Implementation

Correctness Verification

5. The Basic Proof Method

Introduction

Overview of Proofs Using Partitions

A Simple Example

6. Finding Partitions without Expert Knowledge

Introduction

Functions

Expert Systems are Mathematical Functions

Partitioning Functions into Compositions of Simpler Functions

Cartesian Product

Function Composition

Dependency Relations

Immediate Dependency Relation

Operations on Relations

Finding Functions in a Knowledge Base

Choosing the Output and Input Variables of a Function

Finding the Knowledge Base that Computes a Function

Hoffman Regions

The Hoffman Regions of KB1

When is a Partitioning Advantageous

Hoffman Regions of Partitioned KB1

7. Knowledge Modeling

Introduction

An Example of a Knowledge Model

Using Knowledge Models in VV&E

Decision Trees

Introduction

Definition

Example

Use During Development

Use During VV&E

Ripple Down Rules

Introduction

Definition

Example

Use During Development

Changing a Ripple Down Rule System

Use During VV&E

A Ripple-Down-Rule System is Complete.

State Diagrams

Introduction

Definition

Example

Use During Development

Use During VV&E

Flowcharts

Use During Development

Use During VV&E

Functionally Modeled Expert Systems

Introduction

Use During Development

8. VV&E for Small Expert Systems

Completeness

Consistency

Specification Satisfaction

Specification Based on Domain Subsets

Effect of the Inference Engine

Inference Engines for Very High Reliability Applications

9. Validating Underlying Knowledge

Introduction

Validating Knowledge Models

Validating the Semantic Consistency of Underlying Knowledge Items

Creating a TRUE/FALSE Test

Giving the Test

Formulating the Experiment

Analyzing the Test Results

Overall Agreement Among Experts

Approaches to Disagreement Among Experts

Clues of Incompleteness

Variable Completeness

Semantic Rule Completeness and Consistency

Validating Important Rules

Validating Confidence Factors

10. Testing

Simple Experiments for the Rate of Success

Selecting a Data Sample

Estimating a Proportion (Fraction) of a Population

The Confidence Interval of a Proportion

Choosing Sample Size

Estimating Very Reliable Systems

How a Proof Increases Reliability

11. Evaluation and Other Management Issues

Evaluation

Distributing And Maintaining Expert Systems

Distribution

Maintenance

Appendix

Symbolic Evaluation of Atomic Formulas

General Regression Neural Nets

References

LIST OF FIGURES

Figure 1.1: The V&V Process

Figure 3.1: Initial Project Planning

Figure 3.1.1: KB1 Initial Project Planning

Figure 4.1: Developing a Verifiable System

Figure 4.2: Specification

Figure 4.2.1: KB1 Specification

Figure 4.2.2: KB1 Design

Figure 4.3: Correctness Verification

Figure 4.3.1: KB1 Implementation

Figure 5.1: Knowledge Base 1

Figure 5.2: An Example of Knowledge Base Partitioning

Figure 6.1: Immediate Dependency Relation as Ordered Pairs

Figure 6.2: Examples of domains

Figure 7.1: PAMEX DT

Figure 7.2: Example ES

Figure 8.1: Completeness of Investment Subsystem

Figure 8.2: Consistency of I Subsystem

Figure 8.3: Example Specification for KB1

Figure 8.4: Symbolic Evaluation

Figure 8.5: Symbolic Inference Engine

LIST OF TABLES

Table 1.1: Intended Audiences for the Handbook

Table 2.1: Validation Methods

Table 2.2: Verification Methods

Table 2.3: V&V Software

Table 4.1: Level of Effort for the Correctness Verification Stage

Table 6.1: Immediate Dependency Relation for KB1

Table 6.2: Matrix Product of the DR by Itself

Table 6.3: Immediate DR of KB1

Table 6.4: Variable Clusters of the DR of KB1

Table 6.5: How Variables Influence Rules

Table 6.6: How Rules Influence Variables

Table 6.7: Immediate Dependency Matrix for KB1

Table 6.8: Hoffman Regions for KB1

Table 9.1: Confidence Level

Table 9.2: Confidence Level with One Expert Disagreeing

Page Owner: Office of Research, Development, and Technology, Office of Safety, RDT

This page last modified on 03/08/2016