U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000
Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations
SUMMARY REPORT

This summary report is an archived publication and may contain dated technical, contact, and link information.
Verification, Validation and Evaluation of Expert Systems, Volume I: A FHWA Handbook

A draft of Verification, Validation and Evaluation of Expert Systems, Volume I: A FHWA Handbook has been developed. The purpose of this communication is not to present an official document, but to share a work in progress and to solicit advice. It is the intention of the Development Team to produce a quality product that will truly be of value to those developing and testing expert systems. I therefore encourage you to review this draft handbook critically and to provide any suggestions for improvement. We want to hear the bad news as well as the good so that we can improve this handbook.

This draft handbook discusses how verification, validation, and evaluation (VV&E) should be incorporated into the expert system life cycle; shows how to partition knowledge bases with or without expert domain knowledge; presents knowledge models; presents methods of validating domain (the experts') knowledge; and discusses management issues related to expert system development and testing. Mathematical proofs for partitioning and consistency, and visualizations of concepts, are also presented.

I would have considered the draft handbook to be of little use if we were unable to apply its procedures in a pen-and-paper analysis of a real-world expert system of reasonable size and complexity (with computer support for matrix manipulation, solving differential equations, etc.). The expert system PAMEX: Expert System for Maintenance Management of Flexible Pavements, with 327 rules, 20 input variables, and 59 qualifiers, was selected for the in-house pen-and-paper analysis. The results of this analysis provided insights into PAMEX and identified programming errors of which the developers and users were unaware. This draft handbook will also be field tested in a number of States using operational expert systems as test cases.
At the end of this testing (probably in about one year), the handbook will be updated based on the results. It will be impossible to respond directly to every comment, but be assured that every comment will be reviewed and given the consideration it deserves. Thank you in advance for your assistance.

Sincerely yours,

James A. Wentworth
Chief, Advanced Research Team
Office of Safety and Traffic Operations Research and Development
Federal Highway Administration
E-Mail: tfhrc.webmaster@dot.gov

Verification, Validation & Evaluation of Expert Systems, Volume 1

EXECUTIVE SUMMARY

This handbook (an MS Word 6.0 document downloadable as a pkzip file) has been prepared for the Federal Highway Administration to cover the subject of verification, validation, and evaluation of expert systems. The difficulty of performing verification, validation, and evaluation (VV&E) on expert systems is one of the major factors slowing the development and acceptance of expert systems in the transportation community. There is little agreement among experts on how to accomplish the VV&E of expert systems, and the complexity and uncertainty related to these tasks has led to a situation where most expert systems are not adequately tested. In some cases testing is ignored until late in the development cycle, with predictably disastrous results. This guide discusses how VV&E should be incorporated into the expert system life cycle; shows how to partition knowledge bases with and without expert domain knowledge; presents knowledge models; presents methods for validating the experts' underlying knowledge; and presents management issues related to expert system development and testing. Mathematical proofs for partitioning, consistency, and completeness, and visualizations of concepts, are presented.
The relevant information for this handbook came from the research efforts of a task force drawn from three entities: James A. Wentworth of the Federal Highway Administration, Rodger Knaus of MiTech, and Hamid Aougab of RAM.

1. Introduction

Many of the new technologies for road-building engineering do not achieve the levels of reliability and standardization required by the civil engineering profession. Prominent in this category are many expert systems designed for the transportation industry that have proven to be a major disappointment, due in part to the lack of verification, validation, and evaluation standards. The goals of expert systems are usually more ambitious than those of conventional or algorithmic programs: they frequently perform not only as problem solvers but also as intelligent assistants and training aids. Expert systems have great potential for capturing the knowledge and experience of current senior professionals (many of whom are approaching retirement age) and making it available to others in the form of training aids or technical support tools. Applications include design, operations, inspection, maintenance, training, and many others.

In traditional software engineering, testing [verification, validation, and evaluation (VV&E)] is claimed to be an integral part of the design and development process. In the field of expert systems, however, there is little consensus on what testing is necessary or how to perform it. Furthermore, many of the procedures that have been developed are so poorly documented that it is difficult, if not impossible, for anyone other than the originator to reproduce them. Also, many procedures used for VV&E were designed to be specific to the particular domain in which they were introduced. The complexity and uncertainty related to these tasks has led to a situation where most expert systems are not adequately tested.
Impelled by this environment of inadequate procedures and tools and a lack of consensus among experts, the Federal Highway Administration (FHWA) developed this guideline for expert system verification, validation, and evaluation. The guideline is needed because knowledge engineers today seldom design and carry out rigorous test plans for expert systems.

2. Basic Definitions

This guide covers verification, validation, and evaluation of expert systems. An expert system is a computer program that includes a representation of the experience, knowledge, and reasoning processes of an expert. Verification of an expert system is the task of determining that the system is built according to its specifications. Validation is the process of determining that the system actually fulfills the purpose for which it was intended. Evaluation reflects the acceptance of the system by the end users and its performance in the field. In other words, the VV&E elements of the expert system are designed to:

a) Verify, to show the system is built right.
b) Validate, to show the right system was built.
c) Evaluate, to show the usefulness of the system.

3. Need for V&V

It is very important to verify and validate expert systems, as it is all other software. When software is part of a machine or structure that can cause death or serious injury, V&V is especially critical. In fact, there have already been failures of expert systems and other software that have resulted in disasters.

Expert systems use computational techniques that involve making guesses, just as human experts do. Like a human expert, an expert system will be wrong some of the time, even if it contains no errors: the knowledge on which the expert system is based, even if it is the best available, does not completely predict what will happen.
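The definitions above treat an expert system as a program that encodes an expert's if-then knowledge and reasoning process. As a minimal sketch of that idea (this code and its pavement-flavored rules are hypothetical illustrations, not taken from the handbook), a rule-based knowledge base can be run by a simple forward-chaining loop:

```python
# Minimal forward-chaining rule engine sketch (illustrative only, not the
# handbook's formalism). Each rule maps a list of condition facts to one
# conclusion fact; the engine fires rules until no new facts appear.

def forward_chain(rules, facts):
    """Return the set of all facts derivable from the initial facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are known facts.
            if conclusion not in facts and set(conditions) <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Two toy maintenance rules (hypothetical contents, for illustration):
rules = [
    (["cracking", "heavy_traffic"], "structural_damage"),
    (["structural_damage"], "recommend_overlay"),
]
derived = forward_chain(rules, ["cracking", "heavy_traffic"])
print(sorted(derived))
```

Because such a system only "knows" the facts and rules it is given, a wrong or missing rule silently produces wrong advice, which is exactly why the V&V effort described in this handbook is needed.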
For this reason alone, it is important for the human expert to validate that the advice being given by the expert system is sound. This is especially critical when the expert system will be used by persons with less expertise than the expert, who cannot themselves judge the accuracy of the advice. In addition to the mistakes an expert system will make because the available knowledge is not sufficient for prediction in every case, expert systems contain only a limited amount of knowledge concentrated in carefully defined knowledge areas. Today's expert systems have no common-sense knowledge; they "know" only exactly what has been put into their knowledge bases. There is no underlying structure of truth or fact to which they can turn in cases of ambiguity. If the expert system does not realize its mistake, and it is being used by a person with limited expertise, there is nobody to detect the error. Therefore, where an expert system is going to be used by someone without expertise, and the decisions made have the potential for harm if made badly, the very best effort at verification and validation is required.

4. Problems in Implementing Verification, Validation, and Evaluation for Expert Systems

One of the impediments to a successful V&V effort for expert systems is the nature of expert systems themselves. Expert systems are often employed to work with incomplete or uncertain information, or "ill-structured" situations. Since expert system specifications often do not provide precise criteria against which to test, there is a problem in verifying, validating, and evaluating expert systems according to the definitions.
Some vagueness in the specifications for expert systems is unavoidable; if precise enough specifications exist for a system, it may be more effective to build the system using conventional programming languages. Another problem in VV&E for expert systems is that expert system languages are unstructured, to accommodate relatively unstructured applications; yet rigid structure in implementing code is a key technique for writing verifiable code, as in the Cleanroom approach.

Selected for the FHWA Handbook are those techniques that seem the most straightforward, precise, and powerful in practice. Included are particular variations of partitioning, incidence matrices, and the use of metaknowledge (i.e., knowledge models). This handbook provides guidance on planning and decision making early in an expert system project. This guidance applies not only to new development, but also to improved decision making at any stage from development through implementation, including planning the verification, validation, and evaluation of an already developed system. The advice given here should aid in developing a clear problem definition and thorough system requirements, reflecting realism from both technical and organizational viewpoints. Risk identification information is also provided.

This handbook discusses how VV&E should be incorporated into the expert system life cycle. Although some of the ideas may be used for revising and/or reengineering existing systems, the aim is to design new systems and to ensure that enough VV&E operations are performed during the life cycle so that these systems are verifiable. This includes decisions that should be made during system specification, and verification/validation during stepwise development of an expert system.
An overview of the basic proof method is provided: prove the correctness of small systems by non-recursive means; partition larger systems into smaller ones, ensuring that the component systems possess the correct relations required by the partitioning theorems; and ensure that the components agree among themselves. In addition, this handbook covers selected techniques for partitioning large expert systems when expert knowledge is unavailable. Generally, it is best to partition a knowledge base using expert knowledge. This results in a knowledge base that reflects the expert's conception of the knowledge domain, which in turn facilitates communication with the expert and later maintenance of the knowledge base. However, sometimes it is not possible to obtain expert insight into a knowledge base. In this case, functions and incidence matrices can be extracted from the knowledge base, and the information contained therein used to partition it.

Knowledge models are high-level templates for expert knowledge; they express the high-level structure of that knowledge. Examples of knowledge models are decision trees, flowcharts, and state diagrams. By organizing the knowledge, a knowledge model helps with VV&E by suggesting strategies for proofs and partitions; in addition, some knowledge models have mathematical properties that help establish completeness, consistency, or specification satisfaction. Small expert systems are those for which completeness, consistency, and specification satisfaction can be proved directly, without partitioning the knowledge base. This handbook discusses techniques for these proofs. Finally, evaluation, which includes field testing, addresses the question "Is the system valuable?" This is reflected in the acceptance of the system by its end users and in the performance of the system in application.
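The partitioning idea above can be sketched in a few lines of code (a hypothetical illustration with invented rule and variable names; the handbook's incidence-matrix procedures are more elaborate): extract the immediate dependency relation from the rules, then cluster variables linked by any chain of dependencies. Each cluster is a candidate subsystem that can be verified separately.

```python
# Sketch: partition a rule base without expert knowledge by extracting a
# dependency relation and clustering connected variables. Illustrative only;
# rule and variable names are hypothetical.

rules = {                      # rule name -> (input variables, output variable)
    "r1": (["cracking", "traffic"], "damage"),
    "r2": (["damage"], "treatment"),
    "r3": (["budget"], "funding"),
}

# Immediate dependency relation as ordered pairs (output depends on input).
dependency = {(out, v) for ins, out in rules.values() for v in ins}

def clusters(pairs):
    """Merge variables connected by any chain of dependency pairs."""
    groups = []
    for a, b in pairs:
        hits = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*hits) if hits else {a, b}
        groups = [g for g in groups if g not in hits] + [merged]
    return groups

# Two independent clusters emerge: the damage/treatment subsystem and the
# budget/funding subsystem, each provable on its own.
print(clusters(dependency))
```

A transitive closure of the same relation (e.g., by repeated matrix products of the incidence matrix, as in the handbook's Table 6.2) yields the full, not just immediate, dependency structure.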
This handbook addresses this issue and offers some general guidelines that help in the distribution and maintenance of expert systems.

5. Intended Audiences for the Handbook

Table 1.1 of the handbook describes the intended audiences and the parts of the handbook that will be most useful to each audience.
TABLE OF CONTENTS

1. Introduction
   Basic Definitions
   Need for V&V
   Problems in Implementing Verification, Validation, and Evaluation for Expert Systems
   Intended Audiences for the Handbook
2. Verification and Validation: Past Practices
3. Planning and Management
   Introduction
   Identify the Need for an Expert System
   The Development Team
   The T/E Team
4. Developing a Verifiable System
   Introduction
   Specification
      The Importance of Specifications
      The General Form of Specifications
      Defining Specifications
      Gather Informal Descriptions of Specifications
      Obtain Expert Certification of the Specifications
      Validating Informal Descriptions of Specifications
      Validating the Translation of Informal Descriptions
      Validation of Formalized Requirements
   Step-Wise Refinement
   Development
   Design
   Implementation
   Correctness Verification
5. The Basic Proof Method
   Introduction
   Overview of Proofs Using Partitions
   A Simple Example
6. Finding Partitions without Expert Knowledge
   Introduction
   Functions
      Expert Systems are Mathematical Functions
      Partitioning Functions into Compositions of Simpler Functions
      Cartesian Product
      Function Composition
   Dependency Relations
      Immediate Dependency Relation
      Operations on Relations
   Finding Functions in a Knowledge Base
      Choosing the Output and Input Variables of a Function
      Finding the Knowledge Base that Computes a Function
   Hoffman Regions
      The Hoffman Regions of KB1
      When is a Partitioning Advantageous
      Hoffman Regions of Partitioned KB1
7. Knowledge Modeling
   Introduction
   An Example of a Knowledge Model
   Using Knowledge Models in VV&E
   Decision Trees
      Introduction; Definition; Example; Use During Development; Use During VV&E
   Ripple Down Rules
      Introduction; Definition; Example; Use During Development; Changing a Ripple Down Rule System; Use During VV&E; A Ripple-Down-Rule System is Complete
   State Diagrams
      Introduction; Definition; Example; Use During Development; Use During VV&E
   Flowcharts
      Use During Development; Use During VV&E
   Functionally Modeled Expert Systems
      Introduction; Use During Development
8. VV&E for Small Expert Systems
   Completeness
   Consistency
   Specification Satisfaction
   Specification Based on Domain Subsets
   Effect of the Inference Engine
   Inference Engines for Very High Reliability Applications
9. Validating Underlying Knowledge
   Introduction
   Validating Knowledge Models
   Validating the Semantic Consistency of Underlying Knowledge Items
      Creating a TRUE/FALSE Test
      Giving the Test
      Formulating the Experiment
      Analyzing the Test Results
      Overall Agreement Among Experts
      Approaches to Disagreement Among Experts
   Clues of Incompleteness
   Variable Completeness
   Semantic Rule Completeness and Consistency
   Validating Important Rules
   Validating Confidence Factors
10. Testing
   Simple Experiments for the Rate of Success
   Selecting a Data Sample
   Estimating a Proportion (Fraction) of a Population
   The Confidence Interval of a Proportion
   Choosing Sample Size
   Estimating Very Reliable Systems
   How a Proof Increases Reliability
11. Evaluation and Other Management Issues
   Evaluation
   Distributing and Maintaining Expert Systems
      Distribution
      Maintenance
Appendix
   Symbolic Evaluation of Atomic Formulas
   General Regression Neural Nets
References

LIST OF FIGURES

Figure 1.1: The V&V Process
Figure 3.1: Initial Project Planning
Figure 3.1.1: KB1 Initial Project Planning
Figure 4.1: Developing a Verifiable System
Figure 4.2: Specification
Figure 4.2.1: KB1 Specification
Figure 4.2.2: KB1 Design
Figure 4.3: Correctness Verification
Figure 4.3.1: KB1 Implementation
Figure 5.1: Knowledge Base 1
Figure 5.2: An Example of Knowledge Base Partitioning
Figure 6.1: Immediate Dependency Relation as Ordered Pairs
Figure 6.2: Examples of Domains
Figure 7.1: PAMEX DT
Figure 7.2: Example ES
Figure 8.1: Completeness of Investment Subsystem
Figure 8.2: Consistency of I Subsystem
Figure 8.3: Example Specification for KB1
Figure 8.4: Symbolic Evaluation
Figure 8.5: Symbolic Inference Engine

LIST OF TABLES

Table 1.1: Intended Audiences for the Handbook
Table 2.1: Validation Methods
Table 2.2: Verification Methods
Table 2.3: V&V Software
Table 4.1: Level of Effort for the Correctness Verification Stage
Table 6.1: Immediate Dependency Relation for KB1
Table 6.2: Matrix Product of the DR by Itself
Table 6.3: Immediate DR of KB1
Table 6.4: Variable Clusters of the DR of KB1
Table 6.5: How Variables Influence Rules
Table 6.6: How Rules Influence Variables
Table 6.7: Immediate Dependency Matrix for KB1
Table 6.8: Hoffman Regions for KB1
Table 9.1: Confidence Level
Table 9.2: Confidence Level with One Expert Disagreeing