Programming "By Contract"

#### Writing robust code

There are different classes of problems in a running program:

1 system failure (cpu dies, memory error, disk full, network error, site down, ...)
1 invalid data (user-derived data is faulty, out of range, bad filename)
1 programming error; results in unknown or invalid state.

Programming languages can automatically help:

o static typing
o GC; no stale pointers, no leaks, don't have to write code, ...
o automatic condition checking such as array overflow

To aid/encourage robustness, programmers can manually add

o assertions
o exceptions
o programming "by contract"

In this lecture, I am concerned with point 3.

#### Most common programming errors

o cut-n-paste errors
o software reuse errors; essentially you may not assume the same thing as the component designers; e.g., \
  @(http://archive.eiffel.com/doc/manuals/technology/contract/ariane/page.html, Ariane rocket example) int conversion caused uncaught exception; ironically it occurred in unneeded code (after launch); cost $500M. \
  Another example: metric vs English units caused Mars probe to crash.
o didn't think of or check all permutations, possibilities
o off-by-one errors; e.g., buffer overflow, integer overflow and related fixed-width numeric errors (e.g., 1991 Gulf War, Patriot missile system clock truncation error), ...
o writing rigid code. Not tolerant of perturbations in data or new data, cannot adapt to changes made in the future elsewhere in your code. Example: Mac OS X keyboard preferences (I forgot a '>' and my keyboard no longer worked; had to remotely log in and fix). Hardcoded file pathnames, case sensitivity issues, etc... \
  Polymorphism helps a lot in terms of program longevity. Using properties and config files helps avoid hardcoded constants in your code, making it portable.
o lack of error checking or graceful error recovery. E.g., does your entire server crash if a web page encounters a database or file system error?

Why bother to list these common errors?
Because bugs are usually introduced for one of these reasons, checking for these common faults first dramatically improves your chances of success and your debugging speed. Often you can solve a problem with just logic, without even sitting at the computer!

#### Programming "By Contract"

Programming "by contract" is essentially a means of letting programmers verify that executing their methods does not corrupt the state of their data structures and so on. It is about _what_ your code does, not _how_ it does it, some like to say.

Note that verifying execution is not the same as *guaranteeing* valid execution. These so-called _contracts_ are just code like your program and can therefore also be faulty; they may also fail to provide sufficient coverage due to negligence, lack of experience, or efficiency concerns.

Quoting from http://www.cs.unc.edu/~stotts/COMP204/contract.html

"A paradigm which was first introduced by Bertrand Meyer, the creator of Eiffel. Although Eiffel has support for programming by contract built into the language, most of the concepts can be used in any language."

"Basically programming by contract creates a contract between the software developer and software user - in Meyer's terms the supplier and the consumer. Every feature, or method, starts with a precondition that must be satisfied by the consumer of the routine. And each feature ends with postconditions which the supplier guarantees to be true (if and only if the preconditions were met). Also, each class has an invariant which must be satisfied after any changes to the object represented by the class. In the other words, the invariant guarantees the object is in a valid state."

A decent place to start reading: http://archive.eiffel.com/doc/manuals/technology/contract/page.html

### Terminology

"By Contract" asks:

1 when is it appropriate to call a method? (_precondition_)
1 what is a method supposed to have done? (_postcondition_)
1 did a method execution screw up the state of an object?
(_class invariant_)

In Meyer's (Eiffel guy) terminology, if class A uses class B, A is a _client_ of _supplier_ B. Then we have a contract between A and B:

%box << _supplier B guarantees postcondition of B's methods assuming client A guarantees the preconditions of those methods_ >>

Postconditions are a function of old/new system state and the arguments/return value.

Sometimes a precondition implies an order of method calls: you must push before you can pop. Hard to implement! Hard to check old values too ;)

Pre/post/class-invariants should be side-effect free.

You can also call these conditions _assertions_, though an {assert} usually has a much more limited meaning (simply throw an exception when a boolean expression fails).

### Failure

*Asserts and "by contract" specifications catch programmer errors, not run-time errors!* They catch invalid or unknown state problems caused by programming errors. Do not attempt to recover from assert failures. You are "out of known space."

Don't use asserts for runtime problems such as:

o "can't find file"
o invalid web site page arg
o input arg "rate" too fast

In these cases, you should throw {IllegalArgumentException} or whatever. The system can continue to operate in this case.

Asserts/conditions can be taken out after development, but you always need input checks.

What does it mean when a contract fails?

1 If preconditions are met, the supplier guarantees the postconditions.
1 Failure of a precondition implies a problem with the client.
1 If a postcondition fails, it's a bug in the supplier.

### Benefits

"By contract" is good because it forces you to think about the specification and thus the method will be more likely to actually do that ;)

Better than asserts: [from Reto Kramer]

o asserts aren't a contract
o clients can't see asserts as part of interface
o not explicit whether they are pre/post/invariants.
o no inheritance/interface support
o no automatic documentation

I'll add that adding any kind of check tends to reduce the propagation of errors.
I've spent days tracking down bugs that were extremely difficult to reproduce because a small error became a large one in a remote section of my code.

### "By Contract" Defined by Eiffel

While asserts have been around since the 70's, Eiffel really defined a formalism for checking a programmer's work. Naturally, people have been building IF checks since the dawn of computing.

Here is a simple {put} method for a dictionary in Eiffel (from eiffel.com):

<<
put (x: ELEMENT; key: STRING) is
        -- Insert x so that it will be retrievable through key.
    require
        count <= capacity
        not key.empty
    do
        ... Some insertion algorithm ...
    ensure
        has (x)
        item (key) = x
        count = old count + 1
    end
>>

Notice how this is checking for programming errors; i.e., that you in fact did put something into the dictionary.

### What happens with inheritance?

For subclasses, Eiffel subcontracting says that "a redefined version of [a method] may keep or weaken the precondition; it may keep or strengthen the postcondition"

### An example in "Java"

The usual stack example:

<<
class Stack { // supplier
    Object[] data = null;
    int top = -1;
    int n = 0;

    public Stack(int n)
        // require n>0;
    {
        this.n = n;
        data = new Object[n];
    }

    public void push(Object o)
        // require top<(n-1);
        // ensure data[top]==o;
        // ensure top == old top + 1;
    {
        top++;
        data[top] = o;
    }
}
>>

<<
class Main { // client
    public static void main(String[] args) {
        Stack s = new Stack(10);
        s.push("Apple");
    }
}
>>

Or, a more complicated example:

<<
class FixedBitSet {
    int n = 0;       // num members
    int nbits = ...; // multiple of 32-bit word size
    public void add(int x)
        // require !member(x);
        // require x ...
>>

That "old value" concept could be expensive!

Sounds like a good idea, but it is not clear where you should put the code that checks preconditions: client, supplier, both, neither?

You can make a debugging wrapper.
For example,

<<
public class TestingAccount extends Account {
    public void debit(float value) {
        if ( !debit_precondition(value) ) {
            // precondition violated: generate exception or log error
        }
        else {
            super.debit(value);
        }
    }
}
>>

In most systems, asserts can be turned off. Java requires you to do all this wrapper stuff manually.

### Class Invariants

A class invariant is a condition that a class must always obey before and after a method call; for example, in a linked list, the {head} should always point at {null} or a valid node. It could still point at another list by mistake, though.

<<
class LinkedList {
    protected LinkedListWrapper head = null;
    private void classInvariants() {
        // search through list to ensure head points at a node
        // in this list
    }
    ...
}
>>

In a language without invariants, you must put a call to {classInvariants()} at the end of every method.

### Log4j

http://jakarta.apache.org/log4j/docs/index.html

@(http://www.jguru.com/faq/view.jsp?EID=1007092,"What is Log4j?"):

"Log4j is an open source tool developed for putting log statements into your application. It was developed by the good people at Apache's Jakarta Project. It's speed and flexibility allows log statements to remain in shipped code while giving the user the ability to enable logging at runtime without modifying any of the application binary. All of this while not incurring a high performance cost."

### iContract

(uses ANTLR): Java iContract:

http://www.javaworld.com/javaworld/jw-02-2001/jw-0216-cooltools.html
http://www.reliable-systems.com/tools/iContract/iContract.htm

It's a preprocessor that grabs {@post}, etc. from the comments and lets you include or omit the assertions.

From the Javaworld article by Oliver Enseling.

<<
/**
 * @pre f >= 0.0
 * @post Math.abs((return * return) - f) < 0.001
 *
 * want to calculate the square root of f within a specific
 * margin of error(+/- 0.001).
 */
public float sqrt(float f) { ... }
>>

To check an old value, use the {@pre} suffix:

<<
/**
 * Append an element to a collection.
 *
 * @post c.size() = c@pre.size() + 1
 * @post c.contains(o)
 */
public void append(Collection c, Object o) { ... }
>>

<<
/**
 * A PositiveInteger is an Integer that is guaranteed to be positive.
 *
 * @invariant intValue() > 0
 */
class PositiveInteger extends Integer { ... }
>>

iContract is based upon OCL (the Object Constraint Language from the OMG). From it you can use {forall} and {exists}.

<<
/*
 * @invariant forall IEmployee e in getEmployees() |
 *     getRooms().contains(e.getOffice())
 *
 * specifies that every employee returned by getEmployees() has an
 * office in the collection of rooms returned by getRooms().
 */
>>

<<
/**
 * @post exists IRoom r in getRooms() | r.isAvailable()
 *
 * Make sure {getRooms()} contains at least one available room.
 * {exists} precedes the Java collection type.
 */
>>

## Implications

iContract has an {implies} operator (A=>B).

<<
/**
 * @invariant getRooms().isEmpty() implies getEmployees().isEmpty()
 *
 * no rooms => no employees
 */
>>

## Stack Example

Note how, in this example, the interface describes not only method names, but also the code to validate their behavior. Also, because the code is so simple, the conditions actually dictate the functionality! I once saw Tim Teitelbaum http://www.cs.cornell.edu/Info/People/tt/Tim_Teitelbaum.html type in these conditions and his translator generated the code to implement a stack!

<<
/**
 * @invariant !isEmpty() implies top() != null // no null objects are allowed
 */
public interface Stack {
    /**
     * @pre o != null
     * @post !isEmpty()
     * @post top() == o
     */
    void push(Object o);

    /**
     * @pre !isEmpty()
     * @post @return == top()@pre
     */
    Object pop();

    /**
     * @pre !isEmpty()
     */
    Object top();

    boolean isEmpty();
}
>>

Implementations don't have the conditions; the interface "pushes" them to the implementation.

### JDK 1.4 adds a simple assertion facility to Java

It just adds an {assert} keyword that tests a boolean expression and an {AssertionError} class for assertion failures.
1 {assert expr1;}
1 {assert expr1 : string-to-pass-to-assertion-error;}

Example: {assert x>0 && p!=null;}

Since it's a new keyword, compile with:

<<
javac -source 1.4 Foo.java
>>

As far as I can tell, this adds nothing beyond the old #ifdef version of asserts in C/C++. But you can enable assertions for a single class or package; they are disabled by default at run time and are switched on with the {-ea} option of {java}. Also, you don't have to put an IF around the test to get it to disappear (Java doesn't have a preprocessor like C/C++).

From javaworld http://www.javaworld.com/javaworld/jw-11-2001/jw-1109-assert-p2.html By: Wm. Paul Rogers

<<
public class Bar {
    public void m1( int value ) {
        assert 0 <= value : "Value must be non-negative: value= "+value;
        System.out.println( "OK" );
    }

    public static void main( String[] args ) {
        Bar bar = new Bar();
        System.out.print( "bar.m1( 1 ): " );
        bar.m1( 1 );
        System.out.print( "bar.m1( -1 ): " );
        bar.m1( -1 );
    }
}
>>

<<
bar.m1( 1 ): OK
bar.m1( -1 ): Exception in thread "main" java.lang.AssertionError: Value must be non-negative: value= -1
    at Bar.m1(Bar.java:6)
    at Bar.main(Bar.java:17)
>>

More info: http://java.sun.com/j2se/1.4/docs/guide/lang/assert.html

Why no major language additions for pre/post? They couldn't convince themselves that it was ok to massively modify the language given all the IDEs and other tools already out there.

#### Is Everybody Convinced?

Not completely. For example, at Karlstad University they tried to measure what was to be gained by using "by contract" programming:

@(http://csdl.computer.org/comp/proceedings/ecbs/2002/1549/00/15490118abs.htm)

They couldn't find significant benefits in their experiment. This was, of course, a student project, not the "real world".
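
To tie the earlier stack contract to the JDK 1.4 {assert} facility, here is a minimal sketch of how the pre/post/invariant comments might be turned into running checks by hand. It is not taken from any of the cited sources: the class name {ContractStack} and all of its members are hypothetical. It also assumes one particular convention, namely that preconditions on public methods are checked with ordinary exceptions (so they stay on in production), while postconditions and the class invariant use {assert} (so they can be switched off).

```java
// Hypothetical illustration only: ContractStack is not from the lecture's sources.
final class ContractStack {
    private final Object[] data;
    private int topIndex = -1; // index of the top element; -1 means empty

    ContractStack(int capacity) {
        // Precondition (require capacity > 0): always checked, even with asserts off.
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0: " + capacity);
        }
        data = new Object[capacity];
        assert invariant() : "invariant broken after construction";
    }

    boolean isEmpty() { return topIndex < 0; }

    boolean isFull() { return topIndex == data.length - 1; }

    void push(Object o) {
        // Preconditions: failure here means the client is at fault.
        if (o == null) throw new IllegalArgumentException("null elements are not allowed");
        if (isFull()) throw new IllegalStateException("stack is full");

        int oldTop = topIndex; // capture the "old" value needed by the postcondition
        data[++topIndex] = o;

        // Postconditions: failure here means a bug in the supplier.
        assert topIndex == oldTop + 1 : "top must advance by exactly one";
        assert data[topIndex] == o : "pushed element must be on top";
        assert invariant() : "invariant broken by push";
    }

    Object pop() {
        if (isEmpty()) throw new IllegalStateException("stack is empty");

        int oldTop = topIndex;
        Object result = data[topIndex];
        data[topIndex--] = null; // drop the reference so it can be collected

        assert topIndex == oldTop - 1 : "top must retreat by exactly one";
        assert invariant() : "invariant broken by pop";
        return result;
    }

    Object top() {
        if (isEmpty()) throw new IllegalStateException("stack is empty");
        return data[topIndex];
    }

    // Class invariant: the index is in range and no stored element is null.
    // Side-effect free, as the notes recommend for all contract conditions.
    private boolean invariant() {
        if (topIndex < -1 || topIndex >= data.length) return false;
        for (int i = 0; i <= topIndex; i++) {
            if (data[i] == null) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ContractStack s = new ContractStack(2);
        s.push("Apple");
        s.push("Pear");
        System.out.println(s.pop()); // prints Pear
        try {
            s.pop();
            s.pop(); // client breaks pop's precondition: the stack is empty here
        } catch (IllegalStateException e) {
            System.out.println("client error: " + e.getMessage());
        }
    }
}
```

Run it with {java -ea ContractStack} to have the postcondition and invariant checks actually execute; without {-ea} only the precondition checks remain. Note the cost of the "old value" idea even in this tiny case: each method has to copy the old state by hand before changing it.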