Keywords and phrases: Algebraic semantics, axiomatic semantics, denotational semantics, operational semantics, semantic algebra, semantic axiom, semantic domain, semantic equation, semantic function, loop variant, loop invariant, valuation function, sort, signature, many-sorted algebra
Semantics is concerned with the interpretation or understanding of programs and how to predict the outcome of program execution. The semantics of a programming language describe the relation between the syntax and the model of computation. Semantics can be thought of as a function which maps syntactical constructs to the computational model.
This approach is called syntax-directed semantics.
There are several widely used techniques ( algebraic, axiomatic, denotational, operational, and translation) for the description of the semantics of programming languages.
Figure N.1 contains an example of an algebraic definition. It is an algebraic definition of a fragment of Peano arithmetic.
Domains:Bool = {true, false} (Boolean values)
N in Nat (the natural numbers)
N ::= 0 | S(N)Functions:
= : (Nat, Nat) -> Bool
+ : (Nat, Nat) -> Nat
× : (Nat, Nat) -> NatAxioms and equations:
not S(N) = 0
if S(M) = S(N) then M = N
( n + 0 ) = n
( m + S(n) ) = S( m + n )
( n × 0 ) = 0
( m × S(n)) = (( m × n) + m)where m,n in Nat
The equations define equivalences between syntactic elements; they specify the transformations that are used to translate from one syntactic form to another. The domain is often called a sort and the domain and the function sections constitute the signature of the algebra. Functions with zero, one, and two operands are referred to as nullary, unary, and binary operations. Because there is more than one domain, the algebra is called a many sorted algebra. As in this example, abstract data types may require values from several different sorts. The signature of the algebra is a set of sorts and a set of functions taking arguments and returning values of different sorts. A stack of natural numbers may be modeled as a many-sorted algebra with three sorts (natural numbers, stacks and booleans) and four operations (newStack, push, pop, top, empty). Figure N.2 contains an algebraic definition of a stack.
Figure N.2: Algebraic definition of an Integer Stack ADT Domains: Nat (the natural numbers
Stack ( of natural numbers)
Bool (boolean values)Functions:
newStack: () -> Stack
push : (Nat, Stack) -> Stack
pop: Stack -> Stack
top: Stack -> Nat
empty : Stack -> BoolAxioms: or
pop(push(N,S)) = S
top(push(N,S)) = N
empty(push(N,S)) = false
empty(newStack()) = trueErrors:
pop(newStack())
top(newStack()) where N in Nat and S in Stack.
Defining Equations:
newStack() = []
push(N,S) = [N|S]
pop([N|S]) = S
top([N|S]) = N
In Figure N.1, the structure of the numbers is described. In Figure N.2 the structure of a stack is not defined. This means that we cannot use equations to describe syntactic transformations. Instead, we use axioms that describe the relationships between the operations. The axioms are more abstract than equations because the results of the operations are not described. To be more specific would require decisions to be made concerning the implementation of the stack data structure. Decisions which would tend to obscure the algebraic properties of stacks. The axioms impose constraints on the stack operations that are sound in the sense that they are consistent with the actual behavior of stacks reguardless of the implementation. Finding axioms that are complete, in the sense that they completely specify the behavior of the operations of an ADT, is more difficult.
The goal of algebraic semantics is to capture the semantics of behavior by a set of axioms with purely syntactic properties. Algebraic definitions (semantic algebras) are the favored method for defining the properties of abstract data types.
where c is a command or control structure in the programming language, P and Q are assertions or statements concerning the properties of program objects (often program variables) which may be true or false. P is called a pre-condition and Q is called a post-condition. The pre- and post-conditions are formulas in some arbitrary logic and summarize the progress of the computation.
The meaning of
is that if c is executed in a state in which assertion P is satisfied and c terminates, then c terminates in a state in which assertion Q is satisfied. We illustrate axiomatic semantics with a program to compute the sum of the elements of an array (see Figure N.3).
Figure N.3: Program to compute S = sumi=1nA[i] S,I := 0,0 while I < n do S,I := S+A[I+1],I+1 end
The assignment statements are simultaneous assignment statements. The expressions on the righthand side are evaluated simultaneously and assigned to the variables on the lefthand side in the order they appear.
Figure N.4 illustrates the use of axiomatic semantics to verify the program of Figure N.3.
Figure N.4: Verification of S = sumi=1nA[i]
Pre/Post-conditions Code 1. { 0 = Sumi=10A[i], 0 < |A| = n } 2. S,I := 0,0 3. {S = Sumi=1IA[i], I <= n } 4. while I < n do 5. {S = Sumi=1IA[i], I < n } 6. {S+A[I+1] = Sumi=1I+1A[i], I+1 <= n } 7. S,I := S+A[I+1],I+1 8. { S = Sumi=1IA[i], I <= n } 9. end 10. {S = Sumi=1IA[i], I <= n, I >= n } 11. {S = Sumi=1nA[i] }
The program sums the values stored in an array and the program is decorated with the assertions which help to verify the correctness of the code. The pre-condition in line 1 and the post-condition in line 11 are the pre- and post-conditions respectively for the program. The pre-condition asserts that the array contains at least one element zero and that the sum of the first zero elements of an array is zero. The post-condition asserts that S is sum of the values stored in the array. After the first assignment we know that the partial sum is the sum of the first I elements of the array and that I is less than or equal to the number of elements in the array.
The only way into the body of the while command is if the number of elements summed is less than the number of elements in the array. When this is the case, The sum of the first I+1 elements of the array is equal to the sum of the first I elements plus the I+1st element and I+1 is less than or equal to n. After the assignment in the body of the loop, the loop entry assertion holds once more. Upon termination of the loop, the loop index is equal to n. To show that the program is correct, we must show that the assertions satisfy some verification scheme. To verify the assignment commands, we use the Assignment Axiom:
This axiom asserts that:
If after the execution of the assignment command the environment satisfies the condition P, then the environment prior to the execution of the assignment command also satisfies the condition P but with E substituted for x (In this and the following axioms we assume that the evaluation of expressions does not produce side effects.).An examination of the respective pre- and post-conditions for the asssignment statements shows that the axiom is satisfied.
To verify the while command of lines 4. 7 and 9, we use the Loop Axiom:
The assertion above the bar is the condition that must be met before the axiom (below the bar) can hold. In this rule, {I} is called the loop invariant. This axiom asserts that:
To verify a loop, there must be a loop invariant I which is part of both the pre- and post-conditions of the body of the loop and the conditional expression of the loop must be true to execute the body of the loop and false upon exit from the loop.The invariant for the loop is: S = sumi=1IA[i], I <= n. Lines 6, 7, and 8 satisfy the condition for the application of the Loop Axiom. To prove termination requires the existence of a loop variant. The loop variant is an expression whose value is a natural number and whose value is decreased on each iteration of the loop. The loop variant provides an upper bound on the number of iterations of the loop.
A variant for a loop is a natural number valued expression V whose run-time values satisfy the following two conditions:The loop variant for this example is the expression n - I. That it is non-negative is guaranteed by the loop continuation condition and its value is decreased by one in the assignment command found on line 7. More general loop variants may be used; loop variants may be expressions in any well-founded set (every decreasing sequence is finite). However, there is no loss in generality in requiring the variant expression to be an integer. Recursion is handled much like loops in that there must be an invariant and a variant. The correctness requirement for loops is stated in the following:
- The value of V greater than zero prior to each execution of the body of the loop.
- The execution of the body of the loop decreases the value of V by at least one.
Loop Correctness Principle: Each loop must have both an invariant and a variant.Lines 5 and 6 and lines 10 and 11 are justified by the Rule of Consequence.
The justification for the composition the assignment command in line 2 and the while command requires the following the Sequential Composition Axiom.
This axiom is read as follows:
The sequential composition of two commands is permitted when the post-condition of the first command is the pre-condition of the second command.The following rules are required to complete the deductive system.
The axiomatic method is the most abstract of the semantic methods and yet, from the programmer's point of view, the most practical method. It is most abstract in that it does not try to determine the meaning of a program, but only what may be proved about the program. This makes it the most practical since the programmer is concerned with things like, whether the program will terminate and what kind of values will be computed.
Axiomatics semantics are appropiate for program verification and program derivation.
Loops and recursion are two constructs that require invention on the part of the programmer. The loop correctness principle requires the programmer to come up with both a variant and an invariant. Recursion is a generalization of loops so proofs of correctness for recursive programs also require a loop variant and a loop invariant. In the summation example, a loop variant is readily appearent from an examination of the post-condition. Simply replace the summation upper limit, which is a constant, with a variable. Initializing the sum and index to zero establishes the invariant. Once the invariant is established, either the index is equal to the upper limit in which case there sum has been computed or the next value must be added to the sum and the index incremented reestablishing the loop invariant. The position of the loop invariants define a loop body and the second occurrence suggests a recursive call. A recursive version of the summation program is given in Figure N.5.
S,I := 0,0 loop: if I < n then S,I := S+A[I+1],I+1; loop else skip fi
The advantage of using recursion is that the loop variant and invariant may be developed separately. First develop the invariant then the variant.
The summation program is developed from the post-condition by replacing a constant by a variable. The initialization assigns some trivial value to the variable to establish the invariant and each iteration of the loop moves the variable's value closer to the constant.
A program to perform integer division by repeated subtraction can be developed from the post-condition { 0 <= r < d, (a = q× d + r) } by deleting a conjunct. In this case the invariant is { 0 <= r, (a = q× d + r) } and is established by setting the the quotient to zero and the remainder to a.
Another technique is called for in the construction of programs with multiple loops. For example, the post condition of a sorting program might be specified as:
or the post condition of an array search routine might be specifies as:
To develop an invariant in these cases requires that the assertion be strengthened by adding additional constraints. The additional constraints make assertions about different parts of the array.
Figure N.6: Denotational definition of Peano Arithmetic Abstract Syntax: N in Nat (the Natural Numbers)
N ::= 0 | S(N) | (N + N) | (N × N)Semantic Algebra:
Nat (the natural numbers (0, 1, ...)
+ : Nat -> Nat -> NatValuation Function:
D : Nat -> Nat
D[( n + 0 )] = D[n]
D[( m + S(n) )] = D[(m+n)] + 1
D[( n × 0 )] = 0
D[( m × S(n))] = D[ (( m × n) + m) ]where m,n in Nat
It is is a denotational definition of a fragment of Peano arithmetic. Notice the subtle distinction between the syntactic and semantic domains. The syntactic expressions are mapped into an algebra of the natural numbers by the valuation function. The denotational definition almost seems to be unnecessary. Since the syntax so closely resembles that of the semantic algebra.
Programming languages are not as close to their computational model. Figure N.7 is a denotational definition of the small imperative programming language Simple encountered in the previous chapter.
Abstract Syntax: C in Command E in Expression O in Operator N in Numeral V in Variable C ::= V := E | if E then C1 else C2 end | while E do C3 end | C1;C2 | skip E ::= V | N | E1 O E2 | (E) O ::= + | - | * | / | = | < | > | <> Semantic Algebra: Domains: tau in T = {true, false}; the boolean values zeta in Z = {...-1,0,1,...}; the integers + : Z -> Z -> Z ... = : Z -> Z -> T ... sigma in S = Variable -> Numeral; the state Valuation Functions: C in C -> (S -> S) E in E -> E -> (N union T) C[ skip ] sigma = sigma C[ V := E ] sigma = sigma [ V:E[ E ] sigma C[ C1; C2 ] = C[ C2 ] · C[ C1] C[ if E then C1 else C2 end ] sigma = C[ C1 ] sigma if E[ E ]sigma = true = C[ C2 ] sigma if E[ E ]sigma = false C[ while E do C end}]sigma = limn -> infty C[ (if E then C else skip end)n ] sigma E[ V ] sigma = sigma(V) E[ N ] = zeta E[ E1+E2 ] = E[ E ] sigma + E[ E ] sigma ... E[ E1=E2 ] sigma = E[ E ] sigma = E[ E ] sigma |
Denotational definitions are favored for theoretical and comparative programming language studies. Denotational definitions have been used for the automatic construction of compilers for the programming language. Denotations other than mathematical objects are possible. For example, a compiler writer would prefer that the object denoted would be appropriate object code. Systems have been developed for the automatic construction of compilers from the denotation specification of a programming language.
Figure N.8: Operational semantics for Peano arithmetic Abstract Syntax: N in Nat (the natural numbers) N ::= 0 | S(N) | (N + N) | (N × N) Interpreter: I: N -> N I[ ( n + 0 ) ] ==> n I[ ( m + S(n) ) ] ==> S( I[ (m+n ) ] ) I[ ( n × 0 ) ] ==> 0 I[ ( m × S(n)) ] ==> I[ (( m × n) + m) ] where m,n in Nat
It is is an operational definition of a fragment of Peano arithmetic.
The interpreter is used to rewrite natural number expressions to a standard form (a form involving only S and 0 ) and the rewriting rules show how move the + and × operators inward toward the base cases. Operational definitions are favored by language implementors for the construction of compilers and by language tutorials because operational definitions describe how the actions take place.
The operational semantics of Simp is found in Figure N.9.
Figure N.9: Operational semantics for Simple Interpreter: I: C × Sigma -> Sigma {nu} in E × Sigma} -> T union Z Semantic Equations: I(skip,sigma) = sigma I(V := E,sigma) = sigma[V:nu(E,sigma)] I(C1 ;C2,sigma) = E(C2,E(C1,sigma)) I(if E then C1 else C2 end,sigma) = I(C1,sigma)&if nu(E,sigma) = true} I(C2,sigma)&if nu(E,sigma) = false} while E do C end = if E then (C;while E do C end) else skip nu(V,sigma) = sigma(V) nu(N,sigma) = N nu(E1+E2,sigma) = nu(E1,sigma) + nu(E2,sigma) ... nu(E1=E2,sigma) = true if nu(E,sigma) = nu(E,sigma)} false if nu(E,sigma) != nu(E,sigma)} otherwise ...
The operational semantics are defined by using two semantic functions, I which interprets commands and nu which evaluates expressions. The interpreter is more complex since there is an environment associated with the program with does not appear as a syntactic element and the environment is the result of the computation. The environment (variously called the store or referencing environment}) is an association between variables and the values to which they are assigned. Initially the environment is empty since no variable has been assigned to a value. During program execution each assignment updates the environment. The interpreter has an auxiliary function which is used to evaluate expressions. The while command is given a recursive definition but may be defined using the interpreter instead. Operational semantics are particularly useful in constructing an implementation of a programming language.
IF guard --> command FI | = | if guard then command |
LOOP guard --> command POOL | = | while guard do command |
int f (int n) { if n > 1 then n*f(n-1) else 1 } int f (int n) { int t = 1; while n > 1 do { t := t*n; n := n-1} }