                            USING IMP
                     
                     An informal introduction
                            Issue 1.1
                     
                        Peter S. Robertson
                     Lattice Logic Ltd. 1986

This document is intended as an informal introduction to the IMP
language for people with a general understanding  of  the  ideas
and  concepts of programming.   It introduces sample programs or
program fragments and discusses various features of the language
which have been used.   Also included are comments on why things
are  done the way they are and the benefits and disadvantages of
the choices which  were  made  during  the  development  of  the
language.   For  a  more  formal and detailed description of the
language please refer to "The IMP Language".


Throughout the text the IMP language will be referred to  simply
as IMP.


Part of the philosophy of IMP is to provide convenience forms of
the standard constructions which, if used with care, can greatly
improve  the  readability of programs.   These convenience forms
need never be used as the standard forms will always work.


First of all, here is a trivial program which prints out:

Hello there.


      %begin                          {simple first program}
         Printstring("Hello there.")
         newline
      %end
      %end %of %file

It would be a good idea at this stage simply to get this program
into your machine, exactly as  written  above,  and  attempt  to
compile  and  run  it.   This will give you some feeling for how
easy (or more likely  how  difficult)  it  is  to  generate  the
necessary  incantations  to  get  the  operating  system  to  do
anything.
If the compiler produces error messages then you  have  probably
mistyped  the  program.   If  the program is exactly the same as
given here then  the  fault  lies  outwith  the  scope  of  this
document.


Even  though  this  program is so simple, it illustrates many of
the features which give IMP its 'flavour'.


In the example above the text between braces (the curly brackets
{ and }) is considered to be a comment and along with the braces
will be ignored by  the  compiler.   If  the  closing  brace  is
omitted the compiler will assume that there should have been one
immediately before the end of the line, so if you have a comment
which  extends  over several lines, each line must start with an
opening brace.
While it is considered  good  programming  practice  to  include
apposite  comments, their use will be minimised in this document
as they would probably distract the reader,  especially  as  the
programs  will  be  described  in the text for it is the program
statements themselves and  not  the  algorithms  they  implement
which are of interest.
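
As an illustration of the rule about missing closing braces, the
first program could carry a two-line comment along the following
lines (the wording of the comment is, of course, arbitrary):

      %begin
         {This comment is too long for a single line, so
         {the second line needs an opening brace of its own
         Printstring("Hello there.")
         newline
      %end
      %end %of %file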



Next,  some  'words'  start  with a % character and some do not.
The reason for this is to divide the 'words' of a  program  into
two  totally  distinct  categories:  KEYWORDS  and  IDENTIFIERS.
KEYWORDS start with a % and  'belong'  to  the  compiler,  while
IDENTIFIERS   do  not  start  with  a  %  and  'belong'  to  the
programmer.   When writing in IMP,  if  an  identifier  is  more
easily understood with spaces in it, put them in.
For example, our original program could be rewritten:
      %begin
          print string("Hello there.")
          new line
      %end
      %end %of %file

The  %  character in keywords is often thought of as underlining
the keyword, because when writing programs on paper it  is  much
faster  to  underline  than to write a percent sign.   The exact
definition of the  effect  of  percent  is  that  it  underlines
everything following it stopping at the first character which is
not  a  letter.   Hence  %end %of %file  could  equally  well be
written %endoffile.


There are absolutely no  restrictions  on  what  characters  can
appear  (or  not appear if they are invisible) inside quotes; if
your editor or operating system will let you put  characters  in
the  program  the  compiler will accept them.   The only problem
might be the effect of these characters  on  output  devices  if
listings  are generated from the compiler.   In particular there
is no need for the obscure 'escape sequences' of some  languages
to include spaces, newlines, tabs or whatever into a program.
Some programmers may prefer to rewrite the program as:


      %begin
          printstring("Hello there.
      ")
      %end
      %endoffile

The  disadvantage  of allowing newlines inside quotes is that if
the closing quote is omitted the compiler will  remain  in  text
mode  and  suck  in the rest of the program, eventually giving a
fault such as  'String too long' or 'Input ended'.
In practice this is never much of a problem, especially  as  the
compiler  will  mark  line  numbers  in the listing file to show
whenever it is still in text mode at the start of a line:

      1   %begin
      2      printstring("Hello there.
      3"   ")
      4   %end
      5   %endoffile





IMP requires that statements are terminated by a  newline  or  a
semicolon; newlines are not ignored.   The only time a semicolon
is needed is if you want two or  more  statements  on  the  same
line:

      %begin
         printstring("Hello there.");  newline
      %end
      %end %of %file

If  you  want  to break a statement over several lines each line
break must be preceded by a hyphen (which is otherwise ignored),
or must come after a comma or the keywords %AND or %OR.

      %begin
         print string   -
         ("Hello there.")
         newline
      %end
      %end -
      %offile

The question of where statements start and finish is one of  the
more  obscure  parts  of  IMP and does cause some difficulty for
beginners, but this is learned as one becomes more familiar with
the language.

Now for a program which is a little more adventurous:

      %begin                   {program to add two numbers}
          %integer First, Second, Sum
          read(first)          {input first number}
          read(second)         {input second number}
          sum = first+second
          write(sum, 0)        {output the sum}
          newline
      %end
      %endoffile

This program is made up of one block (the bit between %begin and
%end).
Within the block is  a  declaration  (%integer)  and  imperative
statements (read ..., sum = ..., etc.).   Within each block in a
program declarations must come before imperative statements.

When a block is executed the declarations cause various  objects
(integers,  reals  etc.)  to be created and given identifiers by
means of which they can be referenced.   When the block is  left
any  such  objects are destroyed and the identifiers lose  their
meaning.

The program above creates three objects which can  hold  integer
(whole  number)  values  and calls them: first, second, and sum.
Such objects are commonly called variables.  These variables may
be given values and subsequently the values they have been given
may be retrieved for further use.   Note that if an  attempt  is
made  to  take the value of a variable before any value has been
given to  it  the  program  will  signal  an  error  (unassigned
variable).   This  check  that  variables have been given values
catches one of the most frequent programming bugs (at  least  in
the  experience  of  the  author),  but sadly very few languages
bother with it.   Some languages give  every  new  variable  the
value  zero  (or  its  equivalent).   This  often  lets programs
stagger on and fail long after the point at which  a  particular
variable  should  have  been  initialised, or worse, the program
just produces a credible but  wrong  answer.   The  more  common
approach  is  for new variables to be left with whatever rubbish
is lying about in the memory, giving rise to programs which  run
sometimes and fail at other times depending on what was happening
previously.
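
As a sketch of the kind of mistake this check catches, the
following program (the identifier Total is just an example)
should be stopped with an unassigned variable error rather than
printing rubbish:

      %begin
         %integer Total
         write(Total, 0)      {Total has not been given a value}
         newline
      %end
      %endoffile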

The statement:

      Sum = First + Second

is an assignment which computes the value of the  expression  on
the right hand side of the equals sign and assigns that value to
the  variable  on  the  left hand side of the equals sign.   The
expression can be as simple or as complex as  you  like  but  it
must give a result which is the same type (sort of thing) as the
final destination.  For example as First and Second are integers
their  sum  must be an integer, and hence may be assigned to the
integer variable Sum.
However, the division operator (/) always gives an answer  which
is real (fractional) and hence the assignment:

      Sum = First / Second

would be faulted by the compiler as Sum cannot hold real values.
If  an  integer result is required as the result of dividing two
integer values, the integer division operator (//) must be used.
This performs the division and then discards any remainder.
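
For instance, a variant of the program which divides rather than
adds might look like this (Quotient is an illustrative name, and
the sketch assumes that the second number is not zero):

      %begin
         %integer First, Second, Quotient
         read(first)
         read(second)
         Quotient = First // Second     {remainder is discarded}
         write(Quotient, 0)
         newline
      %end
      %endoffile

Given the numbers 7 and 2 this should output 3.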

The definition of IMP gives a number of operators which  may  be
used  to form expressions  (add, subtract, multiply,  etc.)  and
also  defines  how  the  expressions  are to be evaluated, hence
A+B*C means  'multiply B by C and then add in A' but not  'add A
to  B  and  multiply  the  answer  by C'.   It is to be strongly
recommended that wherever there might be the slightest confusion
don't be clever; use brackets to make the meaning obvious,  i.e.
write  A+(B*C)  or  (A+B)*C.   A very common mistake is to write
A/2*B intending to get A/(2*B) but in fact getting  (A/2)*B.

READ, WRITE, and NEWLINE are examples of routine calls, and  the
things in brackets after READ and WRITE are parameters.  NEWLINE
has  no  parameters and so is not followed by any brackets (some
languages would insist  that  the  call  be  followed  by  empty
brackets:   NEWLINE(),   perhaps   consistent   but   definitely
irritating).




Note  the  difference  in  the  parameters  in  READ(FIRST)  and
WRITE(SUM, 0).   Because  READ  inputs  a value from outside the
program and assigns it to the variable  given  as  a  parameter,
FIRST in this case, it is not the value in FIRST which is passed
to the routine but the object FIRST itself.  Such a parameter is
said  to be passed by reference.   Conversely, the parameters to
WRITE (SUM and zero) represent the value to be output (SUM)  and
the  minimum number of characters to be output (0), and hence it
is the values which are important.  These parameters are said to
be passed by value.   Whether parameters are passed by reference
or  by value depends on how the procedure was defined; this will
be discussed later.

As the call to WRITE only requires the value of SUM as its first
parameter, and that value just happens to  be  First+Second  the
program could be rewritten:

      %begin
          %integer First, Second
          read(first)
          read(second)
          write(first+second, 0)
          newline
      %end
      %endoffile



Now  for a program which introduces conditional statements.   It
simply reads in two numbers and  outputs  the  relation  between
them.

      %begin
          %integer X, Y
          Read(X);  Read(Y)

         %if X > Y %start
             Printstring("The first was larger")
         %finish %else %if Y > X %start
             Printstring("The second was larger")
         %finish %else %start
             Printstring("They were equal")
         %finish

         Newline
      %end
      %end %of %file

This  subject  causes  the most difficulty for beginners, mainly
because it is different from most other  languages.   The  first
point  to note is that lines starting %if..... and %finish......
are all complete statements and must be terminated by a  newline
or  a  semicolon.   Secondly the bits between %start and %finish
may contain as many  statements  as  you  like,  including  more
conditional  statements.   Thirdly,  the %else %if clause may be
repeated as often as you wish (including zero times).



Finally, if there is nothing to be done in the case when none of
the previous  conditions  has  been  satisfied  then  the  final
%finish %else %start may be omitted altogether.
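
For example, if nothing need be said when the two numbers are
equal, the conditional part of the program might be cut down to:

         %if X > Y %start
             Printstring("The first was larger")
         %finish %else %if Y > X %start
             Printstring("The second was larger")
         %finish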

This  is the most general form of conditional structure and will
always  work.    The  difficulty  results  from  the  fact  that
conditions  occur  so  frequently and are usually so simple that
use of the general form can  be  like  cracking  a  nut  with  a
sledgehammer.    Consequently  IMP  provides  convenience  forms
which, if used with care, can lead to  more  readable  programs.
It  is  the  firmly-held opinion of the author that extra effort
spent in organising and writing  programs  is  well  worthwhile;
programs  are  usually  only  written  once but read many times.
Therefore all the emphasis should go in attempting to  make  the
program  readable  and  its  logic  clear.   Demanding  that the
standard form be used everywhere does not help to make code more
understandable.

The first simplification is for the trivial case where something 
simple is to be done if a condition  is  true.   This  could  be
written:

      %if [condition] %start
         Do something simple
      %finish

but a simpler and more readable variant is:

      Do something simple %if [condition]

after  all,  that  was almost the form of words used to describe
the problem in the first place!
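
As a concrete case, using the variables X and Y of the previous
program, a statement of this form might read:

      Printstring("They were equal") %if X = Y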

The second change again follows  from  everyday  English  usage.
The  effect  of  the  condition  may be inverted by changing the
keyword %if into %unless.   This should be  used  with  care  as
while  it  can  make  code  clearer,  misuse  can  make  it very
unclear:

      Average = Total/Number %unless Number = 0

is clear enough, but what about:

      %unless %not 1 <= N <= 9 %or M # 23 %start    {??????}

The final form is just a way to remove some of the wood so  that
the trees can be seen.   Any statement starting with the keyword
%FINISH and ending with the keyword %START may be rewritten with
both of those keywords omitted.





Hence the original program could be written in what we think  is
a more understandable form:

      %begin
         %integer X, Y
         Read(X);  Read(Y)
         %if X > Y %start
            Printstring("The first was larger")
         %else %if Y > X
            Printstring("The second was larger")
         %else
            Printstring("They were equal")
         %finish

         Newline
      %end
      %endoffile

IMP  does  not  see  conditions  (things  like  X  > Y) as being
'expressions' which give a boolean value, consequently IMP  does
not  have  boolean  variables.   Instead, IMP sees conditions as
questions, so instead of the statement:

      %if A = B ......

being thought of as 'does A=B have the value TRUE' it is thought
of as 'is A equal to B'.   This may seem a trivial point but  it
can have a major effect on the language.
For example, in IMP the statement:

   Do something special %if A = B %and C = D

means  exactly  what most people would understand by the English
statement formed by removing the  percent  signs.   Pascal,  for
example  would  insist  on  having brackets round the components
(A=B), (C=D).

Trying to keep as close as possible to common English usage does
help to make a language more readable (within limits) but it can
cause problems when English is ambiguous.  For example, what  is
the exact meaning of 'Bring me an apple or a pear and an orange'?
Many computer languages resolve the ambiguity of AND and  OR  by
means  of  precedence  rules  which bear no relation to everyday
usage: AND is done before OR,  the  analogy  being  that  AND =
MULTIPLY  and  OR  = ADD.   Rather than leave the possibility of
getting this wrong (and complex conditions are difficult  enough
anyway)  IMP  resolves  the ambiguity by refusing to accept both
AND and OR in the same condition  unless  the  meaning  is  made
clear with brackets.  Hence the fruity example would become:
'Bring me (an apple or a pear) and an orange' or
'Bring me an apple or (a pear and an orange)'.
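
In a program the same rule applies to conditions.  As a sketch,
with A to F standing for any integer variables, the first of the
following statements should be faulted by the compiler while the
other two make the intention explicit:

   Do something special %if A = B %or C = D %and E = F    {faulted}
   Do something special %if (A = B %or C = D) %and E = F
   Do something special %if A = B %or (C = D %and E = F)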




The  next  program will take in a sequence of 'words' (character
sequences delimited by spaces or newlines) and count them.   The
program  stops  when it finds the word "%file" and so it can use
itself as input for test purposes.

      %begin      {program to count words}
         %string(63) Word
         %integer Number of words

         Number of words = 0
         %cycle
            read(word)
            Number of words = Number of words+1
         %repeat %until Word = "%file"
         printstring("There were ")
         write(Number of words, 0)
         printstring(" words")
         newline
      %end
      %end %of %file

Before getting into the main ideas in this program it  is  worth
discussing  the  routine READ in a little detail.   You may have
noticed that in this example READ is given a string variable  as
its  parameter  while  the  previous  example gave it an integer
variable.    This  looks  suspiciously  like  the   non-standard
procedures  which  are  commonly  used  to  perform input/output
operations.   Non-standard  means  that  (taking  Pascal  as  an
example)  although  READ and WRITE look like ordinary procedures
because they can take parameters of almost any type they  cannot
be defined in Pascal, as all user-defined parameters must have a
fixed type.   In IMP this is not the case as there is a 'general
type' reference parameter which will accept a  variable  of  any
type.  The functions ADDR, SIZE OF and TYPE OF are available for
making use of such parameters.

In  the  same  way  that %integer introduces variables which may
hold integer values, %string introduces variables which may hold
string values, where a string is just a sequence of  up  to  255
characters.   Any  string  variable  has  three  properties: the
number of characters it currently  contains  (its  LENGTH),  the
maximum  number  of  characters  it  can  contain  (its  MAXIMUM
LENGTH), and  the  actual  characters  themselves.   Whenever  a
string  variable  is  declared  the maximum length of the string
must be specified (63 characters in this example), and  must  be
an integer in the range 1 to 255 inclusive.  The reason for this
limitation  is  so  that the length of a string can be held as a
sort  of  invisible  character  at  the  start  of  the   actual
characters.  While this is not guaranteed to be the way in which
strings  will  be implemented, to the knowledge of the author no
compiler handles them differently.   The only  place  where  the
knowledge  that  a  string  of maximum length N will take up N+1
characters-worth of storage matters is when the function SIZE OF
is used.   SIZEOF(WORD)  will  return  the  value  64,  which is one
greater than the maximum length of the string.


A  frequent  gripe  against  IMP  is  'strings  limited  to  255
characters  are useless'.   There is no question that strings of
any length would be ideal, but in  practice  the  limit  is  not
often  a  problem,  and  IMP strings are a lot more powerful and
convenient (and useful) than no strings at  all  or  the  highly
restrictive strings offered by the more common languages.

The main point of the example is to introduce cycles (or loops).
In  general  any  sequence  of  statements  may  be  repeated by
enclosing them in the statements: %CYCLE and %REPEAT.  Note that
%CYCLE and %REPEAT are statements and must be terminated (by  a
newline  or  a  semicolon).   If  no further action is taken the
cycle will continue indefinitely, so some means must be provided
for terminating the loop.   IMP provides one  general  mechanism
and  three  'syntactic sugarings' of common cases.   The general
mechanism is to use the instruction %EXIT,  execution  of  which
causes  the  loop  to  be  terminated and control to pass to the
statements following the corresponding %REPEAT. %EXIT  can  only
be  used to terminate one loop at a time; it cannot take you out
of nested loops in one go.
The three  common  cases  are  provided  by  adding  conditional
clauses  to either the %CYCLE or the %REPEAT statement, and they
are:

      Simple form                    Expanded form

      %while [condition] %cycle      %cycle
         .........                      %exit %unless [condition]
      %repeat                           .........
                                     %repeat

      %cycle                         %cycle
         .........                      .........
      %repeat %until [condition]        %exit %if [condition]
                                     %repeat

      %for V = F, B, L %cycle        Temp1 = B;  Temp2 = L
         .........                   V = F-Temp1
      %repeat                        %cycle
                                        %exit %if V = Temp2
                                        V = V+Temp1
                                        .........
                                     %repeat


Apart from simplicity there are no  reasons  why  you  shouldn't
just  stick  to the %exit form of loops in all cases if you find
it easier.   In fact if the loop stops in the  middle  there  is
little choice other than contorting the program to force it into
a %while or %until form.
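
As a sketch of a loop which stops in the middle, the following
program adds up numbers until it reads a zero, which is not
itself included in the total (the names are illustrative):

      %begin      {program to add numbers up to a zero}
         %integer Number, Total
         Total = 0
         %cycle
            read(Number)
            %exit %if Number = 0
            Total = Total+Number
         %repeat
         write(Total, 0)
         newline
      %end
      %endoffile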





In  a  direct  parallel  to  the  %if  statement,  IMP  provides
convenience forms of the %while, %until, and %for loops when the
'body' of the loop is very simple.  E.g.
 
      Buy something %while Money left # 0
      X = X*10 %until X > 100
      Visit(Patient) %for Patient = First, 1, Last

Beware that %while always does the test before the action  which
may  not be executed at all, whereas %until always does the test
after performing the action at least once.
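
To show a complete %for loop, here is a small sketch which
should print the squares of the numbers 1 to 10, one per line
(the control variable N is an illustrative name):

      %begin      {program to print squares}
         %integer N
         %for N = 1, 1, 10 %cycle
            write(N*N, 0)
            newline
         %repeat
      %end
      %endoffile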

The next program counts the number of letters, digits, and other
characters in a piece of text, but instead of detecting the  end
of  the  text  by looking for a special data item, it just waits
for the event 'Input Ended' to be signalled instead.

      %begin      {program to count letters and digits}
         %constant %integer Input Ended = 9
         %integer Sym, Letters, Digits, Others
         
         %on %event Input Ended %start
            Printstring("There were ")
            Write(Letters, 0);  Printstring(" letters,  ")
            Write(Digits, 0);   Printstring(" digits, and  ")
            Write(Others, 0);   Printstring(" other characters")
            Newline
            %stop
         %finish

         Letters = 0;  Digits = 0;  Others = 0
         %cycle
            Readsymbol(Sym)
            %if  'A' <= Sym <=  'Z' %or  'a' <= Sym <= 'z'  %start
               Letters = Letters+1
            %else %if  '0' <= Sym <= '9'
               Digits = Digits+1
            %else
               Others = Others+1
            %finish
         %repeat
      %end
      %endoffile

The statement %on %event Input Ended %start can be thought of as
a sort of condition which is never satisfied.   This means  that
when  the  %on  statement is reached control will always pass by
the bit between %start and %finish (the 'Event block') and carry
on from after the %finish (or the %else if  one  is  specified).
However,  if  during  the execution of the rest of the block the
event (or events) mentioned in the %on statement  is  signalled,
control  is  immediately  passed  to  the first statement of the
event block and execution continues from there.   Any block  may
contain  an  %on statement, but there may only be one in a block
and it must come immediately after the  declarations  (if  there
are  any).   In  general  it is not possible to resume execution
from the point at which the event was signalled.



The exact definition of this event mechanism is  a  little  more
complicated  but  this description should be enough for the time
being.
This program demonstrates the use of the single quote to provide
the internal value (an integer) corresponding to any  character.
For  example  in  the  ASCII  character  set the constant 'A' is
indistinguishable from the constant 65.   Again,  there  are  no
limitations on the characters which can be placed between single
quotes, so that ' ' is the value of a space (32 in ASCII) and '
' is the value of a newline (10 in ASCII).   Because the newline
constant can make it  awkward  to  read  a  program,  the  named
constant NL is available as a substitute.

The  three  tests of SYM are of interest as they are examples of
'double-sided conditions'.   Effectively 'A' <= Sym <= 'Z' is an
abbreviation  for  the  condition:  'A' <= Sym  %and Sym <= 'Z'.
This sort of condition is very useful for testing for ranges  of
values  as in this example.   Note that the example assumes that
there are no 'holes' in the character set and that  the  letters
and digits are in sequence (beware of EBCDIC!).
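
Character constants are also handy in arithmetic.  As a sketch,
assuming as above that the digits are in sequence, the value of
a digit character can be found by subtracting '0' (Value is an
illustrative %integer variable):

      Readsymbol(Sym)
      Value = Sym-'0' %if '0' <= Sym <= '9'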


In  an  earlier  example  the  standard routine READ was used to
input 'words' which were then counted.   This program  made  the
assumption  that  no  word  contained  more  than 63 characters.
However, the implementation-provided routine  READ  cannot  know
this  and  will  attempt  to  input words of any length.   If it
encounters a word of 64 characters or more it will quite happily
take it in and then fail with 'string overflow' when  trying  to
assign the value to its parameter.






The  next  program  overcomes  this  problem  by  redefining the
routine READ so that it simply truncates any words with too many
characters.

      %begin      {program to count words revisited}
         %string(63) Word
         %integer Number of words = 0

         %routine Read (%string (63)%name Text)
            %integer Sym
            Text = ""
            %cycle
               readsymbol(Sym)
               %return %if Sym = ' ' %or Sym = NL
               Text = Text . To String(Sym) %unless Length(Text) = 63
            %repeat
         %end

         %on %event 9 %start               {end of input}
            Printstring("There were ")
            Write(Number of words, 0)
            Printstring(" words")
            Newline
         %else                             {first entry comes here}
            %cycle
               Read(word)
               Number of words = Number of words+1
            %repeat
         %finish
      %end
      %endoffile

The definition of the routine READ is a block, nested within the
%begin-%end block of the main program.   Whenever an  identifier
is  declared  in a block it remains available for use throughout
the rest of that block and any  blocks  which  are  subsequently
nested  within  it.   The  only exception to this is the case of
labels which are visible in the block in which they were defined
but are not visible  in  any  nested  blocks.   Labels  will  be
discussed  in  a later section.   If an identifier is redeclared
within a nested block, that  definition  effectively  masks  out
access to the outer definition.   The result of all this is that
in the  sample  program  references  to  READ  will  access  the
newly-defined  procedure  and  not the standard procedure, which
along with all standard procedures  is  defined  in  a  sort  of
'super block' which contains the whole program.

In  brackets following the procedure identifier, READ, is a list
of  declarations  which  define  the  number  and  type  of  the
parameters  which must be given at each call.   The %NAME suffix
specifies that the parameter is to be passed by reference,  that
is,  in  this example any reference to TEXT within the procedure
will be exactly equivalent to a reference to the string variable
actually given in the call.




Execution of the routine is terminated whenever the  instruction
%RETURN  is executed.   For convenience the %END of a routine is
considered to be an abbreviation for %RETURN; %END.   Note  that
this  is  only  true for routines, other types of procedure must
have explicit terminating statements or the compiler  will  flag
an error (RESULT missing).


Three new features of IMP are illustrated by the statement:
   Text = Text . To String(Sym) %unless Length(Text) = 63

The   dot   is  the  only  operator  available  to  form  string
expressions  and  indicates  concatenation.    Concatenation  is
simply the joining-together of the two string operands to give a
new string.  The number of characters in this new string will be
the  sum  of  the numbers of characters in the original strings,
not the sum of their maximum lengths; magic  padding  characters
are never inserted.
The  identifier  TO  STRING  refers to a standard function which
takes as its parameter an integer expression which must  give  a
value  corresponding to a character value (commonly any value in
the range 0 to 255 inclusive).   The result is a string value of
length  one  character,  that  character  being the value of the
parameter.
For example, the following two assignments both  give  the  same
value to the variable Text:

   Text = "H"
   Text = To String('H')

Once  again,  note  the difference between "H" which is a string
value and 'H' which is an integer value.

The identifier LENGTH refers to a standard function which  takes
as its parameter a reference to a string variable and returns as
its  result  the number of characters currently contained in the
string.
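
Putting these three features together, the following sketch
builds up a string in stages (the maximum length of 15 is chosen
arbitrarily, but is large enough for the 12 characters needed):

      %begin
         %string(15) Text
         Text = "Hello"
         Text = Text . " " . "there"      {now 11 characters}
         Text = Text . To String('.')     {now 12 characters}
         Printstring(Text)
         newline
      %end
      %endoffile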

In passing it is worth commenting on the initial  assignment  to
Text:

   Text = ""

The  double-quotes  contain  no characters, that is, a string of
length zero; this is termed a null string.

Before leaving the routine READ itself it should be  clear  that
the   definition  as  given  will  accept  string  variables  as
parameters only if they are defined to have a maximum length  of
exactly  63.   In the program in question this is no limitation,
but if the procedure were wanted in other contexts it  would  be
useful if it could accept strings of any maximum length.




To  specify  this the maximum length of the parameter definition
should be changed to a star:

      %routine Read(%string(*)%name Text)
         %integer Sym
         Text = ""
         %cycle
            readsymbol(Sym)
            %return %if Sym = ' ' %or Sym = NL
            Text = Text . To String(Sym) -
                   %unless Length(Text) = Size of(Text)-1
         %repeat
      %end

Also the reference to 63 has been  changed  to  Size of(Text)-1.
'Size  of'  returns  the  size  of  the storage allocated to the
variable given as its parameter.  Remember that the -1 is needed
as there is a one character overhead  in  strings  to  hold  the
current length.

Finally,  there are two minor points about the complete program.
The first is the declaration

      %integer Number of Words = 0

which may loosely be thought of as a contraction of:

      %integer Number of Words
      Number of Words = 0

although strictly  the  initialisation  is  performed  when  the
variable  is  created,  but  this  is only significant when %OWN
variables are concerned.  %OWN variables will be discussed later.

The second is the use of an  %ELSE  clause  with  an  %on %event
statement.   The  %else  clause  is  executed when control first
reaches the %on statement, and is skipped if the %on  clause  is
executed following the signalling of a suitable event.

The  following  program demonstrates the definition and use of a
function.   In addition it shows the method by  which  constants
can  be given identifiers by means of the %constant declaration.
Such identifiers may be used wherever their value may  be  given
as  a  literal  constant.   This  provides  a  convenient way of
parameterising  a  program so that it  is  easier  to  read  and
change.





The  example also uses the CHAR NO map which returns a reference
to the N'th character in the given string  variable.   An  error
will be signalled if Char No attempts to access characters which
are  not  within the string, i.e. if N is outwith the range 1 to
Length(First parameter) inclusive.

      %begin      {program to test capitalisation function}
         %routine Read Line(%string(*)%name Line)
            %integer Sym
            Line = ""
            %cycle
               Readsymbol(Sym)
               %exit %if  Sym = NL
               Line = Line.Tostring(Sym)
            %repeat
         %end

         %string(127)%function Capital form of(%string(127) Who)
            %constant %integer  Shift = 'A'-'a'
            %integer Up, N, Sym
            Up = 1           {so that the first letter is capitalised}
            %for N = 1, 1, Length(Who) %cycle
               Sym = Char No(Who, N)
               Sym = Sym-Shift %if  'A' <= Sym <= 'Z'  {to lower case}
               %if 'a' <= Sym <=  'z' %start
                  %if Up # 0 %start
                     Sym = Sym+Shift                   {to upper case}
                     Up = 0
                  %finish
               %else
                  Up = 1
               %finish
               Char No(Who, N) = Sym
            %repeat
            %result = Who
         %end
         %string(80) Line

         %cycle
            Read Line(Line);  %exit %if Line = ""
            Printstring(Capital form of(Line))
            Newline
         %repeat
      %end
      %endoffile

This time the declaration of the  parameter  for  the  procedure
does not end with %NAME and so the parameter is passed by value,
that is when the procedure is called an ordinary string variable
(Who)  is  created  and  it  is assigned the value of the string
expression given as the parameter to the call.   The  use  of  a
star  for  the maximum length of a string value parameter is not
permitted.

In the same way that  %RETURN  terminates  the  execution  of  a
%ROUTINE, the instruction %RESULT=..... terminates the execution
of  a %FUNCTION.   The expression following the equals sign must
produce a value of the same type as the function (a string value
in this case).


A common feature of languages  which  have  been  influenced  by
FORTRAN  or ALGOL 60 is that they return the result of functions
by means of a rather unpleasant pun on the function  name.   The
mechanism  is roughly that the use of the function identifier on
the left hand side of an assignment  specifies  the  'result  so
far'  but does not terminate the function.   Eventually when the
END of the function is reached the 'result so far'  is  returned
as the actual result.   IMP does not (indeed cannot) permit this
as apart from the generally unpleasant nature of the pun and its
associated difficulties, there would be no way of returning  the
result  of a %MAP, as use of the map's identifier on the left of
an assignment would quite naturally be seen as a recursive  call
on the map!

Within a function there may be as many %RESULT= statements as
you wish; the execution of any one of them will immediately
terminate the function.
For  example,  the following function returns the ordinal number
of an upper-case letter, when that letter  is  from  the  EBCDIC
character set.  The difficulty is caused by the fact that EBCDIC
has   'holes'   between   the  letters;  their  values  are  not
consecutive as they are in ASCII.

      %integer %function Ebcdic letter number(%integer Sym)
         %constant %integer Ea =  16_C1, Ei = 16_C9,
                            Ej =  16_D1, Er = 16_D9,
                            Es =  16_E2, Ez = 16_E9

         %result = Sym-Ea+1  %if Ea <= Sym <= Ei
         %result = Sym-Ej+10 %if Ej <= Sym <= Er
         %result = Sym-Es+19 %if Es <= Sym <= Ez
         %result = 0
      %end

The equivalent function for ASCII would be:

      %integer %function Ascii letter number(%integer Sym)
         %result = Sym - 'A' +1
      %end

Note the form of the constants defining the EBCDIC values of the
letters A,I,J,R,S,Z.   The 16_ specifies that the following
constant  is  expressed  in  base  16  (hexadecimal).   In  such
constants  the  letters  (upper  or  lower  case)  represent the
'digits' 10 (A), 11 (B), 12 (C) etc.   The  notation  is  quite
general  and  any  base greater than one can be specified.   For
example: Octal  is  8_77715,  Binary  is  2_010101110  and  base
seventeen is 17_ABCDEFG.  The notation may also be used for real
constants.  This is especially useful when the limit of accuracy
is  required  as  putting the constant into the base used by the
machine can give more accuracy than expressing it in decimal.

For example PI could be defined as:

      %constant %long %real PI = 16_3.243F 6A89
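
As a rough sketch of the same notation applied to integers (the
identifiers are invented for this illustration), the following
three constants all denote the value 255:

      %constant %integer Mask in Hex    = 16_FF
      %constant %integer Mask in Octal  = 8_377
      %constant %integer Mask in Binary = 2_11111111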



Now for a program using real variables.

%begin      {reals}
   %real %function Compound interest(%real Capital, Rate,
                                     %integer Years)
      %real Balance
      %integer Years Left

      Balance = Capital
      Years Left = Years
      %while Years Left > 0 %cycle
         Years Left = Years Left-1
         Balance = Balance+(Balance*Rate/100)
      %repeat
      %result = Balance-Capital
   %end

   %real Cap, Rate
   %integer Time
   Read(Cap);  Read(Rate);  Read(Time)
   Printstring("The interest on  £"); Print(Cap,  0, 2)
   Printstring(" at ");  Print(Rate, 0, 2)
   Printstring("% per  annum for")
   Write(Time, 0);  Printstring(" years  is £")
   Print(Compound interest(Cap,  Rate, Time), 0, 2)
   Newline
%end
%endoffile

The function Compound interest takes three parameters, two  real
and one integer, all of which are passed by value.   Again, this
means that the function creates three variables and copies  into
them   the   values   given  in  the  call.    Apart  from  this
initialisation  there  are  absolutely  no  differences  between
parameters and other variables declared in a procedure.
The  program  uses  the  two  standard output routines WRITE and
PRINT to generate its output.   WRITE outputs an  integer  value
using  its  second  parameter  to  control the minimum number of
characters output.   PRINT outputs a real value using the second
parameter  to  control  the  minimum size of the part before the
decimal point, and its third parameter to control the actual
number of places printed after the decimal point.  If the third
parameter of PRINT is zero the decimal point and the fractional
part of the number are not output.


So  far,  all  the  variables  declared  inside blocks have been
destroyed when the  execution  of  the  blocks  terminated.   In
several  cases  it  is  convenient  for procedures to be able to
exist in different 'states', that is to remember what  they  did
last.   This  could  be  achieved  by  using  variables declared
outside the procedures (global variables) but then  there  would
be   no  protection  against  other  procedures  altering  those
variables (perhaps as the result of a typing error).   This  is
where %OWN variables are useful.   An %OWN variable is identical
to the equivalent non-%OWN variable in every respect except that
it effectively always exists, at least as far as the program
which declares it is concerned.   This means that %OWN variables
are not
created  and destroyed like other variables but exist throughout
the execution of the whole program.   As they are not  destroyed
when  control  passes  from  a  procedure  their  values will be
retained and will be available for use on  subsequent  calls  of
the  procedure.   However,  access  to the identifier of an %OWN
variable is still limited to the block in which it was  declared
and to blocks subsequently defined within that block.

      %constant %integer FF = 12
      %routine Print and Suppress(%integer Sym)
         %own %integer Previous = FF
         %return %if Sym = FF %and Previous = FF
         Previous = Sym
         Printsymbol(Sym)
      %end

This  routine  is  intended for use in place of PRINTSYMBOL when
the output is to be sent to a printer and consecutive form feeds
(FF) are to be suppressed.   The %OWN variable PREVIOUS is  used
to  remember  the  character  last  output  and to stop printing
consecutive ones.

Unlike the initialisation of non-%OWN variables the statement:

      %own %integer Previous = FF

does not mean:

      %own %integer Previous
      Previous = FF

as this would set PREVIOUS to FF every  time  the  routine  were
called.   Rather,  the  initialisation  is  performed  when  the
variable PREVIOUS is created,  which  is  effectively  when  the
program containing the routine starts execution.
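
As a further sketch of the same idea (the function is
hypothetical and not part of any standard library), a procedure
can use an %OWN variable to hand out a different number on each
call:

      %integer %function Next Serial Number
         %own %integer Last Issued = 0
         Last Issued = Last Issued+1
         %result = Last Issued
      %end

Because Last Issued is initialised only once, successive calls
should return 1, 2, 3 and so on.  Had it been an ordinary
%integer initialised in the same way, every call would return 1.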


The  basic  data  types  provided  by IMP are integer, real, and
string.  There are several ways in which these types may be used
to create more complicated objects.   The first of these  is  by
means of record variables.

A record is a variable which is made up from a collection of
other variables.  The collection of other variables is described
using a %RECORD %FORMAT declaration.  This defines the type,
order, and identifiers of the components and gives an identifier,
the format identifier, to the complete collection.




The format identifier can then be used to  create  objects  with
the internal structure described by the %FORMAT declaration:

      %record %format Person(%string(63) Surname, Prename,
                             %integer Age,
                             %real Weight)

      %record  (Person) Fred

This  defines  the  variable  FRED to be a record containing two
string variables, one integer and one real variable.  To extract
a particular variable from a record just follow the reference to
the record by an underline followed by  the  identifier  of  the
required component variable.  For example:

         Fred_Prename = "Frederic"
         Fred_Surname = "Chopin"
         Printstring(Fred_Prename." ".Fred_Surname)

Records may be used in the same ways as the other sorts of
variable discussed previously; in particular they may be passed
as parameters, by value or by reference, and may be the results
of record functions and record maps.  The only operations
available on records as complete entities are to copy one into
another of the same format (Rec1=Rec2) or to set the complete
record to zero (Rec=0).
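
A short sketch of these two operations, assuming the Person
format declared earlier (the variable Saved is introduced only
for this illustration):

      %record (Person) Fred, Saved
      Fred_Prename = "Frederic"
      Fred_Surname = "Chopin"
      Saved = Fred      {copy every component of Fred into Saved}
      Fred = 0          {set the whole of Fred to zero}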

%recordformat Coordinate(%real X, Y)
%real %function Distance between(%record(Coordinate) Point1, Point2)
   %external %longreal %function %spec Sqrt(%longreal Arg)
   %real Dx, Dy
   Dx = Point1_X - Point2_X
   Dy = Point1_Y - Point2_Y
   %result = Sqrt(Dx^2 + Dy^2)
%end

This  program  fragment  defines  a  function  which operates on
points in a two-dimensional  plane,  and  returns  the  distance
between  two  of  them.   The parameters to DISTANCE BETWEEN are
records passed by value.   The user should take care with record
value  parameters  as  they  require  that  the  whole record be
copied, not very expensive in this  particular  case,  but  with
large records the overhead can be considerable.  For this reason
records  are more commonly passed by reference, even though they
will not be altered by the procedure.  The function makes use of
another function, SQRT, which returns the  square  root  of  its
parameter.   This  function  is  not defined in this program but
will exist at run-time in some other module.   However,  as  all
identifiers  must  be  declared  before  they  can be used, this
information must be presented to  the  compiler.   This  is  the
function  of  the  %external  statement.   The %external keyword
tells the compiler that the identifier about to be  declared  is
to  be  made  available  to  the environment outside the program
(commonly a linker of some sort).   The %spec keyword tells  the
compiler   that  this  declaration  is  a  specification  of  an
identifier  which  is  not  actually  being  defined   by   this
statement.   In  other  words the complete declaration tells the
compiler what sort of thing SQRT is (a  longreal  function  with
one  longreal value parameter), and that it is defined somewhere
in the external environment of the  program.   For  the  program
containing  the  %spec to be able to run, there must be a module
somewhere which defines SQRT.
This module could be written in  IMP  and  the  complete  module
would look something like:

      %external %longreal %function Sqrt(%longreal Arg)
         ..........
         ..........
         %result = ......
      %end
      %endoffile

This  definition  of Sqrt must match the specification statement
used to reference it.   In fact  they  are  identical  with  the
exception  that  the  definition  does  not  contain the keyword
%spec.

The external mechanism is not just limited to procedures but may
be used with variables.   For example a module could  define  an
external record into which various modules can place data:

      %record %format Things(%integer Number of washers,
                             %string(255) the saying of the day,
                             %real rate of inflation)

      %external %record(Things) Useful rubbish

      %endoffile

%external   declarations  like  this  also  give  variables  the
properties of %OWN variables.   If a program or  another  module
wishes  to  access  this  record  it  just  declares  it with an
external specification:

   %begin
      %recordformat Things(%integer Number of washers,
                           %string(255) the saying of the day,
                           %real Rate of Inflation)

      %external %record(Things) %spec Useful rubbish

      Printstring(Useful  rubbish_the saying of the day)
      newline
   %end
   %endoffile





Now you should be able to see that there is nothing magic  about
the  routines  NEWLINE,  PRINTSTRING, WRITE etc.   They are just
external routines which the compiler  automatically  %SPECs  for
you.    In  effect  the  compiler  starts  each  compilation  by
compiling a special file which contains statements like:

%external %routine      %spec Newline
%external %routine      %spec Write(%integer Value, Places)
%external %routine      %spec Print(%real V, %integer B, A)
%external %integer %fn %spec Length(%string(*)%name S)
%external %byte %map   %spec Char No(%string(*)%name S, %integer N)
.........
.........
%endoffile


Another way of  creating  more  complex  objects  is  to  gather
together a number of objects of the same type as an array.

      %begin            {counting letters}
         %integer %array Times('A':'Z')
         %integer Sym, J, N

         %on %event 9 %start
            %for J = 'A', 1, 'Z' %cycle
               N = Times(J)
               Printstring("There ")
               %if N = 1 %start
                  Printstring("was ")
               %else
                  Printstring("were ")
               %finish
               write (N, 0)
               Printsymbol(J);  Printstring("'s")
               newline
            %repeat
            %stop
         %finish

         Times(J) =  0 %for J = 'A', 1, 'Z'

         %cycle
            Readsymbol(Sym)
            %if 'A' <= Sym <= 'Z' %start
               Times(Sym) = Times(Sym)+1
            %else %if 'a' <= Sym <= 'z'
               Times(Sym-'a'+'A') = Times(Sym-'a'+'A') + 1
            %finish
         %repeat
      %end
      %endoffile




This   program   creates   an  array  TIMES  with  26  elements:
TIMES('A'),   TIMES('B')  .....   TIMES('Z')  and  uses  it   to
accumulate the number of times each letter (upper or lower case)
appears in the input.

As a general point about efficiency, the expression -'a'+'A' has
the  value  -32  but  it is much clearer to write it in the form
given rather than as the magic value -32.   Perhaps even  better
would be to define a %constant %integer with the value -'a'+'A'.
In  all  of  these  cases though, the compiler will generate the
same machine code so there is nothing at all  to  be  gained  by
calculating  such constant expressions and obscuring the program
with weird and wonderful values.   In the  same  way  absolutely
nothing  is  gained  by  using  65  instead of 'A' where that is
meant; in fact legibility and perhaps character-set independence
are lost by doing it.

Unlike Pascal and FORTRAN the  bounds  of  arrays  need  not  be
constants;  it  is  quite  common for a program to calculate the
size of arrays needed and then  create  them  dynamically.   The
only  restriction  is  that  all arrays must have a non-negative
number of elements.   In other words the upper bound (the second
one)  minus the lower bound (the first one) plus one must not be
negative.
      %integer %array A(1:0)   {is valid}
      %integer %array B(2:0)   {is not}

Consistently in the definition of IMP  repetitions  of  zero  or
more  times  are  always  valid whereas negative repetitions are
not.  Does anyone know what it means to execute a loop -1 times?
Do you do it backwards once?
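
To illustrate bounds which are calculated at run time, here is a
sketch (the routine and its identifiers are invented for this
example) which creates an array whose size is given by a
parameter, reads that many integers and prints them in reverse
order:

      %routine Print Reversed(%integer Count)
         %integer J
         %integer %array Value(1:Count)   {created afresh on each call}
         Read(Value(J)) %for J = 1, 1, Count
         Write(Value(J), 1) %for J = Count, -1, 1
         Newline
      %end

If Count is zero the array has no elements and neither loop is
executed.  A negative Count would ask for a negative number of
elements, which is not valid.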

The program also  illustrates  the  use  of  a  %for  clause  to
initialise  all  of  the elements of an array.   This is a place
where IMP is a little weak; it would not be difficult to  permit
simple operations on complete arrays as a direct parallel to the
operations  on complete records viz. copying and initialisation.
This could be included as part of the  continuing  evolution  of
the language.


Records and arrays may be combined, that is you can have arrays
of records and arrays within records, although any arrays inside
record formats must have constant bounds and be one-dimensional.

The following program uses an array of records.   It also uses a
specification  of  a routine which is defined in the same block.
This is to enable the routine to be used before it is actually
defined.   Whether  you put routines first and then use them, or
put specs first with the routines at the end of the  program  is
purely  a  matter  of  taste; in general it has no effect on the
efficiency of the program.




%begin
   %constant %integer Max Items = 100
   %recordformat Inf(%string(63) Word, %integer Occurred)
   %record(Inf)%array Item(1:Max Items+1)
   %string(63) Word
   %integer Items in = 0
   %integer J                {index used when printing the table}
   %constant %string(3) End Mark = "*E*"

   %routine %spec Add word to table

   %cycle
      Read(word)
      %exit %if Word = End Mark
      add word to table
   %repeat

   %for J = 1, 1, Items in %cycle
      printstring(Item(J)_Word)
      printstring(" occurred ")
      write(Item(J)_Occurred, 0)
      printstring(" time")
      printstring("s") %if Item(J)_Occurred # 1
      newline
   %repeat

   %routine Add word to table
      %integer P
      {insert the word provisionally}
      Items in = Items in+1
      Item(Items in)_Word     = Word
      Item(Items in)_Occurred = 0

      {now look for it}
      %for P = 1, 1, Items in %cycle
         %exit %if Item(P)_Word = Word
      %repeat
      {remove it if duplicated}
      Items in = Items in-1 %if P # Items in
      Item(P)_Occurred = Item(P)_Occurred+1
   %end
%end
%endoffile


It has been mentioned previously that parameters can  be  passed
by  reference, that is a reference to a variable can be assigned
to the parameter rather than the actual value of that  variable.
Variables which can hold references to other variables are called
'pointer variables' and they may be declared and used like other
variables.




For  example,  the procedure "Add word to table" described above
could be rewritten:

      %routine Add word to table
         %integer P
         %record(Inf)%name New, Old
         {insert the word provisionally}
         Items in = Items in+1
         New == Item (Items in)
         New_Word   = Word
         New_Occurred = 0

         {now look for it}
         %for P = 1, 1, Items in %cycle
            Old == Item(P)
         %repeat %until Old_Word = Word
         {remove it if duplicated}
         Items in = Items in-1 %unless New == Old
         Old_Occurred = Old_Occurred+1
      %end

The assignment Old == Item(P) assigns  to  Old  a  reference  to
Item(P).   This  can  be  thought  of  as  making Old 'point at'
Item(P).   It is important to realise that  it  is  the  current
value of P which is used, in the sense that if P is altered after
the == assignment Old will still point at the same element of the
array Item.

Except  when  being  used  as  the  left  hand  side  of  an  ==
assignment,  use  of a pointer variable is exactly equivalent to
the use of the variable to which it is  pointing.   For  example
given:

      %integer X = 0, Y = 1
      %integer %name N
      N == X

The assignment Y = N is equivalent to Y = X, N = 3 is equivalent
to  X = 3,  M == N  is  equivalent  to  M == X,  and  Read(N) is
equivalent to Read(X).

Now the mechanism of passing parameters can be understood a
little more clearly: parameters passed by value are assigned
using = and those passed by reference are assigned using ==.
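
As a sketch of this distinction (the routine is invented for the
illustration), a routine which exchanges the values of two
variables must take its parameters by reference:

      %routine Swap(%integer %name A, B)
         %integer Temp
         Temp = A
         A = B
         B = Temp
      %end

A call Swap(X, Y) in effect performs A == X and B == Y on entry,
so the assignments inside the routine alter X and Y themselves.
Had the parameters been declared as plain %integer, only local
copies would have been exchanged.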


Pointer variables may be used as components of  records  as  the
following program fragment shows.  It is part of a program which
manipulates  lists  of  cells,  with  each  cell using a pointer
variable to point at the next one.   The record NULL is a  dummy
record used to mark the end of the lists.




%begin
   %record format Cellfm(%integer Data, %record(Cellfm)%name Link)
   %record(Cellfm)%array Cells(1:Max Cells)
   %record(Cellfm) Null

   %record(Cellfm)%map New Cell
      %own %integer Last = 0
      %signal 14,1,Last %if Last = Max Cells  {none left}
      Last = Last+1
      %result ==  Cells(Last)
   %end

   %record(Cellfm)%map Copy of(%record(Cellfm)%name  List)
      %record(Cellfm) Head
      %record(Cellfm)%name End, Cell
      End == Head
      %while List ##  Null %cycle
         Cell == New Cell
         Cell_Data = List_Data
         End_Link == Cell
         End      == Cell
         List ==  List_Link
      %repeat
      End_Link == Null
      %result == Head_Link
   %end

   %record (Cellfm)%map Reversed copy of (%record (Cellfm)%name List)
      %record(Cellfm)%name New, Cell
      New == Null
      %while List ## Null %cycle
         Cell == New Cell
         Cell_Data = List_Data
         Cell_Link == New
         New       == Cell
         List ==  List_Link
      %repeat
      %result == New
   %end
   .......

The  example also shows the use of %result == to give the result
of a %map.   This  is  an  exact  parallel  to  functions  where
%result =  is  used to return a value, while in a map %result ==
is used to return a reference.  The compiler will report a fault
if an attempt is made to use %result= in a map or %result== in a
function.




The next two procedures demonstrate the use of the standard
function SUB STRING to split a string into fragments:

   %integer %function Index(%string(255) Data, Pattern)
      {returns the index of the first occurrence of Pattern
      {in the string Data. Zero is returned if the pattern
      {cannot be found}

      %integer Chars Left, Here, Limit, Len
      Len = Length(Pattern)
      Limit = Length(Data)-Len+1  {limit of search}
      Here = 0
      %while Here < Limit %cycle
         Here = Here+1
         %result = Here %if -
            Sub String(Data, Here, Here+Len-1) = Pattern
      %repeat
      %result = 0
   %end

   %routine Insert Today(%string(*)%name Line)
      %integer Pos
      %string(255) Before, After
      Pos = Index(Line, "*DATE*")
      %if Pos # 0 %start            {found}
         Before = Sub String(Line, 1, Pos-1)
         After  = Sub String(Line, Pos+6, Length(Line))
         Line = Before . Date . After
      %finish
   %end

SubString simply returns as its result the string made up of the
sequence  of  characters between and including the characters at
the positions specified by  the  second  and  third  parameters.
E.g.  Substring("123456", 2, 4) = "234".  If the third parameter
is  equal  to the second a string of length 1 is returned, while
if the third parameter is one less than the second a null string
is returned.

Date is a standard string function which returns the date in the
system-standard format.   Similarly there  is  a  function  Time
which does the same thing for the time of day.


As the operation programmed in the previous example is quite
common, IMP provides a unique instruction for doing it.  This is
termed  'string resolution'  and  looks like a backwards sort of
assignment using ->.

      %routine Insert Today(%string(*)%name Line)
         %string(255) Before, After
         %if Line -> Before .("*DATE*"). After %start
            Line = Before . Date . After
         %finish
      %end



The string expression in brackets is evaluated  and  the  string
variable on the left (Line) is searched for that value.   If the
value is found the characters to the left are  assigned  to  the
variable  to  the left of the bracket (Before), and those on the
right of the pattern are assigned to the variable on  the  right
(After).   A  string  resolution instruction has the strange but
useful property that it can either be used  on  its  own  as  an
instruction,  or,  as  in  the  example,  it  can  be  used as a
condition.  When so used the success of the resolution satisfies
the condition and the implied assignments are  carried  out  (so
beware,  this condition has a side-effect!).   If the resolution
fails, that is the pattern cannot be found, the condition is not
satisfied and no assignments are performed.   When a  resolution
is  used  as  an  instruction  failure  causes  an  event  to be
signalled (Resolution fails).

Either or both of the variables outside the brackets may be left
out in which case the corresponding fragments  of  the  original
string are simply discarded.  Hence the condition:
      %if S -> (Rude Word) %start
asks  the  question:  'does  the  string S contain within it the
string contained in Rude Word?'.
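
As a further sketch (the routine and its parameter names are
invented here, and First and Rest are assumed to be large enough
to hold the fragments), resolution used as a condition makes it
easy to split a line at a space:

      %routine Split at Space(%string(*)%name Line, First, Rest)
         %return %if Line -> First.(" ").Rest
         First = Line      {no space found: the whole line is First}
         Rest = ""
      %end

If the resolution succeeds the two fragments have already been
assigned and the routine simply returns; otherwise nothing has
been assigned and the two statements which follow supply the
values.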


Single-dimensional arrays of constants may be declared:

      %external %string (3) %function Month(%integer N)
         %constant %string (3) %array M(1:12) =
            "Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"

         %result  = M(N) %if 1 <= N <= 12
         %result  = "???"
      %end
      %endoffile



Procedures may be textually nested inside other procedures:
  {1}  %routine Print Hex(%integer N, Width)
         %integer Places

         %routine Hex Digit(%integer D)
            %if D <= 9 %then Printsymbol(D+'0')     -
                       %else Printsymbol(D-10+'A')
         %end

         %return %if Width <= 0
         %for Places = (Width-1)*4, -4, 0 %cycle
            Hex Digit ( (N>>Places)&15 )
         %repeat
      %end




 {2}  %routine Print to Base(%integer Base, N)
         %routine Print Digit(%integer D)
            %if D <= 9 %then Printsymbol(D+'0')     -
                       %else Printsymbol(D-10+'A')
         %end

         Print to Base(Base, N//Base)  %unless N < Base
         Print Digit(Rem(N, Base))
      %end

The second example (Print to Base) demonstrates that  procedures
in IMP may be recursive, that is they may be defined in terms of
themselves.   This  routine  will output any positive integer to
any base greater than 1, although the output will be a  bit  odd
for bases greater than 36.








Last updated on 2007-Jun-18 15:36:10 by bfoley@compsoc.nuigalway.ie