A. MAJOR CHARACTERISTICS
A.a
The language should be extensible so that the user
of the language can extend the apparent set of data
types and operations available to his programs by
means of specifications made within his programs.
The number of specialized capabilities needed for
a common language is large and diverse. In many
cases, there is no consensus as to the form these
capabilities should take in a programming language.
The operational requirements dictating specific
specialized language capabilities are volatile and
future needs cannot always be foreseen. No language
can make available all the features useful to the
broad spectrum of military applications, anticipate
future applications and requirements, or even
provide a universally "best" capability in support
of a single application area. A common language
must have capability for growth. It should contain
all the power necessary to satisfy all the applications
and the ability to specialize that power to the
particular application task. An extensible language
will make it possible to add new application-oriented
features and to add new programming techniques
and mechanisms to the language using descriptions
written entirely within the language. Extensions
should have the appearance and costs of features
which are built into the language while actually
being only catalogued accessible application packages.
A static programming language cannot be all things to
all people, but an extensible language can be adapted
to meet changing requirements in a variety of areas.
A.b
The source language should contain a simple clearly
identified kernel which houses all the power of the
language.
The capabilities available in an extensible language
can be partitioned into two groups, those which are
definable by extension and those which provide an
essential primitive capability of the language. The
smaller and simpler the kernel, the easier
the language will be to learn and use. If the kernel
is clearly delineated and language features not in
the kernel are defined in terms of the kernel, then
only the kernel language need be implemented to make
the full source language capability available.
The kernel language should be simple in the sense
that it is small and each feature provides a single unique
capability not duplicated in other kernel features.
Kernel features should provide relatively low level
general purpose capabilities not yet specialized to
particular applications.
A.c
A variety of application-oriented extensions should
be provided with the language.
An extensible kernel language alone is not sufficient
for a common language. Even though in theory such
a language provides the necessary power and the
capability for extension to special applications, the
users of the language cannot be expected
to become language designers or to divert project
funds to develop the required extensions to make the
language useful.
A.d
Source language structures not in the kernel language
should be maintained in a compile-time accessible
library of extensions. The library should be
capable of holding anything definable in the source
language.
In an extensible language with a simple kernel the
usefulness of the language derives primarily from
the existence and accessibility of specialized
application-oriented extensions. Whether an extension
library should contain source or object code is a
question of implementation efficiency and should
not be determined by the definition of the source
language. It should be remembered, however, that
interfaces cannot be validated at program assembly
time without some equivalent of their source language
interface specifications, that object modules are
machine dependent and therefore not portable, that source
code is often more compact than object code, and that
compilers for simple languages can often compile faster
than a loader can load from relocatable object programs.
There is no reason why routines written in other
programming languages should not be accessible through
the library, providing they conform to the object
language interface conventions.
A.e
The language should be typed. The type or mode of all
variables, components of composite data structures,
expressions, operations, and parameters should be
determinable at compile time and unalterable at
run-time. The language should require that the type
of each variable, component of composite data
structures, and formal parameter be explicitly
specified in source programs.
By the type of a data object is meant the set of
objects themselves, the essential properties of
those objects and the set of operations which give
access to and take advantage of those properties.
The author of any correct program in any programming
language must, of course, know the type of all data
and variables used in his programs. If the program
is to be maintainable, modifiable and comprehensible
by someone other than its author, then the types
of variables, operations, and expressions should be
easily determined from the source program. Type
specifications in programs also provide the redundancy
necessary to automatically verify that the programmer
has adhered to his own type conventions.
A.f
The source language should provide access to machine
dependent hardware facilities through encapsulated
machine language insertions.
Machine language insertions are necessary for interfacing
special purpose devices, for accessing special hardware
capabilities, and for certain code optimizations on time
critical paths. The language should, however, be so
designed that there is little need or incentive for its
users to enter the machine language level. The machine
language insertions should be encapsulated so they can
be easily recognized when moving to another object machine
and so the full security of procedure calls can be
provided at their invocation.
B. SOURCE LANGUAGE CHARACTERISTICS
B.a
Neither the language definition nor the translator
should limit the size of program components.
This is an example of the principle that a programming
language should not impose arbitrary rules and
restrictions which must be learned and dealt with
by the programmer. Neither the language nor the
translator should limit the maximum array dimensions,
the length of identifiers, the maximum number of
parenthesis levels, the size of data structures,
or the number of identifiers. Program components
which affect the object representation of programs
will, of course, have limits imposed by the object
machine. The translator should report when the
program exceeds the resources of the intended
object machine but should not build in arbitrary
limits of its own.
B.b
Each data structure and operation of the Kernel
language should provide a single capability which
is composable and has a straightforward implementation
in the object code of conventional architecture
machines.
Kernel language data and operations should be simple
and provide a single capability so that their use
does not impose costs for unwanted capability. They
should be composable so they can be used as building
blocks for more specialized capabilities. They
should be compatible with object machines so that
they have low cost implementation.
B.c
There should be no defaults in programs which affect the
program logic. Decisions which affect program logic
should either be made irrevocably when the language
is designed or made explicit in each program.
The only alternative is implementation dependent defaults
with the translator determining the meaning of programs.
What a program will do should be determinable from the
program and the defining documentation for the programming
language. Omission of any selection which affects the program
logic should be treated as an error by the translator.
B.d
Defaults should be provided for special capabilities affecting
object representation and other properties which the average
programmer does not know or care about. Such defaults should
always mean that the programmer does not care which choice
is made.
The language should be oriented to provide a high degree of
management control and visibility to programs and toward
self documenting programs with the programmer required
to make his decisions explicit. On the other hand, the
programmer should not be forced to overspecify his program
and thereby cloud its logic, unnecessarily eliminate
opportunities for optimization, and misrepresent
arbitrary choices as essential to the program logic.
Defaults should be allowed, in fact encouraged, in don't
care situations.
B.e
No language defined symbols appearing in the same
program should have essentially different meanings.
This contributes to the clarity and uniformity of
programs, protects against psychological ambiguity
and avoids some error prone features of extant languages.
In particular, this would exclude the use of = to
imply both assignment and equality, would exclude
conventions implying that parenthesized parameters have
special semantics (as with PL/I subroutines), and would
exclude the use of a colon to both declare a label and
separate input and output parameters (as in Jovial).
It would not, however, require different operator
symbols for integer, real or even matrix arithmetic,
since these are, in fact, uses of the same abstract
operations.
B.f
There should be source language capability for specifying
the intended object environment.
When a language has different host and object machines
and when its compiler can produce code for several
object machines or several configurations of a given
object machine, the programmer should be able to
document and to specify the intended object machine
configuration within the source language program. The
object environment specification should include the
correct computer model, the memory size, any special
hardware options, the operating system if present,
special object site conventions, and the peripheral
configurations. These specifications might be simply
a list of identifiers and would probably be canned as
library elements when several programs are being
developed for the same object machine.
B.g
The source language should permit inclusion of assertions,
assumptions, axiomatic definitions of data types, and
units of measure in programs. Because there is currently
no best notation for these purposes the language should
not impose any particular syntax for their use.
There are many opinions on the desirability, usefulness,
and proper form of each of these specifications. It is
clear that better program documentation is needed and
that specifications of these kinds may help. Specifications
also introduce the possibility of automated testing,
formal program proofs, and dimensional analysis. The
language should not prohibit inclusion of these forms of
specification but neither should any particular form be
imposed for their use, or translators required to take
special action on them. The presence or absence of
assertions, assumptions, axiomatic definitions, units of
measure or comments in source language programs should not
affect the translator's ability to translate the program
and generate object code.
B.1 DATA TYPES
B.1.a
The use of defined types should be indistinguishable
from built-in types.
There should be no special cases, ad hoc, or inconsistent
rules to interfere and complicate learning, using and
implementing the language. If built-in features
and user defined extensions are treated in the same
way throughout the language so that the kernel language,
standard application-oriented extensions,
extensions and application programs are treated
in a uniform manner by the user and by the translator
then these distinctions will grow dim to everyone's
advantage. When the language contains all the essential
power, when few can tell the difference between
the kernel language and the extensions, and when
extensions to the source language do not impact the
compiler and its standardization, then there is no
incentive to proliferate languages.
B.1.b
The language should provide data types for integer,
real, Boolean, character, array (i.e., composite
data structures with indexable components of
homogeneous type), and record (i.e., composite data
structures with labeled components of heterogeneous
type) types.
These are the common data types of most programming
languages and object machines and are sufficient
to mechanize any other desired type.
B.1.c
The language should provide a pointer mechanism
which can be used to point within specified composite
data structures to build data with shared and/or
recursive substructure; but variables and
expressions of pointer type are not desired.
The need for pointers is obvious in building data
structures with shared or recursive substructures;
such as, directed graphs, stacks, queues, and list
structures. Unfortunately, providing pointers as
absolute address data types produces a gap in
security mechanisms and encourages the development
of ad hoc data structures incapable of comprehension or
proof. The desired pointer capability is that
required to build a data structure containing
fields which need not be collocated with the
structure. That is, some fields are indirectly
named rather than being allocated within the structure
itself. There is no requirement for pointer variables,
for pointers to data of unknown type, nor for
pointers to variables.
B.1.d
Two types of reals should be provided: normalized
and floating point numbers and fixed point numbers in
the interval -1 to 1. Scale factor management
for fixed point numbers should be the responsibility
of the user.
Many small machines do not have floating point hardware
and some applications require greater precision
than can be obtained from the floating point hardware
of their object machines. Both floating point and
fixed point arithmetic should be provided, but scale
management for fixed point should be left to the
programmer and no special effort should be made to
encourage the use of fixed point.
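The division of labor described here can be sketched as
follows. This is an illustration in Python, which is not a
language named by this document; the 16-bit "Q15" layout
and the names to_q15, q15_mul, from_q15 and FULL_SCALE are
assumptions made for the example. The fixed point value is
always a fraction in the interval -1 to 1, and the scale
factor that relates it to the physical quantity is carried
by the user.

```python
# Assumed representation: a fraction in [-1, 1) stored as a
# signed integer scaled by 2**15 (often called Q15).
Q15_ONE = 1 << 15


def to_q15(x):
    """Convert a fraction in [-1, 1) to its scaled integer form."""
    if not -1.0 <= x < 1.0:
        raise ValueError("value outside the interval -1 to 1")
    # Clamp so that rounding near +1 cannot overflow the format.
    return min(int(round(x * Q15_ONE)), Q15_ONE - 1)


def q15_mul(a, b):
    """Multiply two fractions; the product is rescaled by 2**15."""
    return (a * b) >> 15


def from_q15(a):
    """Convert a scaled integer back to a fraction."""
    return a / Q15_ONE


# Scale management is the user's job: a length of 3.0 with a
# user-chosen full scale of 8.0 is stored as the fraction 0.375.
FULL_SCALE = 8.0
stored = to_q15(3.0 / FULL_SCALE)
halved = q15_mul(stored, to_q15(0.5))
length = from_q15(halved) * FULL_SCALE
```

Nothing in the arithmetic routines knows about FULL_SCALE;
only the user's code applies and removes it.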
B.1.e
The source language should require global specification
of the precision for real numbers. This specification
should be interpreted as the maximum precision
required by the program logic and the minimum
precision to be supported by the object code.
Machine independence, in the use of real numbers,
can be achieved only if the user can place constraints
on the translator and object machine without forcing
a specific mechanization of the arithmetic. Precision
specifications, as the maximum required by the
program and the minimum to be implemented by the
object code, provide all the power and guarantees
needed by the programmer without unnecessarily
imposing on the object machine. Precision
specifications do not change the type of reals or
the set of applicable operations.
B.1.f
The character set and collating sequence for character
data should be specifiable within user designated
program scopes.
The character set to be used in data is often determined
by the object machine and its peripheral devices.
In some cases, several character sets may be required in
the same program. The user should be able to define the
desired character set within his program, and should be
able to convert between character sets. The definitions
of the most common character sets (including ASCII) might
be made available in the standard library.
B.1.g
The language should require user specification of the
number of dimensions, the range of subscript values
for each dimension, and the type of each array.
The number of dimensions and type should be determined
at compile time.
This allows static arrays (which can be allocated at
compile or load time) and automatic arrays (which
can be allocated at scope entry). These are sufficient
to permit allocation of space pools for management
of more complex data structures including dynamic
arrays. The range of subscript values for any given
dimension should be a contiguous subsequence of some
enumeration type. It has been suggested that the
lower bound on array subscripts (i.e., the array
origin) be fixed by the language definition at 0 or
1. Certainly the origin should be determinable at
compile time, but limiting the origin to 0 or 1
would be an arbitrary special case decision to aid
the compiler writer at the expense of application
programs. The run time costs of implementing origin
1 are no more than for any other nonzero origin
known at compile time. Most programmers are not used
to origin 0 and find it inconvenient or unnatural.
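The cost argument can be made concrete with a small sketch
in Python (used here only for illustration; the class
BoundedArray and its methods are hypothetical). Any origin
known in advance costs one subtraction of a constant per
access, whether that origin is 1 or 1970.

```python
class BoundedArray:
    """One-dimensional array whose subscripts run from lo to hi."""

    def __init__(self, lo, hi, fill):
        if hi < lo:
            raise ValueError("empty subscript range")
        self.lo = lo
        self.hi = hi
        # The initial value is given explicitly; none is assumed.
        self._cells = [fill] * (hi - lo + 1)

    def _offset(self, i):
        if not self.lo <= i <= self.hi:
            raise IndexError("subscript out of range")
        # The whole cost of a nonzero origin: one subtraction.
        return i - self.lo

    def __getitem__(self, i):
        return self._cells[self._offset(i)]

    def __setitem__(self, i, value):
        self._cells[self._offset(i)] = value


# Subscripts chosen to suit the application, not the translator.
launches = BoundedArray(1970, 1975, 0)
launches[1975] = 42
```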
B.1.h
The language should permit records to have alternative
structures, each of which is fixed at compile time.
The name and type of each record component should be
specified by the user at compile time.
This provides all that is safe to use in CMS-2 and
JOVIAL OVERLAY and in FORTRAN EQUIVALENCE. It permits
hierarchically structured data of heterogeneous type,
permits records to have alternative structures as
long as each structure is fixed at compile time and
the choice is fully discriminated at run time, but
does not permit arbitrary references to memory through
renaming nor does it permit dropping type checking
to handle overlaid structures.
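A record with alternative structures, each fixed in advance
and fully discriminated at run time, might look like the
following Python sketch. The types Circle and Rectangle and
the function area are invented for the example. Every
access goes through a test of which alternative is present,
so no field is ever read under the wrong interpretation.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float


# The record type is one of a fixed list of alternatives.
Shape = Union[Circle, Rectangle]


def area(shape):
    """Select the computation by discriminating the alternative."""
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rectangle):
        return shape.width * shape.height
    raise TypeError("not one of the declared alternatives")
```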
B.1.i
The user should be able to specify whether composite
data structures are to be packed for maximum storage
utilization or unpacked to minimize access time.
Packed data should have uniform field sizes
independent of the object machine.
Data can be placed one item per machine word (or
half word, or double word) for inexpensive
access or it can be packed to maximal density to
conserve storage space. The user should be permitted
to specify which if it is important to his program.
If he does not specify then the packing should be
optimal as determined by the compiler; neither choice
should be a default. Dense data is required when
dealing with large data files which also must be
transferred among different machines. If field
sizes are determined directly from the description
of the data then there will be a machine independent
bit equivalent form for transferring data (e.g., the
COBOL data description for records).
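As an illustration of field sizes fixed by the data
description rather than by the machine, the sketch below
uses the Python standard library module struct. The record
layout (a 16-bit identifier, an 8-bit flag field and a
32-bit signed reading) is hypothetical.

```python
import struct

# ">" requests big-endian byte order, standard field sizes and
# no padding, so the layout is the same on every machine.
# H = 16-bit unsigned, B = 8-bit unsigned, i = 32-bit signed.
PACKED = struct.Struct(">HBi")

record = PACKED.pack(513, 7, -2)
fields = PACKED.unpack(record)
```

The packed form occupies 7 bytes and can be transferred
between machines bit for bit.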
B.2 OPERATIONS
B.2.a
Assignment and access operations should be automatically
defined for all data types. The assignment operation
should permit any value of a given type to be assigned
to a variable, array or record component of that type.
Variables should be available for all data types.
Variables are useful only when there exist corresponding
access and assignment operations. Because no special
semantics is required as a function of the type for
reference and assignment, they can be defined automatically.
B.2.b
The source language should have built-in equivalence
and nonequivalence operations which can be used to
compare any two data objects (regardless of type
compatibility) for identity.
Equivalence is an essential universal operation which
should not be subject to restriction on its use. Proper
semantic interpretation of equivalence requires that
operands of disjoint types never be equivalent.
Consequently, its usefulness at run time is restricted
to data of the same type or of types with nonempty
intersections. In any case, the test should be for logical
identity. The use of equivalence is not recommended
for real numbers but resolution of what equivalence means
for imprecise quantities is a problem of numerical
analysis not language design.
B.2.c
Relational operations should be automatically defined
for numeric data and all types defined by enumeration.
Numbers and types defined by enumeration have an
obvious ordering which should be available through
relational operations. The same mechanism might be
used for the character set collating sequence (i.e.,
define character set as an enumeration of characters).
B.2.d
The built-in operations for numbers should include:
addition, subtraction, multiplication, division (with
a real result) and modulo division.
These are the most widely used numeric operations and
are available as hardware operations in most machines.
B.2.e
No arithmetic operation which is within the precision
or range specifications of the program should ever
truncate the most significant digits of a numeric
quantity; truncation and rounding should always be
on the least significant digits.
This requirement seems obvious, particularly for floating
point numbers, and yet many of our existing languages
truncate the most significant mantissa digits in some
mixed and floating point operations. The language
should adhere to the "law of least astonishment".
B.2.f
The built-in Boolean operations should include and,
or and xor. The operations and and or on scalars
should be evaluated in short circuit mode.
Short circuit mode means that and and or are in fact
control operations which do not evaluate their second
argument if the value of the first argument is false
or true, respectively. Short circuit evaluation has
no disadvantages over the corresponding computational
operations and sometimes produces faster executing
code, particularly in languages where the user can
rely on the short circuit execution.
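Python's and and or happen to follow the short circuit
rule described here, so the behavior can be shown directly.
The helper names check and ratio_exceeds_one are invented
for the example.

```python
calls = []


def check(name, value):
    """Record that this operand was evaluated, then yield it."""
    calls.append(name)
    return value


# "b" is never evaluated because "a" is false;
# "d" is never evaluated because "c" is true.
result_and = check("a", False) and check("b", True)
result_or = check("c", True) or check("d", False)


def ratio_exceeds_one(num, den):
    """Rely on short circuit mode to guard the division."""
    return den != 0 and num / den > 1
```

The second function shows why a user who can rely on short
circuit execution writes simpler code: the guard and the
guarded operation fit in one expression.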
B.2.g
The source language should permit scalar operations
to be applied to conformable arrays and records to
indicate component by component operations.
Conformability should require exactly the same number
of components and one for one compatibility in type.
For arrays, correspondence should be by position in
similarly shaped arrays. For records, correspondence
should be by component name. In many situations
component by component operations are done on array
and record elements. In fact, a primary reason for
having arrays is to permit large numbers of similarly
treated objects to have a uniform notation. The
COBOL language is built around the idea of operations
on corresponding components of records. Component
by component operations available directly in the
source language hide the details of the sequencing
and thereby simplify the program and make more
optimizations available. In addition they permit
simultaneous execution on machines with parallel
processing hardware. Although component by component
operations should be available for built-in
composite data structures which are used to define
application oriented structures, that capability
should not be automatically inherited by defined
data structures. A matrix might be defined using
arrays, but it should not inherit the array operations
automatically. Multiplication for matrices would
for example be unnatural, confusing and inconvenient
if the product operator for matrices were interpreted
as a component by component operation instead of
cross product. Component by component operations
will also allow operations on character strings
represented as vectors of characters and efficient
Boolean vector operations.
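The two correspondence rules, by position for arrays and by
component name for records, can be sketched in Python as
below. The functions map_array and map_record are
hypothetical, arrays are modeled as lists and records as
dictionaries, and the type compatibility check is omitted
for brevity.

```python
import operator


def map_array(op, xs, ys):
    """Apply a scalar operation position by position."""
    if len(xs) != len(ys):
        raise ValueError("arrays are not conformable")
    return [op(x, y) for x, y in zip(xs, ys)]


def map_record(op, r, s):
    """Apply a scalar operation to components of the same name."""
    if r.keys() != s.keys():
        raise ValueError("records are not conformable")
    return {name: op(r[name], s[name]) for name in r}


sums = map_array(operator.add, [1, 2, 3], [10, 20, 30])
masks = map_array(operator.and_, [True, False], [True, True])
totals = map_record(
    operator.add,
    {"hours": 8, "pay": 100},
    {"pay": 50, "hours": 2},  # order of components is irrelevant
)
```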
B.2.h
Explicit type conversion operations should not be
required for floating point arithmetic with integer
or fixed point arguments, nor for conversion between
numeric ranges.
An explicit integer to floating point operation is
not required because within the specified real
precision any range of integers is a subset of the
same range of reals. Similarly the possible fixed
point values will always be a subset of the floating
point values of the same precision. Because ranges
do not form closed systems, range validation is not
possible at compile time (e.g., I = I + 1 may be
a range error). At best, the compiler might point
out likely range errors.
B.3 VARIABLES, LITERALS, AND CONSTANTS
B.3.a
The user should have the ability to associate constant
values with identifiers.
The use of identifiers to represent literal values
has often made programs more readable, more easily
modifiable and less prone to error when the value
of a constant must be changed. Associating constant
values with an identifier is preferable to assigning
the value to a variable because it is then clearly
marked in the program as a constant, can be checked
for unintentional changes, and often can have a more
efficient object representation.
B.3.b
The language should provide a syntax and a consistent
interpretation for numeric literals. Numeric literals
should have the same value (within the specified
precision) in both programs and data.
The point here, and one that should be obvious to any
programmer who must use numeric data, is that regardless
of the source of the data and regardless of the object
machine the value of constants should be the same. For
integers it should be exact and for reals it should be
the same within the specified precision. Compiler writers,
however, would disagree. They object to this requirement
on two grounds: that it is too costly if the host and
object machines are different and that it is unnecessary
if they are the same. In fact, all costs are at compile-
time and must be insignificant compared to the life time
costs resulting from object code containing the wrong
constant values. As for being unnecessary, there
have been all too many cases of different values from
program and data literals on the same machine because
the compile-time and run-time conversion packages
were different and imprecise.
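The requirement amounts to a simple property that can be
tested. In the Python sketch below, which serves only as an
illustration, the same literal text is converted once by the
translator and once by a run-time conversion routine, and
the two results must agree.

```python
# The literal as it appears in the program text.
program_real = 0.1
program_integer = 1000000

# The same literals arriving as data and converted at run time.
data_real = float("0.1")
data_integer = int("1000000")

reals_agree = program_real == data_real
integers_agree = program_integer == data_integer
```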
B.3.c
The language should permit the user to specify the
initial values of individual variables at the time
of their allocation. There should be no default initial
values. It should be considered an error if a variable
is accessed before it obtains an initial value.
The ability to initialize variables at the time of their
allocation will contribute to program clarity, but a
requirement to do so would be an arbitrary and sometimes
costly decision to the user. Default initial
values, on the other hand, contribute to neither program
clarity nor correctness and can be even more costly at
run-time. Every variable must be initialized before
it is accessed or its value will be unpredictable
garbage with no chance for program correctness. The
translator should treat any access to a variable before
it has been assigned as an error. Whether a variable
will be assigned a value is in general unsolvable at
compile time, but in those cases in which it is not easily
determined by the translator, it will not be easily
determined by the programmer and those who must maintain
the program and should, therefore, be considered an
error.
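Where the check cannot be made by the translator, it can be
made at run time. The Python sketch below is one possible
arrangement, not a prescribed mechanism; the class Variable
and the exception UninitializedAccess are hypothetical. The
variable has no default initial value, and any access before
assignment is reported as an error rather than yielding
garbage.

```python
class UninitializedAccess(Exception):
    """Raised when a variable is read before it has a value."""


class Variable:
    _UNSET = object()  # private marker; never a legitimate value

    def __init__(self, name):
        self.name = name
        self._value = Variable._UNSET

    def assign(self, value):
        self._value = value

    def access(self):
        if self._value is Variable._UNSET:
            raise UninitializedAccess(self.name)
        return self._value


count = Variable("count")
try:
    count.access()
    caught = False
except UninitializedAccess:
    caught = True

count.assign(3)
value = count.access()
```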
B.3.d
The source language should require its users to
individually specify the range of values for integer
variables. These specifications should be interpreted
as an upper bound on the range of values which will
be assigned to a variable and a lower bound on the
range which must be supported by the object code.
Range specifications should not be interpreted as
defining new types.
Range specifications are a special form of assertion.
They aid in understanding and determining the correctness of
programs. They can also be used as additional
information by the compiler in deciding what storage
and allocation to use (e.g., half words may be more
efficient for integers in the range 0 to 1000). Range
specifications also offer the opportunity for the
translator to automatically insert range tests for
run-time or debug-time validation of the program
logic. With variable ranges specified in the program,
it becomes possible to perform many subscript bounds
checks at compile-time. These bounds checks, however,
will be only as valid as the range specifications which
cannot, in general, be validated at compile-time.
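Both uses of a range specification, choosing storage and
inserting run-time tests, can be sketched in Python. The
functions bits_needed and check_range are hypothetical, and
a two's complement representation is assumed for ranges
that include negative values.

```python
def bits_needed(lo, hi):
    """Smallest field width, in bits, that holds lo..hi."""
    if lo > hi:
        raise ValueError("empty range")
    if lo >= 0:
        return max(hi.bit_length(), 1)
    # Assumed two's complement: one extra bit for the sign.
    return max(max(hi, 0).bit_length(), (-lo - 1).bit_length()) + 1


def check_range(value, lo, hi):
    """The test a translator might insert at each assignment."""
    if not lo <= value <= hi:
        raise ValueError("value outside the specified range")
    return value


LO, HI = 0, 1000
width = bits_needed(LO, HI)        # fits easily in a half word
count = check_range(999, LO, HI)
```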
B.3.e
The range of values which can be associated with a
variable, array, or record component may be any
built-in type, any defined type or a subset of any
enumeration type.
B.4 EXTENSION FACILITIES
B.4.a
There should be no default declarations. Each program
element should be defined in the kernel language, in
a library extension, or in the program.
As programmers, we should not expect the translator
to write our programs for us. If we somehow know
that the translator's default convention is
compatible with our needs for the case at hand,
we should still document the choice so others can
understand and maintain our programs. Neither should
we be able to delay definitions (possibly forget
them) until they cause trouble in the operational
system.
B.4.b
The user should be able, within the source language,
to extend existing operations to new data types.
When an operation is an abstraction of an existing
operation for a new type or is a generalization of
an existing operation, it is inconvenient, confusing
and misleading to use any but the existing operator
symbol or name.
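As an illustration, Python lets a user-defined type supply
its own meaning for an existing operator symbol. The type
Vec2 below is invented for the example. Addition on the new
type is written with the same + used for built-in numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    x: float
    y: float

    def __add__(self, other):
        # Extend the existing + to the new type only.
        if not isinstance(other, Vec2):
            return NotImplemented
        return Vec2(self.x + other.x, self.y + other.y)


scalar_sum = 1.0 + 2.0
vector_sum = Vec2(1.0, 2.0) + Vec2(3.0, 4.0)
```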
B.4.c
Type definitions in the source language should
include, as a unit, both the class of data objects
comprising the type and the set of operations
applicable to that class.
Types define abstract data objects with special
properties. The data objects are given a representation
in terms of existing data structures, but they are
of little value until operators are available to
take advantage of their special properties. When
we obtain access to a type, we need its operations
as well as its data. Numeric data is needed in many
applications but is of no value to any
without arithmetic operations. Neither should a defined
type automatically inherit the operations of the data
with which it is represented.
B.4.d
The data objects comprising a defined type should
be definable by enumeration of their literal names,
as Cartesian products of existing types (i.e., as
array and record classes), by discriminated union
(i.e., as the union of disjoint types) and as the
power set of an enumeration type.
This list comprises the currently known set of useful
definitional mechanisms for data types which do not
require run-time support, such as, garbage collection
and dynamic storage allocation. These mechanisms
are sufficient to define data sequences, recursive
data structures, and efficient sparse data structures.
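Three of these definitional mechanisms are sketched below
in Python: enumeration, Cartesian product and power set
(discriminated union is left out to keep the example
short). The names Color, Pixel and power_set are
hypothetical.

```python
from enum import Enum
from itertools import combinations
from typing import NamedTuple


class Color(Enum):
    """Definition by enumeration of literal names."""
    RED = 1
    GREEN = 2
    BLUE = 3


class Pixel(NamedTuple):
    """Definition as a Cartesian product of existing types."""
    x: int
    y: int
    color: Color


def power_set(enum_type):
    """All subsets of the members of an enumeration type."""
    members = list(enum_type)
    return {
        frozenset(subset)
        for size in range(len(members) + 1)
        for subset in combinations(members, size)
    }


palettes = power_set(Color)
corner = Pixel(0, 0, Color.RED)
```

An enumeration of three names yields a power set of eight
values, which a translator could represent in three bits.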
B.4.e
Type definition by free union and subsetting is not
desired.
Free union adds no new power not provided by discriminated
union but does require giving up the security
of types in return for programmer freedom. Range
or subset specifications on variables are useful
documentation and debugging aids but should not
be construed as types. Subsets do not introduce
new properties or operations not available to the
superset and often do not form a closed system
under the superset operations. Unlike types,
membership in subsets can be determined only at
run time.
B.4.f
The source language should permit user specification
of the axiomatic properties of a defined type
independent of the particular mechanization used to
implement those properties.
Programming languages require specification of not
only the effect of programs, routines, and expressions
but how those actions are to take place. Often
decisions are made arbitrarily and are nonconsequential
when made but are not identified as such. If there
is no note made of which decisions were intended
and which are arbitrary, the program will grow to
rely on the arbitrary decisions and neither the
translator nor the programmer will be able to predict
the consequences when a better choice is found.
B.4.g
When defining a type the user should be able to specify
the initialization procedure for the type and the
actions to be taken at the time of allocation and
deallocation of variables of that type.
It is often necessary to do bookkeeping or to take
other special action when variables of a given type
are allocated or deallocated. The language should
not limit the class of definable types by withholding
the ability to define those actions. Initialization
might take place once when the type is allocated
(i.e., in its allocation scope) and would be used
to set up the procedures and initialize the variables
which are local to the type definition.
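One way to picture the three moments (initialization of the
type, allocation of a variable, deallocation of a variable)
is the following Python sketch. The class Buffer is
hypothetical, and the scope of a variable is modeled with a
with statement, which is an assumption of this example
rather than anything required by the text.

```python
class Buffer:
    # Runs once, when the type itself is set up: bookkeeping
    # that is local to the type definition.
    live = 0

    def __enter__(self):
        # Action taken when a variable of the type is allocated.
        Buffer.live += 1
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Action taken when the variable is deallocated.
        Buffer.live -= 1
        return False  # do not suppress errors from the scope


before = Buffer.live
with Buffer() as outer:
    with Buffer() as inner:
        during = Buffer.live
after = Buffer.live
```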
B.5 SCOPES
B.5.a
The language should allow the user to distinguish
between scope of allocation and scope of access.
The scope of allocation of a program structure is
that region of the program for which the object
representation of the structure should be present.
The allocation scope defines the program scope
for which own variables of the structure must be
maintained and identifies the time for initialization
of the structure. The access scope defines the
regions of the program in which the allocated
structure is accessible to the program. In some
cases the user may desire that each use of a
defined program structure be independent (i.e., the
allocation and accessing scopes would be identical).
In other cases, the various accessing scopes
might share a common allocation of the structure.
B.5.b
The ability to limit the scope of access for separately
defined structures should be available to both the
designer and the user of the structure.
Limited access specified in a type definition is
necessary to guarantee that changes to data representations
and to management routines which purportedly do not
affect the calling programs, are in fact safe.
By rigorously controlling the set of operations
applicable to a defined type, the type definition
guarantees that no external use of the type can
accidentally or intentionally use hidden nonessential
properties of the type.
Limited access on the call side provides a high
degree of security and eliminates nonessential
naming conflicts without limiting the degree of
accessibility which can be built into programs.
The alternative notion, that all declarations which
are external to a program should have the same scope,
is inconvenient and costly in creating large systems
which are composed from many subsystems because it
forces global access scopes and the attendant naming
conflicts on subsystems not using the defined items.
B.5.c
The scope of identifiers should be wholly determined
at compile time. Identifiers should be introduced
at the beginning of their scope and multiple use of
identifiers should not be allowed in the same scope
except for embedded blocks in which case the innermost
identifier should apply.
The language should use conventional scope rules
while making declarations and other definitions of
identifiers easy to recognize and avoiding errors
and ambiguities from multiple use of identifiers
in a single scope.
B.6 EXPRESSIONS
B.6.a
There should be no order dependent side effects in
expressions.
This is a semantic restriction saying that the effect
of evaluating an expression (at least from the point of
view of the caller) should be independent of the order
in which the arguments to the expression are evaluated.
This is less restrictive to the compiler and the
generation of efficient object code than is a straight
left-to-right or other language imposed operand order
execution rule. It is less restrictive to the programmer
than a strict no side effect rule. It would, for example,
allow imbedded assignments within expressions providing
they do not assign to variables used elsewhere in the
expression.
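A minimal Python sketch of the kind of order dependence this requirement is meant to exclude follows. The functions make_counter, left_first and right_first are hypothetical names; the two operand orders are written out explicitly so that the difference does not rely on any particular language's evaluation rule.

```python
def make_counter():
    """Return two operations that share one piece of state."""
    state = {"n": 0}

    def bump():
        # Side effect: modifies the state that peek() reads.
        state["n"] += 1
        return state["n"]

    def peek():
        return state["n"]

    return bump, peek


def left_first():
    # Evaluate the operands of  bump() * 10 + peek()  left to right.
    bump, peek = make_counter()
    a = bump()
    b = peek()
    return a * 10 + b


def right_first():
    # Evaluate the same operands right to left.
    bump, peek = make_counter()
    b = peek()
    a = bump()
    return a * 10 + b
```

Here left_first() and right_first() give different values, so the expression has an order dependent side effect and would not be permitted. Had bump modified a variable used nowhere else in the expression, both orders would agree.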
B.6.b
The order of execution of operations within an
expression should be obvious to the reader. There
should be few levels of operator hierarchy and they
should be widely recognized.
Care must be taken to ensure that the execution order
of operators within expressions is not psychologically
ambiguous. That is, to guarantee that the order
implemented by the language is the same as intended by
the programmer and understood by those reading the program.
This kind of problem can be minimized
by having few precedence levels, by allowing explicit
parentheses to specify the intended execution order, and by
requiring explicit parentheses in sequences of non-associative
operators at the same precedence level
(e.g., x/y/z should not be allowed without parentheses).
If user defined in-fix operators are permitted, explicit
parentheses should be required for their use.
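The x/y/z case can be made concrete with a short Python sketch. The variable names are illustrative only; the point is that the two possible groupings give different results, so the unparenthesized form leaves the reader to guess.

```python
x, y, z = 8.0, 4.0, 2.0

# The two readings of the unparenthesized form x / y / z.
left_grouping = (x / y) / z    # (8 / 4) / 2
right_grouping = x / (y / z)   # 8 / (4 / 2)
```

With these values left_grouping is 1.0 and right_grouping is 4.0. Requiring the parentheses forces the programmer to state which one is meant.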
B.6.c
Expressions of a given type should be permitted anywhere
in source programs where constants or references to
variables of that type are allowed.
This is just a special case of not imposing arbitrary
restrictions and special case rules on the user of the
source language. Special mention is made here only
because so many languages do restrict the form of
expressions. FORTRAN, for example, has a list of seven
different syntactic forms for subscript expressions but
does not permit all forms of arithmetic expressions.
B.6.d
Constant expressions in programs should be evaluated
at compile or load time.
The ability to write constant expressions in programs has
proven valuable in languages with this capability,
particularly with regard to program readability and in
avoiding programmer error in externally evaluating and
transcribing constant expressions. They are most often
used in declarations. There is no need, however, that
constant expressions impose run-time costs for their evaluation.
They can be evaluated once at compile time or if this is
inconvenient because of incompatibilities between the
host and object machines, the compiler can generate code
for their evaluation at load time. In any case, the
resulting value should be the same (at least within the stated
precision) regardless of the object machine.
B.7 CONTROL STRUCTURES
B.7.a
The language should provide structured control mechanisms
for sequential, conditional, iterative, recursive,
pseudo parallel processing, exception handling and
asynchronous interrupt handling.
These mechanisms provide a spanning set of
control structures. Adding additional kinds would
be redundant; omitting any of these will leave a gap
in the classes of programs which can be written without
resorting to machine level primitives. Which operations
are most appropriate in several of these areas is
an open question. For the present, the choice should
be a complete set of composable control primitives
each of which is easily mapped onto object machines
and which does not impose run-time charges for unused
or unneeded generality.
B.7.b
The source language should provide a "go to" operation
applicable to program labels within its most local
scope of definition.
The go to is a machine level capability which is still
needed to fill in any gaps which might remain in the
choice of structured control primitives, to provide
compatibility for transliterating programs written in
older languages, and because of the wide familiarity
of current practitioners with its use. The language
should not, however, impose unnecessary costs for its
presence. The go to should be limited to explicitly
specified program labels within the most local scope
of definition. The go to should not be used to exit
procedures or scope blocks. Neither should the language
provide specialized facilities which encourage its
use in dangerous and confusing ways. Switches, designational expressions,
label variables, label parameters and numeric labels are all undesirable.
B.7.c
The conditional control structures should be fully partitioned
and should permit selection among alternative
computations based on the value of a Boolean expression,
on the subtype of a value from a discriminated union,
or on a computed choice among labeled alternatives.
The conditional control operations should be fully
partitioned so that choice is clear and explicit in
each case. There should be some general form of
conditional which allows an arbitrary computation to
determine the label chosen (e.g., Zahn's device provides a
good solution to the general problem). Special cases
are also needed for the more common cases of the Boolean
expression (e.g., if then else) and for value or type
discrimination (e.g., case on one of a set of values
or subtype of a union).
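One possible reading of a fully partitioned conditional on the subtype of a discriminated union is sketched below in Python. The types Circle and Rect and the function area are hypothetical; the final branch makes the partition explicit by rejecting any value that falls outside the listed alternatives.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass
class Circle:
    radius: float


@dataclass
class Rect:
    width: float
    height: float


# A discriminated union: every Shape is exactly one of the alternatives.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    # Each alternative of the union is handled by exactly one branch.
    if isinstance(shape, Circle):
        return math.pi * shape.radius * shape.radius
    elif isinstance(shape, Rect):
        return shape.width * shape.height
    else:
        # No silent fall-through: an unlisted case is an error.
        raise TypeError("not a Shape: %r" % (shape,))
```

In a language meeting this requirement the translator, not a run-time check, would be expected to confirm that the alternatives cover the union.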
B.7.d
The iterative control structure should permit the
termination condition to appear anywhere in the loop,
should permit control variables to be local to the
iterative control, and should not impose excessive
overhead in clarity or run time execution costs for
common special case termination conditions (e.g., fixed
number of iterations or elements of an array exhausted).
In its most general form, a programmed loop is executed
repetitively until some computed predicate becomes
true. There may be more than one terminating predicate,
and they might appear anywhere in the loop. Specialized
control structures (e.g., While do) have been used for
the common situation in which the termination condition
precedes each iteration. The most common case is termination
after a fixed number of iterations and a specialized
control structure should be provided for that purpose
(e.g., FORTRAN DO or Algol for). A problem which arises
in many programming languages is that loop control
variables are global to the iterative control and thus
will have a value after loop termination but that value
is usually an accident of the implementation. Specifying
the meaning of control variables after loop termination
in the language definition resolves the ambiguity but
must be an arbitrary decision which will not aid
program clarity or correctness, and will interfere with
the generation of efficient object code. Loop control
variables are, by definition, variables used to control
the repetitive execution of a programmed loop and, as
such, have, and should have, meaning only during loop
executions.
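The two loop forms discussed above can be sketched in Python as follows. The function names are hypothetical. Note that Python does not itself make loop control variables local to the loop; each loop is placed in its own function here so that the control variable has no meaning to the caller.

```python
def read_until_sentinel(items, sentinel):
    """General form: the termination test sits in the middle of the loop."""
    collected = []
    source = iter(items)
    while True:
        item = next(source, sentinel)   # work done before the test
        if item == sentinel:            # termination condition
            break
        collected.append(item)          # work done after the test
    return collected


def sum_first(values, count):
    """Common special case: a fixed number of iterations."""
    total = 0
    for i in range(count):
        total += values[i]
    return total
```

The first loop also stops when the input is exhausted, because next is given the sentinel as its default value.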
B.7.e
There should be no source language distinctions between
recursive and nonrecursive procedures.
Recursion is desirable in many applications because it
is a neat and elegant concept which can shorten and
clarify programs and simplify proof procedures. Recursion is
required in order to avoid unnecessarily opaque, complex
and confusing programs when operating on recursive
data structures. If recursive and nonrecursive procedures
are marked in the source language, that specification
represents just one more special case to be
learned and dealt with by the user. The objections
to recursion come from a feeling that recursion
requires greater run-time costs in time and
space than do nonrecursive procedures. In fact,
recursion and iteration have the same costs in many
cases, and stack allocation of procedure bodies can
save space at run-time. The problem has been that
recursion has, for the most part, been implemented only
in environments in which run-time efficiency has not
been of great importance and, therefore, has been implemented
in a straightforward, inefficient
manner in which the user pays the full cost of the
worst case recursion for all procedures. As with
any other feature, procedures should be implemented
in the most efficient manner consistent with their use.
In particular, if there are costs inherent in the use
of recursion, they should not be charged to non-recursive
procedures. Optimizations and special case
processing should, however, be the responsibility of
the translator and not the user.
B.7.f
The pseudo parallel processing capabilities should
include the ability to create, pass control among, and
terminate processes.
The particular form of parallel processing, interleaved
execution, or coroutine to be used should be left to
the user. The kernel language should, however,
provide a low level capability for creating processes,
passing control among them and terminating them so
the user can build his own form of parallel or coroutine
processing. Creation of processes, of necessity,
requires dynamic storage allocation. The kernel
capability should be such that only parallel processes
and coroutines pay that price and it is desirable that
the user be able to specify the allocation scheme.
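As a rough illustration of the three low level capabilities named here (create, pass control, terminate), the following Python sketch builds a simple round robin discipline out of generators. Generators are assumed only as a convenient stand-in for coroutines; worker and run_round_robin are hypothetical names, and the scheduling policy is one the user has chosen, not one the language imposes.

```python
def worker(name, steps, log):
    """A process body: does one unit of work, then gives up control."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # pass control back to whoever resumed this process


def run_round_robin(processes):
    """User-written scheduler: resume each live process in turn."""
    ready = list(processes)
    while ready:
        proc = ready.pop(0)
        try:
            next(proc)          # pass control to the process
            ready.append(proc)  # still alive, so queue it again
        except StopIteration:
            pass                # the process ran to completion


log = []
run_round_robin([worker("a", 2, log), worker("b", 1, log)])

# A process can also be terminated from outside before it completes.
stopped = worker("c", 5, log)
next(stopped)      # runs one step, appending "c0"
stopped.close()    # terminate the process
```

Calling worker(...) creates a process without running it, next passes control to it, and close terminates it, so a different discipline could be built from the same three operations.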
B.7.g
The exception handling control structure should permit
the user to cause transfer of control and data for
any error or exception condition which might occur
in his program.
It is essential in many applications that there be
no program halts beyond the user's control. The user
must be able to specify the action to be taken on any
exception condition which might occur within his program.
The exception handling mechanism should be parameterized
so data can be passed to the recovery point.
Exception situations might include arithmetic overflow,
exhaustion of available space, and hardware errors.
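A parameterized exception of the kind described might look like the following Python sketch. SpaceExhausted, allocate and allocate_or_shrink are hypothetical names; the exception carries data from the point of failure to the recovery point, where the user's own handler decides what to do.

```python
class SpaceExhausted(Exception):
    """Exception condition that carries data to the recovery point."""

    def __init__(self, requested, available):
        super().__init__(f"requested {requested}, only {available} available")
        self.requested = requested
        self.available = available


def allocate(pool_size, requested):
    if requested > pool_size:
        raise SpaceExhausted(requested, pool_size)
    return [0] * requested


def allocate_or_shrink(pool_size, requested):
    # The user, not the system, decides what happens on exhaustion.
    try:
        return allocate(pool_size, requested)
    except SpaceExhausted as err:
        # Use the data passed with the exception to recover.
        return allocate(pool_size, err.available)
```

The program does not halt when space runs out; control and the relevant sizes are transferred to the handler, which retries with what is available.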
B.7.h
There should be a source language capability for handling
asynchronous hardware interrupts in a recoverable
manner.
One cannot write programs such as operating systems,
executives and monitors which service hardware interrupts
without access to the interrupt system. Minimally there
must be an ability in the source language to specify the
interrupt processing routine, to dynamically determine
what interrupt has occurred, and to return to the
interrupted program. These capabilities can be
provided in a machine independent form, but the set of
available interrupts must be machine dependent. There
should be no source language distinction between true
hardware interrupts and those synthesized by an
operating system, language extensions, or the user
program.
B.8 PARAMETERS
B.8.a
There should be a consistent set of rules applicable to
all parameters, whether they be for procedures, for types,
for exception handling, for parallel processes, for
declarations, or for built-in operators. There should be
no special operations (e.g., array substructuring) applicable
only to parameters.
Uniformity and consistency contribute to ease of learning,
implementing and using a language; allow the user to
concentrate on the programming task instead of the
language; and lead to more readable, understandable,
and predictable programs.
B.8.b
Formal and actual parameters should always agree in type.
The size and subscript range for array parameters
need not be determinable at compile time, but can
themselves be passed as part of the parameter.
Type transfers hidden in procedure calls with incompatible
formal and actual parameters, whether intentional or
accidental, have long been a source of program errors and
difficult-to-maintain programs.
B.8.c
There should be only two classes of formal parameter data:
those which act as constants representing the actual parameter
value at the time of call, and those which rename the
actual parameter which must be a variable. In addition,
there should be a formal parameter class for specifying
the control action when exception conditions occur, and
a class for parameters processed entirely at compile time.
The two data parameter classes are often called call by
value and call by reference, respectively. They are
the only two widely used parameter passing mechanisms
and the many alternatives (at least 9 have been suggested)
add complexity and cost to a language without increasing
the clarity or power. A language with exception handling
capability must have a way to pass control and related
data through procedure call interfaces. Actual exception
handling control parameters should be optional (i.e., only
specified when needed). Compile time parameters are
needed in extensible languages to permit specification
of generic procedures and data structures such as stacks,
and queues without repeating the definition for each
element type.
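The generic stack mentioned above can be sketched in Python with a type parameter. This is an approximation only: Stack and T are hypothetical names, and Python's type parameters are checked by external tools, not by the translator at compile time as the requirement intends.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # stands in for the compile-time type parameter


class Stack(Generic[T]):
    """One definition serves every element type."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


# The same definition specialized to two element types.
int_stack = Stack[int]()
int_stack.push(1)
int_stack.push(2)

name_stack = Stack[str]()
name_stack.push("alpha")
```

The definition of Stack is written once; the element type is supplied where the stack is declared, which is the saving the requirement is after.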
B.8.d
There should be provision for variable numbers of
parameters, but in such cases all but a constant number
of them must be of the same type and probably treated as
an array on the formal parameter side.
There are many useful purposes for procedures with
variable numbers of arguments. These include what are
usually called intrinsic functions such as print,
generalizations of operations which are both commutative
and associative such as max and min, and for repetitive
application of the same binary operation such as the Lisp
list operation. The use of variable number of argument
operations need not and should not cause relaxation of
any compile-time checks, require use of multiple entry
procedures, allow the number of actual parameters to vary
at run-time, nor require special calling mechanisms. If the
parameters which can vary are limited to a program specified
type treated as any other argument on the call side and as
elements of an array within the procedure definition, full
type checking can be done at compile time. There is no
reason to prohibit in line expansion, and there is no
prohibition on writing special procedures for some
fixed number of parameters.
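A small Python sketch of this scheme follows. The function maximum is a hypothetical example: it has one fixed parameter and a varying tail whose elements are all of one declared type and are seen as a single sequence inside the procedure.

```python
def maximum(first: int, *rest: int) -> int:
    """Maximum of one or more integers.

    On the call side each argument is written like any other; inside
    the procedure the varying arguments arrive as one sequence.
    """
    result = first
    for value in rest:
        if value > result:
            result = value
    return result
```

A call such as maximum(1, 5, 2) has a number of arguments that is fixed in the source text, so nothing about the call varies at run time. The fixed first parameter also rules out a call with no arguments.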
B.9 STANDARD EXTENSIONS
B.9.a
All run time overhead in programs should be avoidable.
Language features which require run time support should
be provided as extensions which are brought in only
when used.
Language features such as automatic and dynamic
array allocation, process scheduling, file management,
and I/O processing require run-time support software.
These features should be provided as extensions and
not as part of the kernel language so that the user
can assess the costs and can write his own specialized
extensions for these purposes when the standard
extensions are not compatible with his requirements.
Neither should there be any automatic movement of
programs or data between main store and backing store
unless the user can bring that movement under his
control. In no case should the user have to pay
space or time for support packages he does not use.
B.9.b
The source language should contain standard machine
independent interfaces to machine dependent capabilities,
including peripheral equipment and special hardware.
The convenience, ease of use and savings in production
and maintenance costs resulting from using high order
languages come from being able to use specialized
capabilities without building them from scratch. Thus,
it is essential that high-level capabilities be supplied
with the language.
There is currently little agreement on standard operating
system, I/O, or file system interfaces. This does not
preclude support of one or more forms for the near term.
If these interfaces are supported as standard extensions
and not built into the kernel language, they can be
supplanted as better forms are recognized.
B.9.c
There should be a standard data base interface. It
should be semantically compatible with systems
generated using data base languages, syntactically
consistent with the remainder of the common language,
and provided as an extension.
The use of large data bases and logical files is
essential to many DoD computer applications. Any
selected common language must be capable of interfacing
with data base systems; and, because standards are
limited and there is ongoing research in this area,
the data base interface should be definable as an
extension which can grow at the user level without
inventing a new language.
B.9.d
The language should give access to real time clocks
in a machine independent form. Operations on real
time clocks should include reading the time of day,
waiting for a specified time, and interruption after
a specified time. The same capabilities, operations
and notations available for real time should be
available for virtual time.
Real time capability is essential to many DoD
applications. The source language should provide
a machine independent form of access and operation
to real time clocks. This access should be provided as
an extension to avoid the cost when the capability is
not needed, to keep the kernel language simple and
uncluttered by features for particular applications, and
to ensure that, when necessary, the user can define his own
specialized real time facility. Virtual time is
very helpful in discrete simulation problems and
is conceptually similar to real time capability.
There is no reason why they should not be treated
in a consistent manner.
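The consistent treatment of real and virtual time can be sketched in Python as two clocks offering the same operations. RealClock, VirtualClock and run_for are hypothetical names, and only reading the time and waiting are shown; interruption after a specified time is omitted from this sketch.

```python
import time


class RealClock:
    """Real time: backed by the machine's clock."""

    def now(self):
        return time.time()      # time of day, in seconds

    def wait(self, seconds):
        time.sleep(seconds)     # really waits


class VirtualClock:
    """Virtual time: advances only when the simulation says so."""

    def __init__(self, start=0.0):
        self._now = start

    def now(self):
        return self._now

    def wait(self, seconds):
        self._now += seconds    # no real delay


def run_for(clock, step, count):
    """Written once, usable with either kind of clock."""
    start = clock.now()
    for _ in range(count):
        clock.wait(step)
    return clock.now() - start
```

Because both clocks answer to the same operations and notation, run_for can drive a discrete simulation with a VirtualClock or a real-time task with a RealClock without change.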
B.10 SYNTAX
B.10.a
The source language should be free format, should allow
the use of mnemonically significant identifiers, should
be based on conventional forms, should be simple, uniform
and probably LR(1), should not provide special notations
for rare cases, and should not permit abbreviation of
identifiers or key words.
Clarity and readability of programs should be the
primary criteria for selecting a syntax. Each of
the above points can contribute to program clarity.
The use of free format, mnemonic identifiers and
conventional forms allows the programmer to use
notations which have their familiar meanings, to put
down his ideas and intentions in the order and form
that humans think about them, and to transfer skills
he already has to the solution of the problem at hand.
A simple uniform language reduces the number of cases
which must be dealt with by anyone using the language;
if programs are difficult for the translator to
parse, they will be difficult for people. Similar
things should use the same notations with the special
case processing reserved for the translator and object
machine. The purpose of mnemonic identifiers and
key words is to be informative and increase the distance
between lexical units of programs. The use of abbreviation
eliminates these advantages for a questionable increase
in coding ease.
B.10.b
The user should not be able to modify the source language
syntax. Specifically, he should not be able to modify
operator hierarchies, introduce new precedence rules
or define new key word forms.
If the user can change the syntax of the language then
he can change the basic character and understanding
of the language. The distinction between semantic
extensions and syntactic extensions is similar to that
between being able to coin new words in English or
being able to move to another natural language. Coining
words requires learning those new meanings before they
can be used but at the same time increases the power
of the language for some application area. Changing
the grammar (e.g., using French), however, undermines
the basic understanding of the language itself, changes
the mode of expression, and removes the commonalities
which obtain between various specializations of the
language. Growth of a language through definition of
new data and operations and the introduction of new
words and symbols to identify them is desirable but
there should be no provision for changing the structure
of the language. The language should, of course,
provide sufficiently general forms that they can be
adapted to new, possibly unforeseen situations. Neither
does this preclude associating new meanings with existing
in-fix operators nor defining new in-fix operators
without precedence rules.
B.10.c
The syntax of source language programs should be
composable from a character set suitable for publication
purposes, but no feature of the language should be
inaccessible using the 64 character ASCII subset.
A common language should use notations and a character
set convenient for communicating algorithms, programs,
and programming techniques among its users. On the
other hand, the language should not require special
equipment (e.g., card readers and printers) for
its use. The use of the 64 character ASCII subset will
make the language compatible with the international
standard seven level subset, ISO-7 and with the
Federal information processing standard 64 character
set, FIPS-1, which has been adopted by the U.S.A.
Standard Code for Information Interchange (USASCII).
B.10.d
The language definition should provide the formation
rules for identifiers and literals. These should
include a language defined break character for use
internal to identifiers and literals.
Lexical units of the language should be defined in a
simple, uniform and easily understood manner. The most
desirable break character is the space. A literal
break character contributes to the readability of
programs and makes the entry of long literals less
error prone. With a space as a break character one
can enter multipart identifiers such as REAL TIME CLOCK
or long literals such as 3.14159 26535 84. The
break can also be used to guarantee that missing quote
brackets on character literals do not cause errors which
propagate beyond the next end-of-line. The language
might require separate quoting of each line of a long
literal:
"This is a Long"
"literal string".
B.10.e
There should be no continuation of lexical units
across lines.
Many elementary input errors arise at the end-of-lines.
Programs are input on line-oriented media, but the concept
of end-of-line is foreign to free format text. Most
of the error prone aspects of end-of-line can be
eliminated by prohibiting lexical units to continue
over lines. This has the sometimes undesirable effect
of limiting identifiers and literals to the length
of lines unless spaces and end-of-lines are permitted
to break identifiers and literals into multiple
lexical units.
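The break character within literals and the separate quoting of each line of a long literal, as discussed in B.10.d and B.10.e, can be illustrated with existing Python notation. Python uses the underscore, not the space, as its break character inside numeric literals, so this is an analogy to the requirement and not an instance of it; the names PI_APPROX and MESSAGE are illustrative.

```python
# A break character inside a long numeric literal.
PI_APPROX = 3.14159_26535_84

# Each line of a long character literal is quoted separately, so a
# missing quote cannot swallow text beyond the end of its own line.
MESSAGE = ("This is a long "
           "literal string")
```

No lexical unit continues across a line here: the long string is written as two complete literals which the translator joins.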
B.10.f
Key words should be reserved, should be few in number,
should be informative, and should not be usable in
place of an identifier.
By key words of the language are meant those symbols
and symbol strings which have special meaning in the
syntax of programs. They introduce special syntactic
forms such as are used for control structures and
declarations, or they are used as in-fix operators,
or as some form of parenthesis. Key words should be
reserved, that is unusable as identifiers, to avoid
confusion and ambiguity. Key words should
be few in number because each new key word introduces
another case in the parsing rules and, thereby, adds
to the complexity of the language, and because large
numbers of key words inconvenience and complicate the
programmer's task of choosing informative identifiers.
It is more important that key words be informative
than that they be short but cryptic. A major exception
is the key word introducing a comment; it is the comment
and not its key word which should do the informing.
Comments should begin with a single special character
which will encourage their use and not take the space
needed for the comment. Finally, there should be no
place in a source language program in which a key word
can be used in place of an identifier. That is,
functional form operations and special data items built
into the language or accessible as a standard extension
should not be treated as key words, but should be treated
as any other identifier.
B.10.g
The source language should have a single uniform comment
convention. Comments should be easily distinguishable
from code, should be introduced by a single language
defined character, should permit any combination of
characters to appear, and should be able to appear
anywhere in programs. Comments should not prohibit
automatic reformatting of programs, and should not permit
errors in missing comment brackets to propagate beyond
the next end-of-line.
These are all obvious points which will encourage the use
of comments in programs and avoid their error prone
features in some existing languages. Comments anywhere
in a program should not be taken to mean that they can
appear internal to a lexical unit such as an identifier,
key word, or between the opening and closing brackets
of a character string. One comment convention which
nearly meets these criteria is to have a special quote
character begin each comment, with either the quote or an
end-of-line ending the comment.
B.10.h
The language should not permit unmatched parentheses.
Some programming languages permit closing parentheses
to be omitted. If, for example, a program contained
more BEGINs than ENDs, the translator might insert
enough ENDs at the end of the program to make up
the difference. This makes programs easier to write
because it sometimes saves writing several ENDs at
the end of programs and because it eliminates all
syntax errors for missing ENDs. Failure to require
proper parenthesis matching makes it more difficult
to write correct programs. Good programming practice
requires that matching parentheses be included in
programs whether or not required by the language. Unfortunately,
if they are not required by the language then there can
be no syntax check to discover when errors are made.
The language should require full parenthesis matching.
This does not preclude syntactic features such as
case x of s1, s2 ... sn end case
in which end is paired with a key word other than begin.
C. COMPILE TIME CAPABILITIES
C.a
The library of extensions should be organized as a
collection of specialized compools giving the user
access to all definitions related to a given application
or specialized capability.
Compools have proven very useful in organizing and
controlling shared data structures. A similar
mechanism should be employed to manage and control
access to related library definitions. The contents
of both library extensions and type definitions are
related objects definable in the language. There
is little reason to distinguish these two kinds of
program modules and a language which merges the two
will be simpler, easier to learn and easier to use.
These same modules might also act as parallel and
co-routine templates.
C.b
The translator should provide a variety of useful options
to aid generation, test, documentation and modification
of programs.
The translator should have special capabilities to aid
the programmer. The "best" set of capabilities and
their proper form is not currently known. Since nonstandard
choices of translator options will not adversely affect
software commonality, the language definition should
not dictate any arbitrary choice. Instead the development
of new translator aids should be encouraged within the
constraint of implementing the source language as
defined.
Some of the translator options which have been suggested
and may be useful include the following. Code might be
compiled for assertions which would give run-time
warnings when the value of the assertion predicate is
false. Dimensional analysis might be done on units of
measure specifications. Special optimizations might
be invoked. There might be capability for timing analysis
and gathering run-time statistics. There might be
translator supplied feedback to provide management
visibility regarding progress and conformity with local
conventions. The user might be able to inhibit code
generation. The translator might provide a listing of
the number of instructions generated against corresponding
source inputs and/or an estimate of their execution times.
It might provide a variety of listing options including
cross-reference lists.
C.c
The language should support the integration of separately
written modules into an operational program.
This is required to permit use of extension and subroutine
libraries and for the integration of large system programs.
The user should be able to cause anything in the library
to be inserted into his program.
C.d
The source language should permit the use of conditional
statements (e.g., case statements) dependent on the
object environment. In such cases, the conditional
should be evaluated at compile-time and object code
produced only for the selected path.
This capability permits the writing of procedures
with a standard source language interface, but
different object representations as a function of the
object machine and configuration. With the exception
of permitting program reference to the environment
specification it is just a special case of evaluation
of constant expressions at compile time.
D. OBJECT REPRESENTATION
D.a
The translator should not impose run-time cost for
unused generality. A primary goal of any translator
should be the generation of efficient object code.
The source language, both kernel and extensions, will
contain capabilities which are not needed by everyone
or, at least, not by everyone all the time. When a
program does not use a feature or capability, that
program should pay no penalty for the capability being
in the language. That the penalties can be avoided for
library extension capabilities is obvious, since they
need not be brought in at all. Other features may
generate special object codes when their full generality
is not required. Parameter passing for single
arguments might, for example, be implemented much less
expensively at run time than is the general case.
D.b
The user should be able to specify that a particular
call on a procedure is to be implemented as an open
routine.
The use of inline open procedures can reduce the run-time
execution costs significantly in some cases. There
are the obvious advantages in eliminating the parameter
passing, in avoiding the saving of return marks, and in
not having to pass to and from the routine. A less
obvious, but often more important, advantage in saving
run-time costs is the ability to execute constant
portions of routines at compile time and thereby
eliminate time and space for those portions of the
procedure body at run time. Open routine capability
is especially important for machine language insertion.
D.c
Any optimizations performed by the translator should not
change the effect of the program.
More simply, the translator cannot give up program
reliability and correctness, regardless of the excuse.
It should be noted that for most programming languages
there are few known safe optimizations and many unsafe
ones. The number of applicable safe optimizations
can be increased by making more information available
to the compiler and by choosing language constructs
which allow safe optimizations. This requirement allows
optimization by code motion provided that the motion does
not change the effect of the program.
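The distinction between safe and unsafe code motion can be
sketched as follows. This is an illustrative example in
Python, not a description of any actual translator; the
three function names are hypothetical and the input is
assumed to be a list. Moving a loop-invariant division out
of the loop is unsafe if the division then executes in cases
where the original program would never have reached it.

```python
def original(values, divisor):
    # The quotient is computed inside the loop, so an empty list
    # never divides and never raises.
    total = 0
    for v in values:
        total += v * (100 // divisor)
    return total


def unsafe_hoist(values, divisor):
    # Unsafe motion: the division now runs even when the loop body
    # would never have executed.
    scale = 100 // divisor
    total = 0
    for v in values:
        total += v * scale
    return total


def safe_hoist(values, divisor):
    # Safe motion: the invariant is computed once, but only when the
    # loop body is certain to execute at least once.
    total = 0
    if values:
        scale = 100 // divisor
        for v in values:
            total += v * scale
    return total
```

With an empty list and a zero divisor, original and
safe_hoist both return zero, while unsafe_hoist fails with a
division error. Only the guarded form preserves the effect
of the program.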
E. THE TRANSLATOR
E.a
No implementation of the language should contain source
language features which are not defined in the "standard"
language.
This guarantees that use of programs and software subsystems
will not be restricted to a particular site by virtue
of using that site's unique version of the language. It also
represents a commitment to freezing the source
language, inhibiting innovations and growth of the form
of the source language, and confining the source language
to the current state of the art in return for stability,
wider applicability of software tools, reusable software,
greater software visibility, and increased payoff for
tool-building efforts.
E.b
Every translator for the language should implement the
entire language. There should be no subset implementations.
If individual compilers implement only a subset
of the language, then there is no chance for software
commonality. If a translator does not implement the
entire language, it cannot give its users access
to standard supported libraries or to application programs
implemented on some other translator. Requiring that the
full language be implemented will be expensive only
if the language is large, complex, and nonuniform. The
intended source language product from this effort is a
small simple uniform kernel language with the specialized
features, support packages and complex features relegated
to library routines not requiring direct translator
support. If simple low cost translators are not feasible
for the selected language, then the language is too
large and complex to be standardized and the goal of
language commonality will not be achievable. The effort
should be terminated.
E.c
The translator should not impose compile-time costs
for unused generality. A primary goal of any translator
should be low cost translation.
The user should have control over the level of optimization
applied to his programs. He should have control over the
costs and benefits he obtains from the translator.
Optimization is unimportant to some programs and only
sometimes important in the development of any program.
E.d
Translators should be able to produce code for a variety
of object machines. The machine independent parts of
translators should be built independently of the code generators.
There is currently no common widely used computer in the
DoD. There are at least 250 different models of
commercial machines in use in DoD with many more home
grown varieties. A common language must be applicable
to a wide variety of models and sizes of machines.
Translators should be written so that they can produce
object code for several machines. This reduces the
proliferation of translators and makes the full power
of an existing translator available at the cost of producing
an additional code generator.
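One way to picture this separation is the sketch below,
given in Python purely for illustration. The two object
machines, their instruction mnemonics, and every function
name are hypothetical. A single machine-independent parser
feeds interchangeable code generators, so supporting a new
machine means adding one entry to a table.

```python
def parse(source):
    """Machine-independent part: turn 'a + b + c' into a list of
    operand names. Only '+' is supported in this sketch."""
    return [term.strip() for term in source.split("+")]


def gen_stack_machine(operands):
    # Code generator for a hypothetical stack machine.
    code = [f"PUSH {operands[0]}"]
    for name in operands[1:]:
        code += [f"PUSH {name}", "ADD"]
    return code


def gen_accumulator_machine(operands):
    # Code generator for a hypothetical single-accumulator machine.
    code = [f"LOAD {operands[0]}"]
    for name in operands[1:]:
        code.append(f"ADD {name}")
    return code


# Adding an object machine means adding one entry here; parse() is untouched.
CODE_GENERATORS = {
    "stack": gen_stack_machine,
    "accumulator": gen_accumulator_machine,
}


def translate(source, target):
    return CODE_GENERATORS[target](parse(source))
```

For example, translate("a + b", "stack") yields PUSH a,
PUSH b, ADD, while the same source for the "accumulator"
target yields LOAD a, ADD b.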
E.e
The translator need not be able to run on all the object
machines. Self-hosting is not required.
This follows from having an operational environment which
includes many small machines which are unable to support
the design, documentation, test, and debugging aids
necessary for the development of timely, reliable or
efficient software. It also follows from the need to
avoid penalizing large machine users for the restrictions
of small machines when a common language is used. It
is desirable that the translator be able to run on a
variety of machines, but this should not be used as an
excuse to eliminate needed source language capabilities.
E.f
The translator should do full syntax checking, should
check all operations and parameters for type compatibility
and should verify that any other semantic restrictions
on the source language are met.
The purpose of source language redundancy and avoidance
of error prone language features is security. The price
is paid in programmer inconvenience in having to specify
his intent in greater detail. The payoff comes when
the translator checks that the source language is
internally consistent and adheres to its authors' stated
intentions. There is a clear trade-off between security
and programming ease; surveys conducted in the services
show that programmers as well as managers will opt for
security over ease when given the choice. The same
choice is dictated by the need for well documented modifiable
software.
E.g
The translator should produce compile time explanatory
diagnostic messages. These should include error messages
and warnings.
The translator should attempt to provide the maximal
useful feedback to the user. Diagnostic messages should
not be coded but should be explanatory and in source
language terms. Translators should continue checking
after one error has been found but should be careful not
to generate erroneous messages because of translator
confusion. Warnings should be generated when a source
language construct is exceptionally expensive or impossible
to implement on the specified object machine. The set of
diagnostic messages should be determined by the translator
as a function of its environment and translation method
and is not specified in the language definition, although
the language definition might provide guidelines.
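The behavior described above might be sketched as follows.
This is an illustration in Python under simplifying
assumptions: the program is a list of assignments, the
function name check and the message wording are
hypothetical, and only two kinds of error are detected. The
checker continues after the first error, reports in source
language terms, and suppresses follow-on messages that would
arise only from its own confusion over a name it has already
diagnosed.

```python
def check(declarations, statements):
    """declarations: dict mapping name -> type name.
    statements: list of (line_number, target, source) assignments.
    Returns a list of explanatory messages in source terms."""
    messages = []
    reported = set()  # undeclared names that were already diagnosed
    for line, target, source in statements:
        undeclared = [n for n in (target, source) if n not in declarations]
        for name in undeclared:
            if name not in reported:
                reported.add(name)
                messages.append(f"line {line}: error: '{name}' is not declared")
        if undeclared:
            # Types are unknown here, so a type check would only
            # produce a spurious message; keep checking later lines.
            continue
        if declarations[target] != declarations[source]:
            messages.append(
                f"line {line}: error: cannot assign {declarations[source]} "
                f"'{source}' to {declarations[target]} '{target}'"
            )
    return messages
```

An undeclared name is reported once, at its first use, and
later statements that mention it are skipped silently rather
than generating further messages.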
E.h
The translator should be amenable to change.
The adoption of a common language should be a commitment
to the current state of the art for programming language
design for some duration. It should not, however, prevent
access to new software and hardware technology, new
techniques and new management strategies which will not
impact source language design. In particular, innovation
should be encouraged in the development of translators
for a common language, provided they implement exactly
the source language as defined. Translators, like all
computer programs, should be written in expectation of
change; they should be well documented and easily
modified.
E.i
It is desirable that translators for a common language
be written in their own source language.
The existence of at least one such translator assures
that the language is rich enough to perform a useful
and demanding programming task. If the language is
well defined and uniform in structure, a self description
will contribute to user understanding of the language.
The existence of translators written in their own
source language also makes the compiler automatically
available on any of their object machines (assuming
sufficient hardware resources are available).
F. DEBUGGING FACILITIES
There should be effective debugging facilities associated
with the language. The particular function or form of
these facilities should not be dictated.
Software tools and aids are needed for the effective use
of any programming language. Particularly important are
debugging aids. There are, however, no recognized standards
for debugging systems, nor should there be. Although
debugging facilities must be available or made available
for any selected language, neither the language definition,
the selection process, nor a standards group should dictate
the particular debugging facilities or their form.
Whatever facilities are built should, however, be widely
available. This allows research and development to
continue and does not impede transfer of new debugging
technology.
Some debugging facilities suggested to this effort have
been post mortem analysis including frequency information
for statements obeyed and reports of abnormal termination
in source language terms. Controlled snapshots and
some form of tracing might be useful. There might
be facilities for breakpoints, binary dumps with
restart, traceback from errors, diagnostics in source
language terms, interactive debugging, and filtered
debugging data.
G. LANGUAGE DEFINITION, STANDARDS AND CONTROL
G.a
The semantics of the language should be defined unambiguously
and clearly. To the extent a formal definition assists
in attaining these objectives, the language's semantics
should be specified formally.
A complete and unambiguous definition of a common language is
essential. Otherwise each translator will resolve the
ambiguities and fill in the gaps in its own unique way.
There are currently a variety of methods for formal
specification of programming language semantics, but it
remains a major effort to produce a rigorous formal
description, and the resulting products are of questionable
practical value. The real value in attempting a formal
definition is that it uncovers the incomplete and
ambiguous specifications. An attempt should be made to
provide a formal definition of any language selected but
success in that effort should not be requisite to its
selection.
G.b
The user documentation of the language should be complete
and tutorial in nature. The source language syntax should
be given in BNF or some other easily understood formal
metalanguage with the corresponding semantics given in
English with examples.
The language should be intuitively correct and easily
learned and understood by its potential users. A
successful example of a language description of this
type is the Algol-60 report.
G.c
There should be a control agent to ensure that there is
only one version of the source language and that
implementations of the language conform to that standard.
Without controls, a language intended to be common will
become another umbrella under which new languages will
proliferate while retaining the same name.
G.d
There should be identifiable support agent(s) responsible
for maintaining the translators, the design, development,
debugging and maintenance aids, and the support and
application libraries for the common language.
Language commonality is an essential step in achieving
software commonality, but the real benefits accrue when
projects and contractors can draw on existing software
with assurance that it will be supported, when systems
can build from off the shelf components or at least with
common goals, and when funds can be spent to expand
existing capabilities rather than building from scratch.
Support of common widely used tools and aids must be
provided independent of projects and their individual
funding if common software is to be widely used.
G.e
Library extension facilities should be given the same
kind of control and support as the kernel language.
In any given application of an extensible language
three levels of the system must be learned and used:
the kernel language, the standard extensions used
in that application area, and the local application
programs. The project must be responsible for the
local application programs and local extensions,
but not for the language and its standard extensions
which are used by many projects and sites. If the
local project or site is made responsible for these, it
will support only its own project- and site-unique
language and extensions, and there will be no common
extensions.