Intermetrics, Inc. - RED Rationale - March 1979 - Advanced and Low-Level Facilities

.	Navigate

8. Overloading & Generic Definitions

10. High-Level Input-Output

ADVANCED &
LOW-LEVEL FACILITIES

User-Defined Assignment     Need for It     Interactions     Aggregate Assignment     Copy Semantics
Binding Classes     Exporting Assignment     Optimization     User-Defined Init and Finalize     User-Defined Selection
Machine-Dependent Facilities     Spec of Physical Representations     Bit as Storage Unit
Ref Parameter Restrictions     Built-In Restrictions     Separation of Logical & Physical     Dope Vectors
Low-Level Multitasking     Low-Level Input/Output

9.   ADVANCED AND LOW-LEVEL FACILITIES

9.1  USER-DEFINED ASSIGNMENT

9.1.1  THE NEED FOR USER-DEFINED ASSIGNMENT

In the RED language, an assignment routine is automatically provided for each new data type, in agreement with Steelman 5F. However, there are two situations which must be addressed:

when the new type should in fact not have assignment available, and

when the new type should have an assignment routine with different semantics from the one automatically provided.

A "non-assignable" type (i.e., a type for which assignment is illegal) is easily obtained in RED, either by defining an assignment routine which raises an exception, or by not exporting the (automatically provided) assignment routine from the capsule in which the type is defined. In the first case, assignment would raise the exception even when used in the defining capsule. In the second, assignment would be allowed inside the capsule but disallowed outside.
RED allows the user to specify an assignment routine for any new type for which the automatically provided assignment semantics would be inappropriate. Such a facility, neither required nor prohibited by Steelman¹, was included to support data abstraction. The generality provided by user-defined assignment may be ignored by those programmers who do not require it; there is no run-time cost incurred when the facility is not used.
The desirability of allowing user-defined assignment can be seen by examining some typical applications. Consider a user who defines a data type implementing queues by a capsule which exports appropriate type and routine declarations. It is highly desirable, in the interest of maintainability, for the uses of the queue routines outside the capsule to be independent of the underlying type of the queue type. In particular, whether queue is defined as a direct type or as an indirect type should not be ascertainable outside the capsule, since if it is necessary to change from one to the other during program maintenance the users of the capsule should not be affected. However, there is an obvious semantic difference between the built-in assignment for indirect and (say) array types. Since the former implies sharing and the latter does not, the effect of

depends on whether queue is defined as a direct or as an indirect type. This problem can be avoided if the user defines an assignment routine for queue which performs an element-by-element copy instead of a "pointer" copy. As another example, consider the problem of defining "varying strings" in the language. A varying string is declared with a value specified for its maximum length; during its lifetime it may hold string values whose lengths may be between 0 and this maximum length, inclusive. For example,

It is certainly desirable to allow the assignment of one vstring to another, provided that the cur_length of the source does not exceed the max_length of the destination. However, the built-in record assignment for vstring would fail for two reasons:

If the vstrings had different max_lengths, the string assignment of the data fields would be illegal, and

If the vstrings had the same max_length but the source vstring's cur_length was less than its max_length, then the X_INIT exception could be raised when the copy of the data field was attempted.

with user-defined assignment, these problems are avoided:

9.1.2  INTERACTIONS WITH OTHER FEATURES

Aggregate Assignment

If the user defines assignment for type T, and T is used as a component in an aggregate (e.g., an array, record, or union), then the assignment automatically provided for the aggregate type will invoke the user-defined assignment for T. For example, if the user declares:

then the assignment

is equivalent to

Analogously, record and union assignment will invoke user—defined assignment for any component for which such an operation has been provided. Thus, with the following declarations

the assignment

is conceptually equivalent to

Features with "Copy" Semantics

A number of features in the language have "copy" semantics: initialization in data declarations; CONST and OUT binding in formal parameter specifications; READ and WRITE input-output procedures; function result returns; message passing via MAILBOXes. In each case, should the copy invoke the automatically provided (i.e., "bit copy") assignment, or, if one is defined for the particular type in question, the user-defined assignment? If the type is "non-assignable", should the copy be allowed at all?
In RED, all features with copy semantics are treated consistently with respect to these questions. The copy always invokes the user-defined assignment (if one is defined), and is illegal if the type is non-assignable. (Thus CONST parameter passage is illegal for non-assignable types.) The reasons for adopting this approach are that

the use of "bit-copy" would allow the detection of whether the underlying type were direct or indirect, and

the behavior expected from the copy is in fact the abstract (i.e., user-defined) assignment.

For example, if the user defines a procedure with a CONST parameter declared as a vstring(20), then it would be counterintuitive to have bit copy assignment when an actual parameter is passed.

Binding Classes on Formal Parameters to User-Defined Assignment

The ability to define the assignment routine for user-defined types, together with the fact that CONST and OUT binding invoke the user-defined assignment, imply that these two binding classes should not be used for the formal parameters to user-defined assignment routines. Instead, the VAR binding should be used for the parameter corresponding to the assignment target, and the READONLY binding for the assignment source. Further discussion of this point may be found in Section 5.1, where the reasons for providing the READONLY binding class are given.

Exporting Assignment from Capsules

The issue is the following: if a type T is exported from a capsule, must asssignment for T (whether system- or user-defined) be also explicitly exported, or is it exported automatically with T? RED requires explicit export of both T and its assignment operator, in the interest of avoiding unnecessary defaults. Allowing assignment to be exported automatically with T would be potentially dangerous in case T was intended as a "non-assignable" type.

9.1.3  OPTIMIZATION ISSUES

As mentioned earlier, user-defined assignment, if not used, will not cause run-time overhead. Moreover, an attempt was made in the RED design to facilitate optimization even when user-defined assignment is used. This is not an empty issue, since, as an arbitrary procedure, a user-defined assignment routine may have extremely perverse behavior and, if "worst case" assumptions are made, very few optimizations could be carried out across a user-defined assignment.
The approach taken is to impose a weak restriction on user-defined assignment and to allow the translator to make certain assumptions about the behavior of user-defined assignment which facilitate optimization.² The restriction on a user- defined assignment routine is that the two formal parameters must be of the same type. The assumption which the translator may make is that the value of the target after the assignment is the same (via the user-defined equality operation, if one has been defined, or the automatically-provided equality operation otherwise) as the value of the source before the assignment. 1 This is directly analogous to the approach to side effects in functions (see Section 5.2). The restrictions in this latter case prohibit many (but not all) side effects, and Steelman HC allows the translator to optimize expressions via reordering operands independent of whether side effects are reordered as a result.

9.2  USER-DEFINED INITIALIZATION AND FINALIZATION

For any data item there is a set of actions performed when the item is created and a complementary set of actions carried out when the data item ceases to exist. (A data item is considered to be created by CONST and VAR declarations, CONST and OUT formal parameter declarations, function returns, and the allocation statement. A declared data item ceases to exist when an exit occurs from a scope containing the declaration; a dynamic variable ceases to exist when any pointer referencing it is FREEd, or when no pointers reference it.)
The set of actions performed during data creation is called initialization; finalization is said to occur when the data item ceases to exist. In most languages, the user has no control over initialization and finalization semantics. In many cases, however -- especially in connection with the definition of abstract data types -- such control is necessary. For example, a user who defines an abstract type in terms of an underlying indirect type may want declarations to initialize variables to point to newly allocated dynamic variables, and scope exit to induce a "cleanup" of dynamic data whose pointers are the values of local pointer variables. This is especially true in environments where the system has no automatic storage reclamation -- typical of embedded systems. Another example of the utility of user-defined initialization and finalization routines is for metering the use of data. An initialization procedure could keep track of the number of data items created with given subtype constraints -- e.g., it could produce statistics on the frequency of use of strings of different lengths. For the above reasons, and also to comply with the Steelman requirement for resource control (SM 8E), RED allows the user to define initialization and finalization routines.
An example of such user-defined routines appears in the Language Reference Manual, Section 13.3.

9.3  USER-DEFINED SELECTION OPERATIONS

Steelman (SM 3-3F) requires that the user have the ability "to declare constants and (unary) functions that may be thought of as record components and may be referenced using the same notation as for accessing record components." To meet this requirement, and also to satisfy the more global objective of supporting data abstraction, RED allows the user to define selector operations. For uniformity, this facility applies not just to "dot qualification" but to subscripting and subarray selection. For efficiency, user-defined selectors are in fact functions which return (constant) values; thus their invocations may not be used in contexts where a variable is required.
Examples of user-defined selectors may be found in Section 3.6.2 of this document, and in Section 13.4 of the Language Reference Manual.

9.4  MACHINE-DEPENDENT FACILITIES

One of the major objectives of the RED language is to eliminate much of the need for assembly language programming in embedded systems. To meet this goal, RED provides a variety of facilities with which the user can perform low-level and machine-dependent operations. These include features for defining physical representations for data items, aligning data, storing data at specified machine addresses, and defining handlers for asynchronous interrupts.

9.4.1  SPECIFICATION OF PHYSICAL REPRESENTATIONS

It is sometimes necessary to process data which have been externally generated (i.e., whose representations have not been determined by the RED translator). To accomplish this, the programmer must be able to specify both a high-level description of the data (i.e., their types), so that the appropriate operations can be performed, and a low-level definition of the data (i.e., their representations), so that their values can be correctly retrieved and stored.
An extreme position would be to allow the user to associate multiple representations with any type; the translator could generate code to perform the necessary conversions. This level of generality is not necessary, however, and the implicit inefficiencies which would result are undesirable. In fact, one of the changes from Ironman to Steelman was the removal of the requirement that the language allow the definition of multiple physical representations per type.
The RED language complies with the Steelman requirement (SM 11A) for data representation. The user can specify a single physical representation for a new type; the interpretation is that each data item of the type will have that representation. As an example, consider the type declaration:

The declaration comprises the logical properties of t (i.e., the underlying type) and the physical properties (size, alignment, and component representation). lf x is declared to be of type t, then x will occupy one halfword, be word-aligned, and have constituents x.a (3 bits, from bit 0 through 2), x.b (2 bits, bits 3 and M), and x.c (N bits, bits 12 through 15). Bits 5 through 11 are unused.

Choice of Bit as Storage Unit

Size, alignment, and offset are specified in terms of bits, and the language defines the bit numbering to start with bit 0 in the leftmost (i.e., high-order) position. An alternative is to assume the existence of implementation-dependent storage units (e.g., bytes, halfwords, etc.) for use in representation specifications and to adopt the bit numbering of the target machine. This alternative was rejected for several reasons.

Since the implementation can define manifest configuration constants reflecting the number of bits per byte, bits per halfword, etc., programs can make use of these constants in representation specifications. In the example above, the size of each t data item is one halfword (HALFWORDS is assumed to be the number of bits per halfword). Thus the implementation dependency occurs where it is appropriate -- in the value of the configuration constant, rather than in the language.

Use of the implementation-defined bit numbering would make programs harder to read and transport. Although transportability of machine-dependent programs might not seem to be of high priority, there are occasions where this is useful, and the language should not adopt implementation dependencies which defeat this purpose.

Restrictions on Reference Parameter Passing

The components of a data item whose type has a representation specification will generally have non-standard representations. For example, the translator may conventionally store INT data as full words, whereas the declaration of type t above requires an INT (its "a" field) stored in the leftmost three bits of a word, with other data in the rest of the storage unit. There is no problem when a value is copied into or out of this component (e.g., when the component is passed as an actual CONST or OUT parameter or is used as the source or destination of the built-in INT assignment operation). In such a case, the translator can generate conversion code which "unpacks" the component into a (standard) INT, or which "packs" an INT value into the field. However, if the component were permitted to be passed by reference -- either READONLY or VAR -- substantial inefficiency would result, since the code accessing the value of the formal parameter would have to work for non-standard representations such as fields which overlap storage unit boundaries. As a consequence of these considerations, it is not permitted to pass READONLY or VAR the components of` a data item whose type has a representation specification. Note that this restriction applies to ".ALL" components, too. For example, if x is declared to have type t, then x.ALL has type

but the actual representation for x.ALL will not be the standard one for this type.

Restrictions on Built-In Types

In order for a new type to be declared with a representation specification, its underlying type must comprise only built-in types and type generators. The reason for this restriction is that no assumptions may be made about the representation for data of a user-defined type; i.e., one of the main reasons to define a new type is A to allow for changes in the underlying type without affecting users of the new type. It is thus not appropriate to apply a representational item to a component having a user-defined type.
A related issue concerns enumeration types. Since enumeration literals are not automatically inherited by new types defined in terms of enumeration types (see Sections 3.3.4 and 3.5.3), allowing the specification of "collating sequences" for enumeration literals as part of the representation specification for a new type would not be too useful. Instead, the user can specify that a standard, representation be given for the literals in an enumeration type; viz., 0 for the least element, 1 for the next, etc. This approach avoids certain anomalies which can arise if the user were to supply a representation with "holes" in it; e.g., the implementation of arrays indexed by such enumeration types.

Separation of Logical and Physical Properties

The representation specification for a new type is disjoint from its logical properties (i.e., the specification of the underlying type), in compliance with SM 11A. It is, however, part of the type declaration form. We considered an alternative approach with distinct declarations but rejected it for several reasons:

A type declaration can be parameterized, and the parameters may be used in the representation specification. It would be semantically anomalous for the invocation of a type to result in an invocation of a completely separate entity (the representation). Since the two are invoked together with one syntactic form, they should be declared with a single syntactic form, with enough separation in the syntax to allow the reader to distinguish logical from physical properties.

Adding a representation specification to a type which previously lacked one can have a semantic effect (e.g., previously valid VAR or READONLY parameter passing may become illegal). Removing or changing a representation specification can likewise have semantic impact (e.g., when fields overlap). As a result, it is appropriate for the representation specification to be included as part of the type declaration, where the semantics of the type are established.

"Dope Vector" Issues

It is often necessary to have arrays in machine-dependent data structures, and it is important that the user be able to control the storage layout for them. In particular, the user should be able to determine whether bounds information will be stored with the array data. (The same issues arise in connection with other subtypes -- e.g., range bounds for INTs, precision for FLOATs. We will restrict our attention to the array case with no loss of generality.)
Two situations are expected to arise most frequently in practice: when the bounds are manifest, or when one or both is an attribute associated with the type. An example of the first case appeared at the beginning of this section: the "c" field of t is an ARRAY INT(1..4) OF BOOL. The translator will not store "dope vector" information at run time, since it can be maintained at translation time. If the "c" field is used as an actual parameter (only CONST and OUT are possible, since reference passage of components is prohibited), and the formal parameter is represented with a run-time dope vector, then the appropriate representational conversion can be performed at run time on copy-in or copy-out. Thus the ARRAY INT(1..4) OF BOOL field will in fact fit in A bits, and bounds checking is still performed on subscripting.
A less frequent but equally important case arises when an attribute associated with the type is used to specify bounds information. As an example, suppose that a machine-dependent record is desired with the following format:

the first byte gives the length of the record (in bytes),

the second byte is an INT(0..255),

each remaining byte contains a BOOL in the leftmost bit and an ASCII character in the remaining 7 bits.

The following type declaration specifies the desired representation:

Since the NOBOUNDS specification is given for the array component, subscript range checking is not done. E.g., if x is declared to be of type mrec, then before selecting x.b(i) the user should check that i is less than or equal to x.n-2. Further examples of machine-dependent representation specifications appear in Section 12.2 of the Language Reference Manual and in Appendix B of this document.

9.5  LOW-LEVEL MULTITASKING

There are two motivations for the RED multitasking design:

The need to provide simple, safe, and efficient high-level mechanisms for standard applications, and

The paramount importance of permitting flexible and efficient implementation of a wide variety of system design paradigms.

The first of these is addressed by the built-in high-level facilities. The second is addressed by providing means by which new high-level facilities may be defined. The philosophy is to make available the low-level primitives that are used in defining the high-level facilities. These primitives can then be used to define alternative high-level facilities when needed. Alternative methods of sending and receiving messages, delaying activations, controlling access to shared data, and scheduling activations can all be defined. In each case a capsule will be defined which provides one or more new types and associated operations.
The rationale for this philosophy is simply that not enough is known about multitasking to enforce a single system design paradigm on all parallel systems. On the contrary, many legitimate paradigms exist, and each of these is better than the others for certain applications. Furthermore, in any large project, there will be system designers who are better able to assess the needs of that system than can the language designers. It is these system designers whom we expect to use the low-level multitasking facilities to define mechanisms that will be used by the application programmers. To this end, we have carefully designed the low-level facilities of RED to smoothly integrate with the high-level facilities; the application programmers will not need to know whether they are invoking built-in or application-specific mechanisms.
The integration of high- and low-level facilities is achieved in three steps:

The definition of the high-level multitasking statements (CREATE, WAIT, REGION) in terms of semantically equivalent low-level operations,

The definition of the low-level operations on the built-in types, and

The addition of a small number of additional low-level operations (e.g., CRITICAL) and types (e.g., LATCH).

For example, Section 10.5 of the Language Reference Manual defines the semantics of the WAIT statement. The definition is given as a semantically equivalent expansion into a sequence of RED statements and routine invocations. The routines invoked in this expansion are low-level routines that operate on data of MAILBOX type. To define alternative methods of sending and receiving messages, and of delaying execution (e.g., for some length of simulated time), one only needs to define new types. (These types will normally be defined as abstract data types (and operations) within a capsule.) The defining capsule must provide the operations used in the elaboration of a WAIT statement. This is done by defining a subtype and four additional routines for each operation intended to be so used. The subtype and the four operations are used in the WAIT statement when the associated waiting operation appears in a clause. In the definition of the waiting operation, various low-level primitives such as SYNC_SIGNAL will be used. The example in Section 14.3.1 of the Reference Manual illustrates how mailboxes can be implemented.
Alternative methods of locking can be obtained through the definition of a new type with operations named LOCK and UNLOCK. These operations will be invoked during elaboration of the (expansion of) the REGION statement. The example in Section 14.4.1 of the Reference Manual shows how monitors can be defined in RED.
Users can define new schedulers by providing a new activation type with operations for starting, ending and synchronizing activations. Sometimes, a task definition for the scheduler of activations of the new type must also be provided. The example in Section 14.5.4 of the Reference Manual shows how a round-robin scheduler can be defined.
In addition to the primitives used in defining alternatives to high-level multitasking facilities, RED also provides some low-level features intended to be used in special circumstances. For example, interrupts can be handled by associating a particular interrupt with a mailbox. Latches (low-level alternatives to datalocks) can be used when access to hardware-provided mutual-exclusion primitives (e.g., test-and-set) is needed. The CRITICAL and NONCRITICAL operations can be used to ensure that preemption does not occur for (brief) periods.
All of the above facilities can, of course, be misused -- just as any well- intentioned feature of a language can be misused. However, the way in which the low-level facilities of RED have been designed, and, in particular, the way in which they may be used to augment the built-in high-levelfacilities, should improve program clarity, correctness, and maintainability. The alternative approach, that of providing a single, rigid multitasking facility, will lead to contorted program structures in those cases where the facility is inappropriate to the most natural application-induced program structure. Given the present state of knowledge about parallel programs, there are likely to be a large number of these applications.
The RED multitasking philosophy is precisely the same philosophy that leads to the notion of data abstraction. Since it is not feasible for a language to provide all of the data types appropriate to all applications, we provide

A basic set of types that are frequently used, and

A mechanism for defining new types, for associating operations with them, for hiding the details of their implementation, and for creating and manipulating these types in the same high-level manner as is used for built- in types.

This is precisely what RED has done for multitasking, and for the same reasons.

9.6  LOW-LEVEL INPUT/OUTPUT

Steelman 8A requires that all high-level I/O be definable within the language. RED already has mechanisms for specifying the exact representation and placement of data, and for handling interrupts. Thus, the only remaining requirement is the ability to execute any of the primitive I/O instructions defined by the hardware. This ability is provided by appropriate overloadings of the procedure
      LOW_IO(device, command, data);
Typically, the device argument would be an integer channel number or bus address. The command argument could be an enumeration element designating the I/O instruction to be executed. The data argument might be a value to be transmitted, or a buffer to be filled.

***********
1 The prohibition against implicit type conversion (Steelman 3B) implies that both parameters to the assignment routine must have the same type. This restriction is imposed by RED on user-defined assignment as well as being true for the built-in types.
2 This is directly analogous to the approach to side effects in functions (see Section 5.2). The restrictions in this latter case prohibit many (but not all) side effects, and Steelman 4C allows the translator to optimize expressions via reordering operands independent of whether side effects are reordered as a result.

8. Overloading & Generic Definitions

10. High-Level Input-Output

Overview
Requirements
      Strawman
      Woodenman
      Tinman
      Ironman
      Steelman
RED Reference
RED Rationale
Types in RED
Time/Life Computer Languages
Memories
Site Index

Overview Reference ToC Rationale ToC Site Index