(This page is incomplete but is no longer maintained. Its primary value is probably the Appendix describing the PALMAT intermediate language.)

Table of Contents

Introduction

It is not feasible at the present time to run the original Shuttle-era version of the compiler for the HAL/S language to compile Space Shuttle flight software.  Refer to our Space Shuttle flight-software page for more context.

That original HAL/S compiler was known as HAL/S-FC.  Even if it were possible to run that original compiler, it would in any case produce only object code which could be run on an IBM System/360 mainframe, or else on the IBM AP-101S General Purpose Computers (GPC) of the Shuttle itself ... neither of which (with certain provisos insofar as System/360 is concerned) can be run on "modern" computers likely to be available or useful to most folks interested in this Virtual AGC Project website.

Therefore, for the Shuttle flight software to be generally accessible, we instead need modern versions of several HAL/S-related development tools, including at minimum:

The ideal HAL/S emulator and executable-code format would be such that the emulator could be easily integrated with spaceflight-simulation systems such as Orbiter or FlightGear, and would execute the HAL/S software rapidly and efficiently, supporting the entire HAL/S language and its entire feature set.

The "ideal" just mentioned is quite difficult to achieve in one giant leap forward, and so I intend to achieve it in two stages, which I refer to as "preliminary" and "advanced".  Potentially even-more-advanced stages may follow those, and I suppose there may be potential intermediate stages as well, but I don't care to speculate about them at the present time.  I expect to implement the entire preliminary stage entirely by myself, but I also expect that help (any volunteers?) will be needed to reach the advanced stage.

Sections below are devoted to a summary of the roadmap, and to the (in-progress, partially-implemented) preliminary version of the compiler.

Roadmap

In brief, the "modern" HAL/S compiler is structured as follows:

  1. A preprocessor.
  2. A combined tokenizer+parser.
  3. A code generator.

Because modules of HAL/S code may be separately compiled, thus producing multiple output object-code files, an additional linker program is potentially needed to combine the object-code files into a single executable file.

To produce efficient code, it is likely that an additional optimizer program will be needed for processing the object-code files and/or executable files, eliminating redundant or useless instructions, replacing code sequences with more-efficient ones, and so on.

Finally, unless the executable file is itself natively executable on the target computer system, it will necessarily be in some format that can be executed only by an emulator program, and thus such an emulator program is needed as well.

Of the software components mentioned above, the "preliminary" stage of the development tools includes:

I have no firm schedule for completing "preliminary" development.  The preliminary version is partially usable now (February 2023), with a restricted HAL/S feature set.  I certainly hope it will be complete prior to the end of 2023.  But it isn't required for the preliminary version to be 100% complete before commencing the "advanced" development.

Of the preliminary components mentioned above, most but not all of them can be suitably retained in the "advanced" stage.  The problem begins with the PALMAT intermediate language, because it does not employ an encoding conducive to efficient storage of the object files, and the problems continue with the PALMAT emulator, because in addition to the inefficiency of the PALMAT format itself, the emulator is written in Python and cannot be adequately improved in terms of execution speed.

Therefore, once the preliminary stage of development is completed, the preprocessor, parser, and PALMAT code generator can continue to be used but the following advanced development, at minimum, needs to take place as well:

  1. An efficient object language must be designed, perhaps just a better way of encoding PALMAT numerically rather than its current JSON representation.  For the sake of discussion, I'll call this efficient representation "GALMAT", but in practice it may not have that name.  For example, one could develop a PALMAT-to-C translator, and then GALMAT would simply be C.  I argue neither for nor against that, but it would have some obvious advantages and some unobvious disadvantages.
  2. Software for translating PALMAT to GALMAT must be created.
  3. Optimization software for PALMAT and/or GALMAT object code must be created.
  4. A linker for GALMAT code must be created.
  5. An emulator for GALMAT files must be created, preferably written in C for integration purposes.

No interactive interpreter loop is needed in the advanced stage.

Preliminary Version of the Compiler

By the way, it's helpful to understand Python (or JavaScript) datatypes somewhat, in order to understand the description that follows.  At the very least, you should know the distinctions between "lists" vs "tuples" vs "sets" vs "dictionaries".

Code

Folder Setup

All Virtual AGC software related to the Space Shuttle, both "modern" and from the original Shuttle project, appears in the yaShuttle folder of Virtual AGC's GitHub repository. At the present time, all such software is incomplete and/or under development, and it is not automatically built when building Virtual AGC from source.

There are three main categories of material found in the afore-mentioned yaShuttle folder:

The bulk of the modern code is written in Python 3, and thus can be run in any Python 3 installation without other preparation.

However, the tokenizer+parser is a program in the C language, and must be compiled for the target platform. There are two approaches to this, depending on whether the LBNF description of HAL/S (i.e., the HAL_S.cf file mentioned above) has changed. For the sake of discussion, let's call these deep builds (HAL_S.cf changed) vs shallow builds (HAL_S.cf unchanged). I've not as of yet given much thought or put much effort into the most-convenient way to build this program on a target-platform by target-platform basis, so I'll just explain the way I currently do it.

Shallow Build

As mentioned above, a "shallow build" is one in which the file HAL_S.cf has not changed.

The current Makefile is written for my own setup, namely Linux Mint 21 with the usual C compiler (gcc) and a Windows cross-compiler (x86_64-w64-mingw32-gcc) installed, though I expect it would work on many versions of Linux as-is. Simply running the command make from the folder yaShuttle/ builds both the native Linux version of the tokenizer+parser (namely modernHAL-S-FC) and a 64-bit Windows version of it (modernHAL-S-FC.exe). I've not tried to build the Windows version of the program directly on Windows, but the one included in the yaShuttle/ folder already in the repository should work fine, so there's not much need to rebuild it directly on Windows. It even works in Linux, under WINE, though I don't recommend doing so.

The yaShuttle/ folder in the repository also has a pre-compiled Mac OS version of the tokenizer+parser (modernHAL-S-FC-macosx), but since Mac OS programs cannot really be built from Linux, and since it's a great hassle for me to build it directly on Mac OS, the Mac version is unlikely to be fully up-to-date except when I make an effort to release it. Thus if you have Mac OS, you're very likely to have to build it from source yourself. Fortunately, it's easy. Just clone the Virtual AGC GitHub repository onto your Mac; in a console, descend into the yaShuttle/ folder, and do make modernHAL-S-FC-macosx.

If you have some platform other than Linux, Windows 64, or Mac OS, or are using some other C compiler than what I've anticipated, you should still be able to do a shallow build of the tokenizer+parser quite easily. Just compile all of the files yaShuttle/*.c together. The default name the modern HAL/S compiler expects this program to have is modernHAL-S-FC.exe on Windows, modernHAL-S-FC-macosx on Mac, or modernHAL-S-FC everywhere else, but you can override this default with a command-line switch when you invoke the modern HAL/S compiler. The command-line switch needed is --compiler=NameOfCompiler.

Deep Build

A "deep build" is one in which the file HAL_S.cf has been modified in some way other than what's available in our GitHub repository. (In other words, if you don't like my LBNF description of HAL/S, and want to make your own tweaks to it.) In this case, an extra processing step is needed to create the *.c and *.h files for a shallow build of the tokenizer+parser as described above.

This extra processing step requires installation of the so-called BNF Converter (BNFC) framework, which builds tokenizers and parsers from an LBNF description of a context-free language, such as our LBNF file HAL_S.cf, so ...

One-time setup: Install BNF Converter.

But I must warn you, while BNF Converter purports to be platform-independent, it achieves that independence by using whatever standardized tokenizer builders and parser builders are popular on the platform on which it is running. On Linux, for example, it uses the flex tokenizer library and the bison parser library. On other platforms it uses other libraries. Hence, I have no way in the world to guarantee the results when running BNFC on some platform other than Linux. Maybe it'll work, maybe it won't.

You should be able to perform a deep build using the following command:

make distclean all

This operation creates a subfolder called yaShuttle/temp/, uses it as a working directory for building new *.c and *.h files for the tokenizer+parser, copies those files into yaShuttle/, and then performs the steps for a shallow build as described above.

Again ... if this doesn't work for you, I have little advice other than "next time, try Linuxing."

Invocation of the Modern HAL/S Compiler

At any rate, given that a native version of the tokenizer+parser is available, the modern compiler is invoked by the command

yaHAL-S-FC.py [OPTIONS] HalSourceFiles

You can see what OPTIONS are available by instead using the command

yaHAL-S-FC.py --help

However, at this stage of development, rather than using the compiler as a compiler (i.e., to compile source files into executable object files), you'd be more-likely to want to run it interactively as an interpreter. I would advise you to consult the HAL/S Interpreter page dedicated to that particular usage.

Testing

TBD

Preprocessor

TBD

Parser and Tokenizer

TBD

PALMAT Code Generator

TBD

PALMAT Optimizer

The Python module called "optimizePALMAT" provides a few extremely-crude optimizations of the PALMAT code created by the code generator. It's trying to pick off just the lowest-hanging fruit. The comments in the module itself describe it fairly well, so I'd suggest just reading those if you're interested.

PALMAT Interpreter

TBD

PALMAT Linker

TBD

Advanced Version of the Compiler

TBD

Validation and Regression Testing

There is a HAL/S file in the source-code tree called "val.hal" which can be used for validation and regression testing of the HAL/S compiler and emulator.

Unfortunately, it is a program I wrote myself, rather than being one from the Shuttle development team, so how correct you think it is depends on how much confidence you have in me. (Don't feel guilty if you haven't any. Just help me out by writing your own HAL/S validation program or helping to fix mine!) I'd venture tentatively to grade val.hal perhaps 7 out of 10, though I expect it to grow continually more complete and more accurate over the course of time, even if never 100% so. I am told by one of the original Shuttle software developers that it is unlikely any contemporary test suite for the HAL/S compiler has survived, and that mini test suites kept by individual developers have likely disappeared as well; but fortunately, in my experience, opinions that such-and-such code hasn't survived haven't proven to be entirely reliable. A lot more has probably survived, completely forgotten or else selfishly guarded from prying eyes, in the hidden recesses of government or private collections than you or I will ever know.

Running the test is quite simple, since all it asks of you is to enter either 0 or 1, and then it runs without further assistance from you: It computes various results, and then tests those results to see if they're what I expected them to be on the basis of my understanding of the HAL/S language requirements and other contemporary documentation. Entering a 0 at the initial prompt disables certain "time-consuming" tests (requiring a total of about 57 seconds), whereas 1 enables them. The possibility of disabling them is really more of a convenience for my own workflow — since I tend to run these tests a lot — than for any more generally satisfying reason, and you'll probably always want these tests enabled.

Each individual test prints a designation for the test (such as "8.01A" or "MATRIX MULTIPLY") and then prints either "success" or "failure". In the case of "failure", you can use the test designation to look at val.hal and see in more detail what was happening when the test failed; it could be that val.hal itself had a bug. The total number of failures is printed at the end, so you usually don't even need to look at the individual results.

In general, the individual tests are implementation-agnostic, and therefore should produce the same results in any implementation. (If you've written your own HAL/S compiler and emulator, feel free to run val.hal on it.) There are some exceptions, at the moment involving only tests in which the arithmetical precision (which is implementation-dependent) could differ, and messages are printed along with those results to point out this possibility.

Also, at the very end of the test run, some tests are performed for informational purposes, such as for timing the execution speed, and such tests have no "correct" success/fail result. They can be ignored or not, as desired.

Appendix: PALMAT Specification

Background

The "preliminary" version of the HAL/S compiler produces object files in the form of an intermediate language named "PALMAT".  This name derives from the fact that the original Shuttle-era compiler, HAL/S-FC, produced an intermediate language known as "HALMAT"; but I'm not sure why HALMAT was given that name.  HALMAT was machine-independent.  The original compiler then optimized the HALMAT intermediate code, and translated it to whatever machine-dependent form was needed for the target computer.  In the case of IBM System/360 (mainframe) or IBM AP-101S (Shuttle GPC) target systems, the target language was some variant of IBM's Basic Assembly Language (BAL).

As a matter of fact, I would have simply had the modern compiler produce HALMAT instead of inventing my own intermediate language for that purpose, except that there was insufficient available surviving documentation of HALMAT to accomplish that.  Essentially the only useful available documentation about HALMAT was in Appendix A of the document "HAL/S-360 Compiler System Specification", but the only copy we have of that is partial, and is missing perhaps 80% of Appendix A.  I have ongoing FOIA requests in place for additional documentation, and it remains to be seen if my requested documents can be found or not; but regardless, I have chosen not to wait to find out, and that has turned out to be the correct decision.

Thus ... PALMAT!

The PALMAT language has a minimal instruction set intended to have behaviors that are very easy to implement in an emulator.  Since the preliminary compiler's code generator and the preliminary emulator are written in Python, the format of a PALMAT dataset makes heavy usage of Python datatypes such as dictionaries, lists, tuples, and sets; moreover, PALMAT object files are basically just JSON representations of the Python internal dataset, and are thus both very human-friendly and very easily output from and input into Python programs.  But use of Python-style datatypes and JSON storage formats is not conducive to rapid emulation.

PALMAT is thus described in detail here in the hope of aiding translation from PALMAT to a more-efficient representation for both storage and emulation.
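Because PALMAT object files are plain JSON, they can be inspected with nothing more than the Python standard library. Here is a minimal sketch; the filename "sample.palmat" and the summarize helper are hypothetical illustrations, not part of the actual toolchain:

```python
import json

def load_palmat(filename):
    """Read a PALMAT object file; since it is plain JSON, this returns
    the top-level dictionary described in this Appendix."""
    with open(filename) as f:
        return json.load(f)

def summarize(palmat):
    """Return one line per scope: its index, its type, and the
    identifiers defined in it."""
    lines = []
    for scope in palmat["scopes"]:
        identifiers = ", ".join(scope.get("identifiers", {}))
        lines.append("scope %d (%s): %s"
                     % (scope["self"], scope["type"], identifiers))
    return lines
```

For example, loading the single-scope PALMAT structure shown later on this page would summarize as a single "scope 0 (root): ^I^" line.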

Implementation-Dependent Details

The general HAL/S documentation leaves a number of details of the language specification as "implementation dependent", and in a spectacular display of nonsensical behavior, the Shuttle-specific documentation for HAL/S written near the end of the Shuttle program continues to do so! Obviously I want the PALMAT-based implementation to match the behavior of the actual compiler used for the Shuttle flight software, but alas I am forced to invent my own implementation-dependence which may not match that of the original Shuttle software. Perhaps when the actual Shuttle flight-software source code can be examined, the original implementation details can be deduced and this section will be updated.

Other than the unknown limits imposed by available memory (which are unlikely to be of any significance for Shuttle flight software), the following limits currently exist in the PALMAT implementation:

Representation of HAL/S Data Types

In PALMAT, the various HAL/S data types are represented as follows:

Here are a few examples of STRUCTUREs. The HAL/S statements

STRUCTURE FIRST:
    1 X SCALAR,
    1 I INTEGER;
DECLARE DUMMY1 FIRST-STRUCTURE INITIAL(100, 200);

results in DUMMY1 being represented as [100.0, 200, "FIRST-STRUCTURE"].  Whereas

STRUCTURE SECOND:
    1 A FIRST-STRUCTURE,
    1 B FIRST-STRUCTURE,
    1 Y SCALAR;
DECLARE DUMMY2 SECOND-STRUCTURE INITIAL(100, 200, 300, 400, 500);

would result in DUMMY2 being represented as [[100.0, 200, "FIRST-STRUCTURE"], [300.0, 400, "FIRST-STRUCTURE"], 500.0, "SECOND-STRUCTURE"]. Whereas

STRUCTURE THIRD:
    1 A,
        2 X SCALAR,
        2 I INTEGER,
    1 B,
        2 X SCALAR,
        2 I INTEGER,
    1 Y SCALAR;
DECLARE DUMMY3 THIRD-STRUCTURE INITIAL(100, 200, 300, 400, 500);

results in DUMMY3 being [[100.0, 200, "-STRUCTURE"], [300.0, 400, "-STRUCTURE"], 500.0, "THIRD-STRUCTURE"].
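One consequence of this flattened representation is that the leaf values of a STRUCTURE can be recovered by recursing into sublists and skipping the trailing template-name strings. A sketch only (the structure_leaves helper is hypothetical, not part of the actual compiler):

```python
def structure_leaves(rep):
    """Flatten a PALMAT structure representation (per the DUMMY examples
    above) into its leaf values, skipping the trailing template-name
    string at each level."""
    leaves = []
    for element in rep[:-1]:       # the final element is the template name
        if isinstance(element, list):
            leaves.extend(structure_leaves(element))
        else:
            leaves.append(element)
    return leaves

# The DUMMY2 representation from the example above:
dummy2 = [[100.0, 200, "FIRST-STRUCTURE"],
          [300.0, 400, "FIRST-STRUCTURE"],
          500.0, "SECOND-STRUCTURE"]
```

Applied to dummy2, this recovers the five initializer values in declaration order.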

Here are a few examples of the representation of the ARRAY vs VECTOR and MATRIX types of comparable geometries.

Example 1: VECTOR(5) would be represented (in Python lists) as:

[
    scalar1,
    scalar2,
    scalar3,
    scalar4,
    scalar5
]

Example 2: ARRAY(5) SCALAR would be represented as:

[
    scalar1,
    scalar2,
    scalar3,
    scalar4,
    scalar5,
    'a'
]

Example 3: MATRIX(3, 2) would be represented as:

[
    [scalar11, scalar12],
    [scalar21, scalar22],
    [scalar31, scalar32]
]

Example 4: ARRAY(3, 2) SCALAR would be represented as:

[
    [scalar11, scalar12, 'a'],
    [scalar21, scalar22, 'a'],
    [scalar31, scalar32, 'a'],
    'a'
]

Example 5: ARRAY(3) VECTOR(2) would be represented as:

[
    [scalar11, scalar12],
    [scalar21, scalar22],
    [scalar31, scalar32],
    'a'
]
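As these examples show, what distinguishes an ARRAY from a VECTOR or MATRIX of comparable geometry is the trailing 'a' marker at each array level. A trivial sketch of a check for that marker (the is_array helper is hypothetical, not taken from the actual emulator):

```python
def is_array(rep):
    """True if a PALMAT list representation carries the ARRAY marker
    'a' as its final element, per the examples above."""
    return isinstance(rep, list) and len(rep) > 0 and rep[-1] == 'a'

# VECTOR(5) vs ARRAY(5) SCALAR, per Examples 1 and 2:
vector5 = [1.0, 2.0, 3.0, 4.0, 5.0]
array5 = [1.0, 2.0, 3.0, 4.0, 5.0, 'a']
```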

The Computation Stack

To understand some of the descriptions below, particularly in the Instructions Section, it's helpful to have some idea of the computational model PALMAT envisages.

Specifically, whenever PALMAT instructions are engaged in any activity that requires computation of a value, they employ a construct that I call the "computation stack".  The computation stack is simply a last-in first-out buffer, onto which the PALMAT instructions (discussed later) can do things like push numerical values, pop numerical values to perform mathematical operations on them, and so forth.  By combining various primitive operations of this kind, the values of expressions of any complexity can be computed, including some operations (like the HAL/S repetition operator, #) that have no analogs in more-familiar programming languages.  I'll refer to the computation stack quite often later, and may simply call it "the stack".  There's no confusion in doing so, since there is no general-purpose stack in the HAL/S computation model, and in particular, return addresses for subroutines are not stored on a stack.

Elements of any of the types described in the preceding section can appear on the computation stack, but not objects of any other kind.

However, the computation stack does not form any part of the PALMAT structure we're discussing, and I mention it merely as background information; it's an entirely separate object within the emulator from the PALMAT structure.
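For illustration only, the push/pop model just described might be sketched along the following lines, using instruction spellings ("number", "operator") that match the PALMAT examples later on this page. This toy handles only numeric literals and addition; the real emulator is, of course, far more complete:

```python
class ComputationStack:
    """A last-in first-out buffer of computed values."""
    def __init__(self):
        self._items = []
    def push(self, value):
        self._items.append(value)
    def pop(self):
        return self._items.pop()

def evaluate(instructions):
    """Run a list of PALMAT-style instruction dictionaries against a
    fresh computation stack, returning the value left on top. Only
    "number" (push a literal) and the "+" operator are sketched here."""
    stack = ComputationStack()
    for instruction in instructions:
        if "number" in instruction:
            stack.push(float(instruction["number"]))
        elif "operator" in instruction and instruction["operator"] == "+":
            right = stack.pop()
            left = stack.pop()
            stack.push(left + right)
    return stack.pop()
```

For example, the instruction sequence {"number": "1"}, {"number": "2"}, {"operator": "+"} leaves 3.0 on the stack.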

Scopes

At any rate, at the topmost level, a PALMAT structure is little more than a list of "scopes":

PALMAT = {
    "scopes" : [ SCOPE0, SCOPE1, SCOPE2, ... ],
    "instantiation" : N,
    "sourceFiles" : []
}

Aside: There can be fields other than "scopes" in the PALMAT structure. While this section focuses just on scopes, let me at least give you a brief rundown on the extra fields you do see in the sample above.

What these scopes signify, and the reason they are called "scopes", is that they let us determine which identifiers (variables, constants, program labels) are accessible from any given point in the code.  As you'll see in a moment, the scopes are actually in a hierarchical arrangement that's more complex than the simple list structure shown above, and a given scope (with some exceptions) can generally access identifiers defined in itself or in its ancestors in the hierarchy, but not in its descendants or "cousins", "uncles", "nephews", etc.
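That visibility rule can be sketched as a simple lookup that walks the "parent" links appearing in the scope examples later in this section. (The resolve helper is hypothetical, not taken from the actual emulator.)

```python
def resolve(palmat, scope_index, identifier):
    """Look up an identifier starting at the given scope and walking
    upward through "parent" links, per the visibility rule above.
    Returns the identifier's attributes dictionary, or None if the
    identifier is not visible from that scope."""
    while scope_index is not None:
        scope = palmat["scopes"][scope_index]
        attributes = scope["identifiers"].get(identifier)
        if attributes is not None:
            return attributes
        scope_index = scope["parent"]
    return None
```

A descendant scope thus sees its ancestors' identifiers, but an ancestor never walks downward into its children.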

Scope 0 is the top-level scope, of which all other scopes are descendants, and essentially represents what you might call "global" elements.  Each scope comprises the dictionary of the identifiers (variables, program labels) defined in it, along with their properties, and the list of PALMAT instructions defined in the scope, plus (potentially) other items.  Pure HAL/S actually has no global variables or statements that can be compiled to instructions, but it does have other top-level compilation objects, each of which has an associated identifier, such as

(In HAL/S, a COMPOOL is a set of variables which can be shared among programs, functions, or procedures which normally could not do so, perhaps due to being in separate source-code files. This is like a FORTRAN COMMON area.  A COMPOOL is the closest thing in HAL/S to a set of global variables.) Thus in pure HAL/S, Scope 0 would basically be just a collection of child scopes, one for each of the programs, functions, procedures, or compools defined in it.

However, for several reasons, such as our desire to have a HAL/S interpreter, our own implementation is "impure" in the sense that we actually do allow the global scope to have global variables as well as active code not wrapped in any program, function, procedure, or compool.  These "impure" possibilities are transparent to existing Shuttle software written in "pure" HAL/S.

To illustrate with an example, consider the HAL/S source code,

DECLARE I INTEGER INITIAL(5);
WRITE(6) I;

which simply prints out the value 5.  This code is impure since it exists outside of any containing PROGRAM.  Upon compilation we get a PALMAT structure that looks like this:

PALMAT = {
    "scopes": [
        {
            "parent": null,
            "self": 0,
            "children": [],
            "identifiers": {
                "^I^": {"integer": true, "initial": 5, "value": 5}
            },
            "instructions": [
                {"fetch": [0, "I"]},
                {"write": "6"}
            ],
            "type": "root"
        }
    ],
    "instantiation": 0
}

This structure contains everything needed to run the sample code just mentioned.

There's only a single scope here, and everything occurs within this scope, but most of the basic parts are present that would appear in any scope. So what are the properties of that scope?

As I began by saying, this example is "impure" HAL/S.  But we can convert it to a "pure" HAL/S example if we provide a PROGRAM wrapper:

EXAMPLE: PROGRAM;
    DECLARE I INTEGER INITIAL(5);
    WRITE(6) I;
CLOSE EXAMPLE;
Having done that, the compiled PALMAT representation becomes instead:
{
    "scopes": [
        {
            "parent": null,
            "self": 0,
            "children": [1],
            "identifiers": {
                "^l_EXAMPLE^": {"program": true, "scope": 1}
            },
            "instructions": [],
            "type": "root",
            "templates": {}
        },
        {
            "parent": 0,
            "self": 1,
            "children": [],
            "identifiers": {
                "^I^": {"integer": true, "initial": 5, "value": 5}
            },
            "instructions": [
                {"fetch": [1, "I"]},
                {"write": "6"}
            ],
            "type": "program",
            "templates": {},
            "name": "^l_EXAMPLE^",
            "attributes": {"program": true, "scope": 1}
        }
    ],
    "instantiation": 0
}

Thus we now have two scopes rather than just the one.  Scope 0 is now largely empty, just as we should have expected, except that it defines a single identifier, namely EXAMPLE, the newly-defined PROGRAM.  Similarly, scope 1 now contains everything that our earlier scope 0 used to, with a few relatively-minor changes.  One such change is that it now has type "program", which is presumably self-explanatory, as well as fields "name" and "attributes" that simply duplicate the declaration in scope 0.
Aside:  If we were to run this code, starting from instruction 0 in scope 0, it would actually do nothing at all, for there are no instructions listed in scope 0.  That's perfect HAL/S behavior, since HAL/S imagines the existence of a "monitor" program as part of the operating system, and it is the monitor which initiates and supervises execution of a list of HAL/S PROGRAMs it is given, on a timesharing basis.

However, top-level compilation units like PROGRAM, FUNCTION, PROCEDURE, and COMPOOL are not the only things that are segregated into their own dedicated scopes by the compiler's code generator.  Any HAL/S DO block or IF statement will trigger creation of a new child scope dedicated to it as well, and mark the new scope with the associated type.  Here are the values of type that have been used up to now:

As one final example, consider the HAL/S code, which contains several of the block types just mentioned:

DECLARE I INTEGER INITIAL(0);
DO WHILE TRUE;
    WRITE(6) I;
    I = I + 1;
    IF I >= 10 THEN EXIT;
END;

This produces a PALMAT structure like the one below.  Since my purpose here in this section is just to illustrate how scopes interact, I won't dissect it in detail for you (exercise for the reader!), except to say that the various identifiers you see whose names are of the form xx_X, where xx is a pair of lower-case letters and X is a number, are program labels automatically created by the compiler's code generator.  They are used both by PALMAT instructions like "goto" (discussed later) that cause some kind of jump to a different address, and by the PALMAT instructions that are the targets of such jumps.  Thus, for example, you'll see an instruction {"goto": [0, "^ue_1^"]} in scope 0 and a matching instruction {"noop": true, "label": "^ue_1^"} in scope 1.  The former is a jump to the latter.  When this code is executed, beginning at instruction 0 in scope 0, the first thing that happens is a jump to instruction 0 in scope 1, which represents the DO WHILE block.  Several instructions in the DO WHILE block are executed, until near the end of the block we see the instruction {"goto": [1, "^ue_2^"]}, which is a jump to scope 2, the IF block.  Eventually, after scope 2 does its work it returns to scope 1 (via the instruction {"goto": [1, "^ur_2^"]}), and scope 1 eventually returns to scope 0 (via {"goto": [0, "^ur_1^"]}), at which point scope 0 runs out of instructions to execute, and emulation halts.

{
    "scopes": [
        {
            "parent": null,
            "self": 0,
            "children": [1],
            "identifiers": {
                "^I^": {"integer": true, "initial": 0, "value": 0},
                "^ue_1^": {"label": [1, 0]},
                "^ur_1^": {"label": [0, 1]}
            },
            "instructions": [
                {"goto": [0, "^ue_1^"]},
                {"noop": true, "label": "^ur_1^"}
            ],
            "type": "root",
            "templates": {}
        },
        {
            "parent": 0,
            "self": 1,
            "children": [2],
            "identifiers": {
                "^ue_2^": {"label": [2, 0]},
                "^ur_2^": {"label": [1, 10]},
                "^ux_1^": {"label": [1, 12]}
            },
            "instructions": [
                {"noop": true, "label": "^ue_1^"},
                {"boolean": [[1, 1]]},
                {"iffalse": [1, "^ux_1^"]},
                {"fetch": [0, "I"]},
                {"write": "6"},
                {"number": "1"},
                {"fetch": [0, "I"]},
                {"operator": "+"},
                {"storepop": [0, "I"]},
                {"goto": [1, "^ue_2^"]},
                {"noop": true, "label": "^ur_2^"},
                {"goto": [0, "^ue_1^"]},
                {"noop": true, "label": "^ux_1^"},
                {"goto": [0, "^ur_1^"]}
            ],
            "type": "do while",
            "templates": {}
        },
        {
            "parent": 1,
            "self": 2,
            "children": [],
            "identifiers": {"^ux_2^": {"label": [2, 6]}},
            "instructions": [
                {"noop": true, "label": "^ue_2^"},
                {"number": "10"},
                {"fetch": [0, "I"]},
                {"operator": ">="},
                {"iffalse": [2, "^ux_2^"]},
                {"goto": [1, "^ux_1^"]},
                {"noop": true, "label": "^ux_2^"},
                {"goto": [1, "^ur_2^"]}
            ],
            "type": "if",
            "templates": {}
        }
    ],
    "instantiation": 0
}

This particular example happens to illustrate an interesting point about some program-label identifiers, namely that they may be defined in one scope but actually represent a location in a descendant scope.  This is the case with the program labels ue_1 and ue_2.  Recall that I mentioned earlier that scopes have access to identifiers declared in their ancestor scopes, but ancestors have no access to identifiers defined in their descendant scopes.  That's what's happening here.  Both scopes 0 and 1 need the definition of ue_1 to be accessible to them, and scope 1 has access to scope 0's definitions, but not vice-versa.  This forces the code generator to define ue_1 in scope 0, even though the label itself really refers to an address in scope 1.
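A sketch of how such a label reference might be resolved at emulation time: look the label up in the scope named by the "goto" operand, walking "parent" links upward as needed, and return the [scope, offset] pair from its definition. (The follow_goto helper is hypothetical, not taken from the actual emulator.)

```python
def follow_goto(palmat, goto_operand):
    """Resolve a PALMAT {"goto": [scope, label]} operand to a
    (target_scope, target_offset) pair, searching the named scope and
    its ancestors for the label's {"label": [scope, offset]} attribute.
    Error handling is omitted; returns None if the label is not found."""
    scope_index, label = goto_operand
    while scope_index is not None:
        scope = palmat["scopes"][scope_index]
        attributes = scope["identifiers"].get(label)
        if attributes is not None and "label" in attributes:
            return tuple(attributes["label"])
        scope_index = scope["parent"]
    return None
```

In the example above, following {"goto": [0, "^ue_1^"]} yields (1, 0), i.e., instruction 0 of scope 1, even though the label is defined in scope 0's identifiers dictionary.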

As you can probably imagine by now, any practical example of HAL/S code will compile to quite a large hierarchy of scopes, descending many levels deep and quite wide in terms of the numbers of sibling scopes.  Part of the job of a linker program would be to arrange all of these scopes linearly in memory in a reasonably efficient way to eliminate useless jumps; whereas when emulating PALMAT directly, each scope can continue to exist in its own memory space that is essentially independent of all the others, rather than being in any particular linear memory order with respect to them.

Data Model

By the "data model" for a scope, I mean the "identifiers" dictionary in the scope, which was illustrated a little bit in the preceding section.  The identifiers dictionary has an entry — i.e., a key/value pair — for each identifier accessible to the scope and its descendant scopes, but not to other scopes.  Very often these identifiers will be the names of variables or constants for which actual space within the scope is allocated.  Sometimes, however, the identifiers will be symbolic program labels, either for named entities (like a PROGRAM, FUNCTION, or PROCEDURE) or for positions in the PALMAT instruction stream not explicitly named by the HAL/S source code but for which a label has been transparently created by the compiler's code generator.  Often, as mentioned above, these locations will not be in the scope where their symbolic labels are defined in the identifiers dictionary, but rather in some descendant scope.    For example, in HAL/S code such as the following,

F: FUNCTION(X);
    .
    .
    .
    G: FUNCTION(Y);
        .
        .
        .
    CLOSE G;
    .
    .
    .
    Z = G(X);
    .
    .
    .
CLOSE F;

function G (and hence the program label "G") will always be in a child scope of the scope of function F, and yet the identifier G itself will always be defined in F's scope.
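As an illustration of the lookup rule just described, here is a minimal sketch (not the actual compiler code) of how identifier lookup might walk the scope hierarchy: each scope sees its ancestors' identifiers, but never its descendants'.  The field names "parent" and "identifiers" and the attribute values are assumptions for illustration.

```python
# Hypothetical miniature of the PALMAT scope hierarchy for the F/G example.
PALMAT = {"scopes": [
    {"parent": None, "identifiers": {"F": {"label": [1]}, "G": {"label": [2]}}},
    {"parent": 0,    "identifiers": {"X": {"scalar": True}}},   # scope of F
    {"parent": 1,    "identifiers": {"Y": {"scalar": True}}},   # scope of G
]}

def lookUp(scopeIndex, identifier):
    # Search the given scope, then each ancestor in turn.
    while scopeIndex is not None:
        scope = PALMAT["scopes"][scopeIndex]
        if identifier in scope["identifiers"]:
            return scopeIndex, scope["identifiers"][identifier]
        scopeIndex = scope["parent"]
    return None, None

# From G's scope (index 2), both X and the label G are visible ...
print(lookUp(2, "G"))   # -> (0, {'label': [2]})
# ... but from F's scope (index 1), G's local Y is not:
print(lookUp(1, "Y"))   # -> (None, None)
```

Note that G's label must live in scope 0 (F's scope) for this walk to find it from anywhere inside F, even though the label refers to an address in a child scope.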

Aside:  Let me detour for a moment to cover technical issues you may have noticed concerning the names of identifiers.  There are two of these issues to be aware of:

Or to be blunt, ^l_EXAMPLE^ is nothing more nor less than EXAMPLE, and you can basically ignore the decorations.  Both of these modifications to the identifier names are inherited from details of the inner workings of the preprocessor and parser; I won't go into detail about that here, except to say that there is no good reason for either of these modifications to persist once PALMAT has been generated and is going to be emulated.  Both types of modifications can simply be stripped away in "advanced"-stage development.  (Though I've seen no actual Shuttle HAL/S source code as of yet, my belief is that it is all upper case, while the caret (^) is not a legal character in HAL/S.  So there should be no identifiers in the Shuttle flight software that mimic these modifications.  If it turns out that there are, I guess I'll have to modify the name-mangling scheme somewhat.)

The value associated with an identifier key, i.e. PALMAT["scopes"][index]["identifiers"][identifier], is itself a dictionary of key/value pairs describing the identifier's attributes.  At the time I'm writing this, I don't necessarily know all of the possible attributes an identifier may have, but here are the ones I do know about:

Here are some examples of attributes dictionaries for structure templates. Consider first the HAL/S statement

STRUCTURE T1:
    1 A SCALAR,
    1 B SCALAR,
    1 I INTEGER;
Due to name-mangling, this would appear in the scope's identifiers dictionary as
"^s_T1^": { "template": ( ["A", "B", "I"], [ { "scalar": True }, { "scalar": True }, { "integer": True } ] ) }
Or consider the slightly-more-complex case of
STRUCTURE T2:
    1 C,
        2 A SCALAR,
        2 B SCALAR,
        2 I INTEGER,
    1 D,
        2 A SCALAR,
        2 B SCALAR,
        2 I INTEGER,
    1 E VECTOR;
This would appear in the identifiers dictionary as
"^s_T2^": { "template": (
                              ["C", "D", "E" ],
                              [
{ "template": ( ["A", "B", "I"], [ { "scalar": True }, { "scalar": True }, { "integer": True } ] ) },
                               
{ "template": ( ["A", "B", "I"], [ { "scalar": True }, { "scalar": True }, { "integer": True } ] ) },
                                { "vector": 3 }
                              ] ) }
Or consider
STRUCTURE T3:
    1 A,
        2 B,
            3 C INTEGER;
This would appear as:
"^s_T3^": { "template": (
                            ["A"],
                            [ { "template": (
                                    ["B"],
                                    [ { "template": ( ["C"], [ { "integer": True } ] ) } ]
                                ) } ]
                       )
                    }
Finally, consider
STRUCTURE T4:
    1 C T1-STRUCTURE,
    1 D T1-STRUCTURE,
    1 E VECTOR;
This would appear in the identifiers dictionary as
"^s_T4^": { "template": (
                              ["C", "D", "E" ],
                              [
                                { "structure": "T1-STRUCTURE" },
                                { "structure": "T1-STRUCTURE" },
                                { "vector": 3 }
                              ] ) }

Note the use of "structure" vs "template" here for the C and D fields; they serve the same purpose, except that the former identifies an applicable template by name, whereas the latter expands the template explicitly.  The explicit form ("template") is used if and only if the substructure being described has no explicitly-named template.

One might imagine (incorrectly) that template T4 would instead appear in the identifiers dictionary as
"^s_T4^": { "template": (
                              ["C", "D", "E" ],
                              [
{ "template": ( ["A", "B", "I"], [ { "scalar": True }, { "scalar": True }, { "integer": True } ] ) },
                               
{ "template": ( ["A", "B", "I"], [ { "scalar": True }, { "scalar": True }, { "integer": True } ] ) },
                                { "vector": 3 }
                              ] ) }
This form (which, again, is not used) would have some advantages.  For example, in this form it is immediately apparent that T2-STRUCTURE and T4-STRUCTURE are identical.  The reason this fully-expanded form isn't used to represent T4 has to do with structure-template libraries.  Recall that one usage pattern for structure templates is to store them in "libraries" for later reuse.  In the unexpanded form, it is possible to redefine T1-STRUCTURE in the library in such a way that T4-STRUCTURE is transparently redefined at the same time, thus reducing the cost of changing the code.  By contrast, a redefinition of T1-STRUCTURE cannot transparently carry over into T2-STRUCTURE, since T2-STRUCTURE has no dependence on T1-STRUCTURE; that improves reliability by reducing the possibility of unintended consequences.
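To make the relationship between the two forms concrete, here is a sketch of how the unexpanded form could be fully expanded on demand, say when a structure is DECLARE'd.  The "library" dictionary mapping template names like "T1-STRUCTURE" to attribute dictionaries is an assumption for illustration, not the actual compiler's data structure.

```python
# Hypothetical template library holding the definition of T1-STRUCTURE.
library = {
    "T1-STRUCTURE": {"template": (["A", "B", "I"],
                     [{"scalar": True}, {"scalar": True}, {"integer": True}])}
}

def expand(attributes):
    # Replace each {"structure": name} by the library template it names,
    # recursing into nested templates; simple fields pass through as-is.
    if "structure" in attributes:
        return expand(library[attributes["structure"]])
    if "template" in attributes:
        names, fields = attributes["template"]
        return {"template": (names, [expand(field) for field in fields])}
    return attributes

# The unexpanded representation of T4, as shown above.
t4 = {"template": (["C", "D", "E"],
      [{"structure": "T1-STRUCTURE"},
       {"structure": "T1-STRUCTURE"},
       {"vector": 3}])}

# The expansion of T4 matches the fully-expanded form of T2 shown earlier.
print(expand(t4))
```

Because expansion happens only at the point of use, redefining T1-STRUCTURE in the library automatically changes the result of expanding T4.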

Any given structure template may have some substructures described by "structure" fields and others described by "template" fields.  For example, consider
STRUCTURE T5:
    1 C T1-STRUCTURE,
    1 D,
        2 A SCALAR,
        2 B SCALAR,
        2 I INTEGER,
    1 E VECTOR;
which is identical in layout to T2-STRUCTURE and T4-STRUCTURE.  It is represented in the identifier dictionary as
"^s_T5^": {"template": [
                            ["s_C", "s_D", "E"],
                            [
                                {"structure": "T1-STRUCTURE"},
                                {"template": [ ["A", "B", "I"], [ {"scalar": True}, {"scalar": True}, {"integer": True} ] ] },
                                {"vector": 3}
                            ]
                        ]}

But note that this expansion or non-expansion of "structure" subfields in the internal representation of structure templates is irrelevant to the structures DECLARE'd with those structure templates.  DECLARE'd structures are always represented in a fully-expanded form.

Instructions

This section covers all available PALMAT instructions which can appear in an instruction stream, i.e., in PALMAT["scopes"][index]["instructions"].  I've grouped them below according to my categorization of their functionality.

PALMAT instructions in the preliminary compiler implementation are always Python dictionaries. Aside from the explicit specifications for those dictionaries described below, any instruction may have the following additional fields:

Aside: In general, any given HAL/S source-code line will result in generation of multiple PALMAT instructions. It's therefore a waste to attach "source" fields to every instruction. The rule-of-thumb is that if any PALMAT instruction lacks a source field, then its source is presumed to be identical to that of the preceding PALMAT instruction.

There is one exception to this rule, namely instructions with "label" fields.  Such an instruction may be jumped to directly, in which case the emulator cannot determine the associated source line from the instruction flow it has actually executed (which it already knows, at no additional cost), but would instead have to do the extra work of scanning backward through the instruction addresses.  To assist the emulator, source fields may therefore be attached to instructions with label fields, even if the value of the source is the same as that of the preceding instruction address.
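The fallback rule can be sketched as follows; the helper itself is hypothetical, not part of the actual emulator, but the "source" and "label" fields follow the text above.

```python
# A toy instruction stream: only some instructions carry a "source" field.
instructions = [
    {"source": 10, "number": "1"},
    {"number": "2"},
    {"operator": "+"},
    {"source": 11, "label": "ue_1"},   # jump targets get their own source
]

def sourceOf(instructions, address):
    # If the instruction at this address lacks a "source" field, scan
    # backward until one is found; that is the presumed source line.
    for i in range(address, -1, -1):
        if "source" in instructions[i]:
            return instructions[i]["source"]
    return None

print(sourceOf(instructions, 2))   # -> 10
print(sourceOf(instructions, 3))   # -> 11
```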

Operations on the Computation Stack

Recall the discussion of the so-called Computation Stack from earlier.  The instructions described in this section are primarily for the purpose of performing computations on the stack. Below, if there is a single operand on the computation stack, I'll simply refer to it as "operand".  If there are two operands on the stack, I'll refer to the one at the top of the stack (i.e., the first to be popped) as "operand1" and the next-to-top as "operand2".

In general, if the operands in one of the operations described below are Python integers, and if the result can continue to be represented exactly as an integer, then the result will continue to be a Python integer.  More usually, however, numerical-operation results will be Python floating-point values.

Note also that several control-transfer instructions and miscellaneous instructions also affect the computation stack in a generally-trivial way without that being their principal function, but are included in later sections rather than in the table below.

Pushing onto the Computation Stack

Mnemonic
Python Form of the Instruction
Description
number
{'number': string}
Pushes a literal number (HAL/S INTEGER or SCALAR) onto the computation stack.  In this "preliminary" Python implementation, the number embedded within the instruction is given in string form, and hence must be converted by the emulator at runtime into a Python integer or float before being pushed onto the stack.  (In the corresponding instruction of an "advanced" implementation, the literal constant could presumably already be in the required native arithmetical format, needing no additional conversion.)  The string form is used because the compiler's code generator retains exponential modifiers like B5 (power of 2) and H-2 (power of 16) as-is, accepting the loss of runtime efficiency.  That was an explicit design choice in the design of PALMAT, though perhaps its wisdom is somewhat questionable, since Python understands only exponential modifiers like "E12", and does not understand these alternate exponential modifiers, nor the potential use of multiple exponential modifiers within the same literal.
string
{'string': string}
Pushes a literal CHARACTER string onto the computation stack as a single entry.
vector
{'vector': [...]}
Pushes a literal VECTOR onto the computation stack as a single entry.  Unlike the number instruction, the compiler's code generator provides VECTOR entries as actual Python numbers rather than strings.  Note:  There are no means in HAL/S for forming a literal VECTOR (or MATRIX or ARRAY, or within certain constraints a BIT-string), but evaluation of expressions which are computable at compile-time sometimes results in the generated code pretending that literals for these datatypes have been provided.  Since there are no compile-time-computable expressions which can produce STRUCTURE results, I believe, there is no need for a corresponding instruction that pushes STRUCTURE literals.
matrix
{'matrix': [[...],...,[...]]}
Pushes a literal MATRIX into the computation stack as a single entry.  Unlike the number instruction, the compiler's code generator provides MATRIX entries as actual Python numbers rather than strings.  See the comment for VECTOR above.
array
{'array': [[..., 'a'], ..., [..., 'a'], 'a']}
Pushes a literal ARRAY onto the computation stack as a single entry.  See the comment for VECTOR above.
boolean
{'boolean': [value, length, 'b']}
Pushes a BOOLEAN or BIT-string literal onto the computation stack.  The BOOLEAN type is distinguished from BIT types only in that length=1.  See the comment for VECTOR above.
bitarray
{'bitarray': [value, length, 'p']}
TBD ... I may be confused.  It may be that {'boolean': [value, length, 'p']} is used for all booleans and bit-strings.
fetch
{'fetch': (index, identifier)}
Finds the identifier (a Python string, mangled, but without caret quotes) representing a variable or constant in the indicated scope, and pushes the current or constant value of it onto the computation stack.  If it's an uninitialized variable, then a Python None (or a hierarchy of lists of None, as appropriate for VECTOR, MATRIX, or ARRAY) is pushed onto the stack.  Note that the value being pushed onto the stack can be of any supported datatype, including VECTOR, MATRIX, ARRAY, and STRUCTURE, and that the values are pushed as-is, without conversion of any kind.  See the explanation of datatype representation given earlier.

Note that an immediately-preceding subscripts instruction, if present, is used to indicate subscripts if the chosen variable is for a composite object like a VECTOR, MATRIX, or ARRAY.

As a special case, the provided scope index can be -1.  This is used within the scope of a PROCEDURE if the identifier is for a variable that's in the ASSIGN clause of the PROCEDURE's definition.  In this case, the value that is fetched is not from the local scope of the PROCEDURE, but rather that of the corresponding variable in the ASSIGN clause of the call to the procedure.  To clarify, consider the following sample HAL/S code:
P: PROCEDURE ASSIGN(X);
    ...
    Y = X;
    ...
CLOSE P;
...
CALL P ASSIGN(Z);
Not surprisingly, the compilation of "Y = X;" involves the PALMAT fetch instruction, but what needs to be fetched is not the value of X local to P, but rather the variable Z in an entirely different scope.  The actual PALMAT instruction generated is {'fetch': (-1, 'X')}, causing the emulator to determine at runtime that X must be replaced (by Z) and that -1 must be replaced (by Z's scope index).

Since PROCEDURE calls can be nested, it may be that when an ASSIGN alias like this is resolved, it may turn out to be yet another ASSIGN alias (with a scope index of -1), thus resulting in another search to resolve that alias.  And this process could (but wouldn't necessarily) continue for however many levels the PROCEDURE calls are nested.

If some manner is found in the future of having the compiler's code generator perform this lookup at compile time rather than forcing the emulator to do it at runtime, then the exception for index=-1 will become unnecessary, though the emulator code for it will be retained (to avoid the necessity of recompilation).  The preliminary implementation of the emulator transparently replaces these indirect references directly within the identifier dictionary whenever it performs one of these searches, so any given search is never performed more than once in any session.
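The chain of alias resolutions for nested PROCEDURE calls can be sketched as follows.  The "callStack" list and its "assignments" field are assumptions for illustration; the real emulator's bookkeeping differs, but the chasing of -1 indices is the behavior described above.

```python
# Hypothetical call records: P was called with ASSIGN(Z), and Q
# (a procedure nested inside P) was then called with ASSIGN(X).
callStack = [
    {"assignments": {"X": (0, "Z")}},    # CALL P ASSIGN(Z); binds P's X to Z in scope 0
    {"assignments": {"Y": (-1, "X")}},   # CALL Q ASSIGN(X); binds Q's Y to P's alias X
]

def resolve(identifier, depth):
    # Chase -1 scope indices outward through the call stack until a
    # real (scope index, identifier) pair is found.
    index = -1
    while index == -1:
        index, identifier = callStack[depth]["assignments"][identifier]
        depth -= 1
    return index, identifier

# Inside Q (depth 1), a {'fetch': (-1, 'Y')} resolves through X down to Z:
print(resolve("Y", 1))   # -> (0, 'Z')
```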
fetchp
{'fetchp': (index, identifier)}
This is similar to a fetch instruction, except that instead of placing the current value of a variable on the computation stack, a "pointer" to the variable is placed there instead.  In this preliminary implementation, a "pointer" to the variable specified by (index, identifier) is simply the Python construct
[index, identifier, 'p']
At this writing, the only envisaged use for the fetchp instruction is in setup of the computation stack for a read instruction.  Like fetch instructions, when a fetchp instruction is executed, it can be conditioned by an immediately-preceding subscripts collection on the computation stack.
unravel
{'unravel': (index, identifier)}
Like the fetch instruction, except that if the variable whose value is being fetched is composite (i.e., VECTOR, MATRIX, ARRAY, STRUCTURE) it is unraveled in the process of being fetched.  For example, if fetching a VECTOR(20), then 20 locations on the computation stack are used instead of the single location used by fetch.  Unraveling is in row-major order.

Note:  I no longer believe this instruction is needed or used.
empty
{'empty': True}
Pushes an "uninitialized" value (Python None in the preliminary implementation) onto the computation stack.
sentinel
{'sentinel': string}
Pushes a "sentinel" object (Python {'sentinel'} in the preliminary implementation) onto the computation stack.  It acts as a delimiter for a variable-length sequence of operands on the computation stack.  The string is discarded, and does not affect the stack; it's intended to be used for debugging/documentation purposes, though I have a sense that the preliminary implementation of the emulator may sometimes use the values that it expects string to take, in order to more-certainly detect potential implementation errors.
fill
{'fill': True}
Pushes a "fill" object (Python {'fill'} in the preliminary implementation) onto the computation stack.  This corresponds to the presence of a "*" at the end of a HAL/S repetition list, or a "*" subscript.
partition
{'partition': True}
Pushes a "semicolon" object (Python {'semicolon'} in the preliminary implementation) onto the computation stack.  This should happen only within a list of subscripts, and marks the separation point (if applicable) between ARRAY subscripts and VECTOR/MATRIX subscripts.  (Or potentially for the latter, VECTOR/MATRIX/BIT/CHARACTER subscripts.  But at this writing I've not yet implemented BIT or CHARACTER subscripts.)
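As a minimal sketch of how an emulator might dispatch the push-type instructions above, consider the following; the real emulator is of course far more elaborate, and the numeric conversion here ignores the HAL/S exponential modifiers (B5, H-2, etc.) that the real emulator must also handle.

```python
def executePushes(instructions):
    # Execute only the push-type PALMAT instructions, returning the
    # resulting computation stack (top of stack at the right).
    stack = []
    for instruction in instructions:
        if "number" in instruction:
            # Literal numbers arrive as strings and are converted at runtime.
            stack.append(float(instruction["number"]))
        elif "string" in instruction:
            stack.append(instruction["string"])
        elif "vector" in instruction:
            stack.append(instruction["vector"])
        elif "empty" in instruction:
            stack.append(None)
        elif "sentinel" in instruction:
            # The string payload is discarded, per the table above.
            stack.append({"sentinel"})
        elif "fill" in instruction:
            stack.append({"fill"})
    return stack

print(executePushes([{"number": "2.5"}, {"sentinel": "repeat"}, {"empty": True}]))
# -> [2.5, {'sentinel'}, None]
```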

Standard Computations on the Stack

Mnemonic
Python Form of the Instruction
Description
+
{'operator': '+'}
Pops operand1 and operand2 from the stack and pushes operand1+operand2 back onto the stack.  Note that this encompasses not just the addition of simple numbers, but of various other datatypes as well:
  • (INTEGER or SCALAR) + (INTEGER or SCALAR)
  • VECTOR + VECTOR
  • MATRIX + MATRIX
-
{'operator': '-'}
Pops operand1 and operand2 from the stack and pushes operand1-operand2 back onto the stack.  Supports:
  • (INTEGER or SCALAR) - (INTEGER or SCALAR)
  • VECTOR - VECTOR
  • MATRIX - MATRIX
U-
{'operator': 'U-'}
Pops the operand from the stack and pushes -operand back onto the stack.
times
{'operator': ''}
Pops operand1 and operand2 from the stack and pushes operand1 times operand2 back onto the stack.  I.e., the operands are multiplied.  (The operator string in the instruction is empty because multiplication in HAL/S source code is denoted by a blank between the operands rather than by an explicit operator symbol.)  Supports:
  • (INTEGER or SCALAR) times (INTEGER or SCALAR) is just multiplication in the normal sense of the word.
  • (INTEGER or SCALAR) times (VECTOR or MATRIX), or (VECTOR or MATRIX) times (INTEGER or SCALAR), separately multiplies each component of the vector or the matrix by the integer or scalar.
  • VECTOR times VECTOR is an outer product of the two vectors.  Thus an N-vector times a M-vector would be an N×M matrix.
  • MATRIX times MATRIX is normal matrix multiplication in the sense of linear algebra.
  • VECTOR times MATRIX, or MATRIX times VECTOR, also performs a matrix multiplication in the sense of linear algebra.
/
{'operator': '/'}
Pops operand1 and operand2 from the stack and pushes operand1/operand2 back onto the stack.  I.e., this is a division operation in the normal sense of the word.  There is no "integer division" operator as such, and the results of divisions will be integers only if the dividend and divisor are both integers, and if the dividend is an integral multiple of the divisor.  The result of a division by zero is presently TBD.
**
{'operator': '**'}
Pops operand1 and operand2 from the stack and pushes operand1**operand2 back onto the stack.  I.e., operand1 is raised to the operand2 power.  The results of operations like 0**0 or (-1)**0.5 are presently TBD.

Note that operand1 can be a square, non-singular MATRIX in addition to INTEGER or SCALAR types.  Note that the HAL/S operation M**T (where M is a MATRIX) is implemented by alternate means in PALMAT, and is not handled by the ** instruction.
C||
{'operator': 'C||'}
The string-concatenation operator.  It pops operand1 and operand2 from the stack.  Operand2 is concatenated to the end of operand1, and then the combination is pushed onto the stack.  Note that the computation stack does not track the declared maximum length of strings; the resulting string is simply as long as required by the actual (vs maximum) lengths of the operands.
not
{'operator': 'NOT'}
Pops the operand from the stack, performs a bitwise inversion on it, and pushes the result back onto the stack.

At present, in the preliminary implementation this instruction does not perform any automatic type conversions, so the operand must be a BIT-string.  It is TBD whether or not INTEGER or SCALAR values should be automatically converted to BIT-strings.  But note that even if they were converted, their lengths (number of significant bits) would still be unknown.
and
{'operator': 'AND'}
Pops operand1 and operand2 from the stack, performs a bitwise AND, and pushes the result back onto the stack.  The length (number of significant bits) of the result is the minimum of the lengths of operand1 and operand2.

At present, in the preliminary implementation this instruction does not perform any automatic type conversions, so both operands must be BIT-strings.  It is TBD whether or not INTEGER or SCALAR values should be automatically converted to BIT-strings.  But note that even if they were converted, their lengths (number of significant bits) would still be unknown.
or
{'operator': 'OR'}
Pops operand1 and operand2 from the stack, performs a bitwise OR, and pushes the result back onto the stack.  The length (number of significant bits) of the result is the minimum of the lengths of operand1 and operand2.

At present, in the preliminary implementation this instruction does not perform any automatic type conversions, so both operands must be BIT-strings.  It is TBD whether or not INTEGER or SCALAR values should be automatically converted to BIT-strings.  But note that even if they were converted, their lengths (number of significant bits) would still be unknown.
ornot
{'operator': 'ORNOT'}

Pops operand1 and operand2 from the stack, performs the bitwise operation
operand1 OR NOT operand2
and pushes the result back onto the stack.  The length (number of significant bits) of the result is the minimum of the lengths of operand1 and operand2.

At present, in the preliminary implementation this instruction does not perform any automatic type conversions, so both operands must be BIT-strings.  It is TBD whether or not INTEGER or SCALAR values should be automatically converted to BIT-strings.  But note that even if they were converted, their lengths (number of significant bits) would still be unknown.
B||
{'operator': 'B||'}
This is the BIT-string concatenation operation.  It pops operand1 and operand2 from the stack, shifts operand1 leftward by the defined length of operand2, fills the rightmost portion with operand2, and pushes the result back onto the stack.  The defined length of the result is equal to the sum of the defined lengths of the operands.

At present, in the preliminary implementation this instruction does not perform any automatic type conversions, so both operands must be BIT-strings.  It is TBD whether or not INTEGER or SCALAR values should be automatically converted to BIT-strings.  But it's unclear what the result of the operation should be if operand2 has no known maximum length.
==
{'operator': '=='}
Pops operand1 and operand2 from the stack.  If they are equal, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.
!=
{'operator': '!='}
Pops operand1 and operand2 from the stack.  If they are not equal, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.
<
{'operator': '<'}
Pops operand1 and operand2 from the stack.  If operand1 < operand2, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.

The interpretation of "<" when the operands are not SCALAR or INTEGER is TBD.
>
{'operator': '>'}
Pops operand1 and operand2 from the stack.  If operand1 > operand2, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.

The interpretation of ">" when the operands are not SCALAR or INTEGER is TBD.
<=
{'operator': '<='}
Pops operand1 and operand2 from the stack.  If operand1 ≤ operand2, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.

The interpretation of "≤" when the operands are not SCALAR or INTEGER is TBD.
>=
{'operator': '>='}
Pops operand1 and operand2 from the stack.  If operand1 ≥ operand2, pushes the BIT-string corresponding to HAL/S TRUE, namely [1,1,'b'], onto the stack; otherwise, pushes FALSE, [0,1,'b'], onto the stack.

Note that the BNF specification for HAL/S places some pretty-severe limits on the datatypes comparable by means of relational operators.  The allowed comparisons are:  arithmetical (SCALAR, INTEGER, VECTOR, MATRIX) to arithmetical, CHARACTER to CHARACTER, BIT/BOOLEAN to BIT/BOOLEAN, STRUCTURE to STRUCTURE, or name to name.  Thus there are no datatype-conversion considerations with this operator.

The interpretation of "≥" when the operands are not SCALAR or INTEGER is TBD.
#
{'operator': '#'}

This is the repetition operator.  It has a variable number of operands, terminated by a sentinel (Python {'sentinel'} in the preliminary implementation).  The operand at the top of the computation stack is the repetition factor.  The remaining operands on the computation stack, down to the first sentinel encountered, comprise the sequence to be replicated.  Any operands among these latter which are composites (i.e., VECTOR, MATRIX, or ARRAY) are "unraveled" in row-major order to their constituent simple values in the process.  (The handling of unraveling of a STRUCTURE is TBD.)  The Python value None is also an allowed operand, and indicates an unused position in the sequence.  If the repetition factor is immediately followed by the sentinel, without intervening operands to be replicated, it means that the value None is being replicated.  The value {'fill'} (in the preliminary Python-based implementation) can also be present as the final operand preceding the sentinel.  (It corresponds to a trailing "*" in HAL/S, meaning that all remaining positions are left uninitialized.  But since that's the default behavior of the emulator in the preliminary implementation anyway, the fill element is essentially ignored.)  All of the operands, including the sentinel, are popped from the computation stack and replaced with the desired replicated sequence of operands (with no added sentinel).


An example may clarify. Consider the HAL/S statement

DECLARE V VECTOR(32) CONSTANT(16, 5#(3#5, 2#6, 62), 1);
The CONSTANT expression is computed on the computation stack, though it happens to be done at compile-time rather than at runtime.  Nevertheless, it is done using the operations described within this table, including the repetition operator.  Below we see the PALMAT instructions which the compiler generates to perform the computation, and to the right of that the state of the computation stack (growing to the right) after each instruction is executed:

{'number': '1'} [1]

{'sentinel': 'repeat'} [1, {'sentinel'}]

{'number': '62'} [1, {'sentinel'}, 62]

{'sentinel': 'repeat'} [1, {'sentinel'}, 62, {'sentinel'}]

{'number': '6'} [1, {'sentinel'}, 62, {'sentinel'}, 6]

{'number': '2'} [1, {'sentinel'}, 62, {'sentinel'}, 6, 2]

{'operator': '#'} [1, {'sentinel'}, 62, 6, 6]

{'sentinel': 'repeat'} [1, {'sentinel'}, 62, 6, 6, {'sentinel'}]

{'number': '5'} [1, {'sentinel'}, 62, 6, 6, {'sentinel'}, 5]

{'number': '3'} [1, {'sentinel'}, 62, 6, 6, {'sentinel'}, 5, 3]

{'operator': '#'} [1, {'sentinel'}, 62, 6, 6, 5, 5, 5]

{'number': '5'} [1, {'sentinel'}, 62, 6, 6, 5, 5, 5, 5]

{'operator': '#'} [1, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5]

{'number': '16'} [1, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 62, 6, 6, 5, 5, 5, 16]

The values in the computation stack at the end of this computation are in LIFO order, thus the 32 positions of VECTOR V are initialized to 16, 5, 5, 5, 6, 6, 62, ..., 1.
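The behavior of the repetition operator on simple (non-composite) operands can be sketched as follows; unraveling of composites is omitted, and the function itself is hypothetical rather than the actual emulator code.

```python
def repetition(stack):
    # Pop the repetition factor from the top of the stack.
    factor = int(stack.pop())
    # Collect operands back to (and including) the first sentinel.
    sequence = []
    while stack[-1] != {"sentinel"}:
        operand = stack.pop()
        if operand == {"fill"}:
            continue        # trailing '*': essentially ignored, per the text
        sequence.insert(0, operand)
    stack.pop()             # discard the sentinel
    if not sequence:
        sequence = [None]   # a bare factor replicates an uninitialized value
    stack.extend(sequence * factor)

# First '#' of the worked example above: 2#6.
stack = [1, {"sentinel"}, 62, {"sentinel"}, 6, 2]
repetition(stack)
print(stack)   # -> [1, {'sentinel'}, 62, 6, 6]
```

Applying the remaining pushes and '#' operations of the worked example reproduces the 31 initializer values shown in the final stack state.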

+><
{'+><': (index, identifier)}
This operation is difficult to describe concisely, because unlike almost all other PALMAT instructions (which are intended to be simple building blocks for flexibly implementing more-complex operations), this instruction performs a rather complex operation for a dedicated purpose.  Its observable effects are simple, however:  Two operands are popped from the computation stack, and a single BOOLEAN is pushed onto the stack in their place.

Conceptually, if you have a HAL/S statement of the form
DO FOR [TEMPORARY] X = Y TO Z BY T; ...; END;
then the +>< PALMAT instruction is used for updating the loop variable X to X+T and then comparing the result to Z to determine if the end-condition of the for-loop has been met, taking into account for the comparison whether the step-size T is positive or negative.

The stack setup for this instruction is that T is operand1 (the top of stack) and Z is operand2 (the next-to-top).  The index and identifier encoded in the instruction itself specify the loop variable X.

Specifically, the +>< instruction can be said to begin with the equivalent of the PALMAT instructions
{'fetch': (index, X)}
{'operator': '+'}
{'store': (index, X)}
which update the loop variable in memory and leave the updated value of it at the top of the stack.  Depending on whether T is positive or negative, +>< then ends up with the equivalent of either the PALMAT instruction
{'operator': '>'}
or
{'operator': '<'}
respectively, either of which removes all remaining operands from the stack, leaving a single TRUE or FALSE to tell us if the end-condition has been met.
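The composite behavior just described can be sketched in terms of the simpler operations it combines; here "memory" is a stand-in for the scope's variable storage, and the function name is hypothetical.

```python
def plusCompare(stack, memory, identifier):
    # operand1 (top of stack) is the step T; operand2 is the limit Z.
    T = stack.pop()
    Z = stack.pop()
    # Equivalent of fetch / '+' / store on the loop variable X:
    memory[identifier] = memory[identifier] + T
    X = memory[identifier]
    # Equivalent of '>' for a positive step, '<' for a negative one:
    done = X > Z if T > 0 else X < Z
    stack.append([1, 1, 'b'] if done else [0, 1, 'b'])

# DO FOR X = ... TO 10 BY 1, with X currently 8:
memory = {"X": 8}
stack = [10, 1]            # Z = 10 (next-to-top), T = 1 (top)
plusCompare(stack, memory, "X")
print(memory["X"], stack)  # -> 9 [[0, 1, 'b']]
```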
builtin
{'function': f}
Calls a Run-Time Library (RTL) function, otherwise known as a "built-in function".  The name of the desired run-time library function is the string f.  The HAL/S run-time library functions are documented in the HAL/S documentation, particularly the "HAL/S Programmer's Guide", Appendix B, and I see no reason to duplicate that information here.  In general, these functions pop 0, 1, 2, or 3 operands from the computation stack, and push a result back onto the stack.  The functions presently implemented in the preliminary implementation are the following, though others will presumably be implemented over time until complete support is achieved:
  • ABS
  • ABVAL
  • ARCCOS
  • ARCCOSH
  • ARCSIN
  • ARCSINH
  • ARCTAN
  • ARCTAN2
  • ARCTANH
  • CEILING
  • COS
  • COSH
  • DET
  • DIV
  • EXP
  • FLOOR
  • INDEX
  • INVERSE
  • LENGTH
  • LJUST
  • LOG
  • MAX
  • MIDVAL
  • MIN
  • MOD
  • ODD
  • PROD
  • RANDOM
  • RANDOMG
  • REMAINDER
  • RJUST
  • ROUND
  • RUNTIME
  • SHL
  • SHR
  • SIGN
  • SIGNUM
  • SIN
  • SINH
  • SQRT
  • SUM
  • TAN
  • TANH
  • TRACE
  • TRANSPOSE
  • TRIM
  • TRUNCATE
  • UNIT
  • XOR
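As a sketch of how an emulator might execute the builtin instruction, the following hypothetical dispatch table maps a few of the RTL names above to Python callables together with their arities.  The table entries and the name call_builtin are assumptions of this sketch, not part of PALMAT itself:

```python
import math

# Hypothetical dispatch table: RTL name -> (arity, Python callable).
# Only a few of the unary built-in functions are modeled here.
RTL = {
    'ABS':   (1, abs),
    'SQRT':  (1, math.sqrt),
    'SIN':   (1, math.sin),
    'COS':   (1, math.cos),
    'FLOOR': (1, math.floor),
}

def call_builtin(stack, name):
    # Model of {'function': f}: pop the operands, push the result.
    arity, fn = RTL[name]
    args = [stack.pop() for _ in range(arity)]
    stack.append(fn(*args))
```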
typeof
{'modern': 'TYPEOF'}
This PALMAT instruction is related to a built-in function I've invented to aid creation of validation programs written in HAL/S for the "modern" HAL/S compiler, but not present in original implementations of HAL/S.  The typeof instruction is generated when something like the following appears in a HAL/S expression:
typeof('IDENTIFIER')
Here, typeof is my invented built-in function and IDENTIFIER is the name of any defined identifier within the scope, presented as a HAL/S string.  Note that so-called caret-quotations should not be used.  Name-mangling prefixes, if known, can be used or not, but I'd recommend omitting them, since the emulator can find the identifiers without them anyway (albeit more slowly) and you thus avoid the risk of providing them incorrectly.

The instruction pops the operand string from the computation stack and replaces it with an ARRAY of strings that describe the attributes of the chosen identifier.  In essence it provides much the same information as the modern HAL/S interpreter's `data command, but in a form available to a test program written in HAL/S itself.  The ARRAY is one-dimensional, of length 20, and has the following entries.  (In general, an entry of an empty string, "", indicates an unused position.)
  1. The datatype of the identifier, which is one of the following: "MISSING" (if the identifier is not found), "INTEGER", "SCALAR", "VECTOR", "MATRIX", "BIT", "CHARACTER", "STRUCTURE", "LABEL", "EVENT", or "?" (indicating a datatype typeof cannot identify, presumably due to inadequate implementation).
  2. The maximum length of a bit-string or character-string, the length of a vector, or the number of rows of a matrix.
  3. The number of columns of a matrix; empty otherwise.
  4. Stringified value of a CONSTANT clause in the declaration. In the case of an INTEGER or SCALAR, this is just the numerical value formatted as a string. In the case of a bit-string, it is the string "N, L", where N is the integer value of the bit-string and L is its length in bits. In the case of a VECTOR or MATRIX, it is an unraveling of all of the data, separated by commas and a space. In general, floating-point numbers are rounded to no more than 5 fractional decimal places if the numbers are small enough that there is no E-style exponent.
  5. Stringified value of an INITIAL clause in the declaration.
  6. Reserved for future use. (These reserved positions are for attributes such as DOUBLE, AUTOMATIC, STATIC, and so on, which I haven't yet chosen to implement in typeof but may at some point.)
  7. Reserved for future use.
  8. Reserved for future use.
  9. Reserved for future use.
  10. Reserved for future use.
  11. Reserved for future use.
  12. Reserved for future use.
  13. Reserved for future use.
  14. Reserved for future use.
  15. Reserved for future use.
  16. If an ARRAY, then the 1st ARRAY dimension.
  17. If an ARRAY 2D or higher, then the 2nd ARRAY dimension.
  18. If an ARRAY 3D or higher, then the 3rd ARRAY dimension.
  19. If an ARRAY 4D or higher, then the 4th ARRAY dimension.
  20. If an ARRAY 5D or higher, then the 5th ARRAY dimension. (There's no provision for displaying higher dimensions.)

For example, the HAL/S code

DECLARE I INTEGER CONSTANT(5), M MATRIX(3, 5), A ARRAY(2, 4) MATRIX(3, 5);
WRITE(6) typeof('I');
WRITE(6) typeof('M');
WRITE(6) typeof('A');
prints
"INTEGER"  ""  ""  "5"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""
"MATRIX"  "3"  "5"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""
"MATRIX"  "3"  "5"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  "2"  "4"  ""  ""  ""
typeofv
{'modern': 'TYPEOFV'}
This PALMAT instruction is related to a built-in function I've invented to aid creation of validation programs written in HAL/S for the "modern" HAL/S compiler, but not present in original implementations of HAL/S.  It is generated when something like the following appears in a HAL/S expression:
typeofv(EXPRESSION)
typeofv is similar to typeof (see above) except that it analyzes the result of a computation — i.e., a value — rather than the properties of a declared identifier.  Like typeof, it pops its single operand from the computation stack and replaces it with an ARRAY of attributes.  The output, moreover, is in the same format and is interpreted in the same way.  Of course, typeofv has no access to attributes like CONSTANT and INITIAL, and attributes to which it has no access are always reported as empty strings.

For example, the HAL/S code

DECLARE I INTEGER CONSTANT(5);
WRITE(6) typeof('I');
WRITE(6) typeofv(I);
WRITE(6) typeofv(I+12);
prints
"INTEGER"  ""  ""  "5"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""
"INTEGER"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""
"INTEGER"  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""
Moreover, typeofv can analyze any datatype found on the computation stack, and that's not quite the same as the datatypes accessible to typeof.  Some additional types reported by typeofv but not by typeof are: "SENTINEL" (for {'sentinel'} objects), "*" (for {'fill'} objects), "POINTER" (for pointers to identifiers, from the PALMAT fetchp instruction), and "NONE" for uninitialized data.  Most of these extra types actually cannot appear to typeofv for correctly-generated and -executed PALMAT, but remember that the purpose of typeofv is to help detect incorrect PALMAT generation or execution.

Finally, in reporting the datatype of an ARRAY, say array A, typeofv only analyzes the datatype of array element A$(1,...,1).  It does not check that every element of array A has the same (or compatible) datatype, so it could theoretically be fooled by an illegal ARRAY-like construct like [1, 2, 'HELLO', FALSE, 'a'] into thinking that it had found an ARRAY(4) INTEGER.
initialized
{'modern': 'INITIALIZED'}
This PALMAT instruction is related to a built-in function I've invented to aid creation of validation programs written in HAL/S for the "modern" HAL/S compiler, but not present in original implementations of HAL/S.  It is generated when something like the following appears in a HAL/S expression:
initialized(value)
This function returns TRUE if the selected value has been completely initialized, and FALSE otherwise.  By "completely" initialized, I mean that if the data item is composite (VECTOR, MATRIX, ARRAY, STRUCTURE) rather than simple (INTEGER, SCALAR, BIT, CHARACTER), then a recursive check is done to determine if every item comprising the value is initialized.
shaping
{'shaping': s}
Indicates invocation of a "shaping function", as specified by the string s.  The choices for s are integer, scalar, vector, matrix, doubleinteger, doublescalar, doublevector, or doublematrix.  In the preliminary implementation, all arithmetical datatypes are double-precision, so there is no distinction in the preliminary implementation between the doubleXXXX shaping functions and the non-double ones.  The shaping functions both perform datatype conversions (to integer and floating-point types) and recast the geometry of composite objects.  Refer to HAL/S documentation for the detailed effects.

Note that an immediately-preceding subscripts instruction, if present, is used for specification of geometry.
sliceAT
sliceTO

{'shaping': 'sliceAT'}
{'shaping': 'sliceTO'}

These two instructions are similar to the aforementioned {'shaping': 'vector'} and {'shaping': 'integer'} instructions, respectively.  They're used in implementing the HAL/S subscripting conventions "length AT start" and "start TO end".  They're intended to execute more efficiently than the full shaping instructions, by eliminating all flexibility and checking.  At runtime, either instruction pops operand1 and operand2, which are supposed to be INTEGER, or at worst SCALAR.  The sliceAT instruction then pushes the Python list [round(operand1), round(operand2)] in their place, while the sliceTO instruction instead pushes the Python tuple (round(operand1), round(operand2)).  Subscript-list processing recognizes these forms as describing "slices" and consequently will not confuse them with the VECTOR(2) or ARRAY(2) INTEGER types they otherwise match.

See the subscripts instruction.
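A minimal Python sketch of the two instructions (the function names are mine; the list-vs-tuple distinction is exactly what subscript processing later keys on):

```python
def slice_at(stack):
    # {'shaping': 'sliceAT'}: pops operand1 (top) and operand2, then
    # pushes the Python list [round(operand1), round(operand2)].
    op1 = stack.pop()
    op2 = stack.pop()
    stack.append([round(op1), round(op2)])

def slice_to(stack):
    # {'shaping': 'sliceTO'}: same, but pushes a Python tuple, which
    # subscript processing reads as a "start TO end" slice.
    op1 = stack.pop()
    op2 = stack.pop()
    stack.append((round(op1), round(op2)))
```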

Storing and Popping from the Stack

Mnemonic
Python Form of the Instruction
Description
pop
{'pop': number}
Pops the indicated number of elements from the top of the computation stack.
store
{'store': (index, identifier)}
The store instruction mostly has the same characteristics as the fetch instruction (see above), except that it stores the top of the computation stack into a specified variable instead of vice-versa.  In particular, it has the same exception of index=-1 as do ASSIGN clauses.

Note, though, that the operand is not popped from the computation stack, and remains on the stack even after its value has been stored elsewhere.  This is to facilitate the HAL/S capability of performing assignments to multiple variables in a single assignment statement.  See also the storepop and pop instructions below.

Note also that the store instruction will perform any automatic type conversions (if possible) needed to make the operand conform to the attributes of the variable into which the value is supposed to be stored.  It makes these conversions in the manner specified in the HAL/S documentation.  For example, if the operand is a Python float and the variable for storage is a HAL/S INTEGER (i.e., if in the identifier dictionary it has the attribute "integer":True), then the operand will be rounded to the nearest integer before storage.  It's worth noting explicitly that these automatic conversions include arithmetical-to-string, and string-to-arithmetical.  Thus if we tried to store the number 1 to a string variable, it would be stored as the string '1'.
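The automatic conversions can be sketched as follows, assuming simplified identifier-dictionary attributes like {'integer': True}.  The attribute names and the helper are illustrative; the full conversion rules are those of the HAL/S documentation:

```python
def convert_for_store(value, attributes):
    # Sketch of store's automatic conversions.  'integer', 'scalar',
    # and 'character' are assumed attribute keys in the identifier
    # dictionary; only a few representative conversions are shown.
    if attributes.get('integer') and isinstance(value, float):
        return round(value)            # float -> HAL/S INTEGER
    if attributes.get('character') and isinstance(value, (int, float)):
        return str(value)              # arithmetical -> string
    if attributes.get('scalar') and isinstance(value, str):
        return float(value)            # string -> arithmetical
    return value
```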
storepop
{'storepop': (index, identifier)}
Mostly equivalent to the two-instruction sequence
{'store': (index, identifier)}
{'pop': 1}
The equivalence is not exact in one particular case, namely if all of the following conditions are present:
  • identifier is a formal subroutine parameter in the current scope, as evidenced by the attribute "parameter" being present; furthermore,
  • it was initially declared as ARRAY(*) (of any type); moreover,
  • the value on the computation stack is also a one-dimensional array (though obviously having some specific length rather than *).

If all of these conditions are true, the value on the stack is popped into the identifier's "value" attribute and the dimensions of the identifier in its attribute list are rewritten to match those of the popped array. This gimmick is used to implement the HAL/S feature of declaring one-dimensional ARRAY subroutine parameters as ARRAY(*) and allowing the subroutine itself to determine the array-length dynamically using the SIZE() RTL function.

I say that the identifier was "initially" declared as ARRAY(*) to make it clear that any implementation must have the ability to still determine the original declaration on subsequent calls to the subroutine after storepop has overwritten the originally-declared length of * with some explicit number, because on the next call to the subroutine the length of the array passed to it may differ. In this preliminary PALMAT implementation, the identifier is assigned a new attribute, "flex" (value True), to indicate that the original length was * but has since been changed.
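The ARRAY(*) special case might be modeled like this.  The attribute keys 'parameter', 'array', 'flex', and 'value' follow the discussion above, but the exact representation is an assumption of this sketch:

```python
def storepop_flex(stack, ident):
    # Model of storepop's ARRAY(*) gimmick: ident is the identifier's
    # attribute dictionary.  A one-dimensional array popped into a
    # parameter declared ARRAY(*) rewrites the declared dimensions.
    value = stack.pop()
    is_flex = ident.get('flex') or (ident.get('parameter')
                                    and ident.get('array') == ['*'])
    if is_flex and isinstance(value, list):
        ident['array'] = [len(value)]  # overwrite * with actual length
        ident['flex'] = True           # remember the original was *
    ident['value'] = value
```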

substore
{'substore': (index, identifier)} This is like the store instruction, but provides for subscripting if the targeted variable is a composite one such as a MATRIX.  (Note that the subscripts operator is not explicitly used in this particular application, but all other considerations about the nature of the subscripts are the same.)  The computation-stack setup for this instruction, growing to the right, is:
..., value to be saved, sentinel, last subscript, ..., first subscript
The subscripts and sentinel are popped from the computation stack, but the value is left intact.

At present, this mechanism handles only individual subscripts, and not HAL/S subscript constructs like AT, TO, or *.
substorepop
{'substorepop': (index, identifier)}
Like substore, but additionally pops the save-value from the computation stack.

Transfer of Control

Mnemonic
Python Form of the Instruction
Description
goto
{'goto': (indexD, identifier)}
{'goto': (indexT, offset)}
{'goto': (indexT, offset), 'symbolicLabel': (indexD, identifier)}
Each of these three alternative variants of the instruction unconditionally transfers control to a different location in the PALMAT instruction stream, but they differ in the manner in which they specify the target location. 

Specifications of the form (indexD, identifier) transfer control to a PALMAT instruction (in the scope having the specified index indexD) having the associated symbolic label "identifier".  Such symbolic labels are attached to a PALMAT instruction by means of an added "label" field.  For example, in the preceding section we encountered the PALMAT write instruction, which had the form {'write': lun}.  To attach a symbolic label (say "MYLABEL") to the write instruction, the instruction would be modified to {'write': lun, 'label': 'MYLABEL'}.  Note, however, that the indexD is the index of the scope in which the identifier is defined, and not the scope in which the targeted location resides.

Specifications of the form (indexT, offset) instead transfer control to the specific numeric offset within the PALMAT instruction stream in the scope having the index indexT.  Thus the indexT is not interpreted the same way as indexD.

The variant in which the instruction contains both a "goto" key and a "symbolicLabel" key is a hybrid of the two above.  The compiler's code-generator will have generated the (indexD, identifier) variant of the instruction, but after executing the instruction once, the emulator will have chosen to transparently replace it by the (indexT, offset) form in order to avoid having to perform the same identifier-lookup again if the same instruction is revisited later.  The emulator saves the original form in a newly-created "symbolicLabel" field for debugging purposes, but does not itself use the contents of symbolicLabel.
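The lazy patching of the symbolic form into the offset form might look like this in Python (resolve_label is a hypothetical lookup standing in for the emulator's label search):

```python
def execute_goto(instruction, resolve_label):
    # Model of the goto instruction.  resolve_label(indexD, identifier)
    # returns the concrete (indexT, offset) of the labeled instruction.
    target = instruction['goto']
    if isinstance(target[1], str):          # (indexD, identifier) form
        resolved = resolve_label(*target)
        # Patch in place so the label lookup happens only once; keep
        # the symbolic original for debugging.
        instruction['symbolicLabel'] = target
        instruction['goto'] = resolved
        target = resolved
    return target                           # (indexT, offset) to jump to
```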
iffalse
{'iffalse': (indexD, identifier)}
{'iffalse': (indexT, offset)}
{'iffalse': (indexT, offset), 'symbolicLabel': (indexD, identifier)}
This instruction is generally similar to the goto instruction described above.  It differs only in that it performs a jump to the target location only if the top of the computation stack, which the instruction pops from the stack as an operand, is FALSE.  Whereas if the operand is TRUE, control simply passes normally to the next PALMAT instruction in the stream.
iftrue
{'iftrue': (indexD, identifier)}
{'iftrue': (indexT, offset)}
{'iftrue': (indexT, offset), 'symbolicLabel': (indexD, identifier)}
This instruction is generally similar to the goto instruction described above.  It differs only in that it performs a jump to the target location only if the top of the computation stack, which the instruction pops from the stack as an operand, is TRUE.  Whereas if the operand is FALSE, control simply passes normally to the next PALMAT instruction in the stream.
call
{'call': (index, identifier)}
{'call': (index, identifier), 'assignments': {...}}
Calls a subroutine (i.e., a FUNCTION or PROCEDURE) written in HAL/S (vs a HAL/S built-in function that instead would be invoked by the function instruction). 

In terms of the setup for this instruction, the parameters of the FUNCTION or PROCEDURE are placed on the computation stack, with the first parameter at the top of the stack, the second parameter next to the top, and so on.  Each such subroutine has a fixed, invariable number of parameters, and it must pop all of those parameters from the stack when it is CALL'd.

As far as the return address of the subroutine is concerned, the call instruction automatically creates a key called "RETURN" in the subroutine's scope to hold it.  Realize that subroutines in HAL/S are not reentrant, so this "RETURN" key cannot accidentally be overwritten by nested calls.  As far as multiprocessing is concerned, separate programs running simultaneously would have separate clones of the subroutine's scope, and thus separate "RETURN" keys, and therefore also would not accidentally overwrite the return address.

The instruction variant which includes the "assignments" key is only present for PROCEDURE calls, but this alone cannot be used by the emulator to differentiate between FUNCTION vs PROCEDURE calls, since some PROCEDURE definitions do not have an ASSIGN clause.  Recall that in HAL/S, both the code which declares a procedure, such as
P: PROCEDURE(...) ASSIGN(...); ...; CLOSE P;
and the code which calls the procedure,
CALL P(...) ASSIGN(...);
typically but not always have ASSIGN clauses, with each identifier listed in the procedure-definition's ASSIGN clause being paired with a variable in the call's ASSIGN clause when the CALL occurs.  The "assignments" field in the call instruction lists the variables selected by the function call's ASSIGN clause and their relationships to the corresponding variables in the procedure's ASSIGN clause.  This may seem inelegant, but it obviates the need for having constructs like "pointers" to variables, and it eliminates within the parameter lists of PROCEDUREs the need to somehow distinguish explicit values vs pointers to variables.

Perhaps this is best illustrated by an example.  Consider the following HAL/S code, in the global scope (0):
P: PROCEDURE(X) ASSIGN(Y); DECLARE X, Y; Y = X; CLOSE P;
DECLARE Y INITIAL(5), Z;
CALL P(Y) ASSIGN(Z);
Thus we have a PROCEDURE, P, which does nothing more than assign Y (from its ASSIGN clause) the value X (its sole parameter).  When the PROCEDURE is called, with a parameter equal to 5, its ASSIGN clause specifies the variable Z.  So while P is executing in the context of this CALL, its own Y is identified with the calling code's Z.  In other words, when P(5) is called, we'd expect the calling code's Z variable to be set to 5.

In terms of the PALMAT instructions generated by the compiler, the CALL looks like this:
{'fetch': (0, 'Y')}
{'call': (0, 'l_P'), 'assignments': {'Y': (0, 'Z')}}

(Recall that PROCEDURE names are mangled by the compiler with an added prefix "l_".)  So the emulator must be able to preserve the dictionary stored in the "assignments" field of the call instruction, and use it as a source of aliases of variables within the PROCEDURE.  As mentioned for the fetch instruction, variables which must be aliased in this fashion due to their presence in the PROCEDURE's ASSIGN clause are represented as belonging to scope -1.  Since there is no scope -1, the emulator looks in this "assignments" dictionary to determine the actual scope index and identifier.  The PALMAT code generated for PROCEDURE P looks like the following, in toto:
{'storepop': (1, 'X')}
{'fetch': (1, 'X')}
{'storepop': (-1, 'Y')}
{'return': 1}
The first instruction pops the operand from the computation stack and stores it in PROCEDURE P's own X variable.  The second instruction pushes the value of X back onto the stack — a clever code optimizer could simply remove both of the first two instructions — in preparation for the third instruction.  For that third instruction, the emulator looks at the index of -1, consults the "assignments", and (in effect, but not literally) replaces the entire third instruction by {'storepop': (0, 'Z')}.
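The aliasing of scope -1 through the "assignments" dictionary can be sketched as follows (resolve_alias is my name for the lookup, not a PALMAT instruction):

```python
def resolve_alias(index, identifier, assignments):
    # Variables from the PROCEDURE's ASSIGN clause are compiled with
    # scope index -1; the call's 'assignments' dictionary supplies the
    # actual (scope, identifier) they alias.
    if index == -1:
        return assignments[identifier]
    return (index, identifier)
```

With the example above, resolve_alias(-1, 'Y', {'Y': (0, 'Z')}) yields (0, 'Z'), so the third instruction behaves as {'storepop': (0, 'Z')}.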
return
{'return': 1}
{'return': 2}
This instruction returns from a CALL'd subroutine.  (See the preceding entry.)

The first variant of the instruction is for return from a PROCEDURE, while the second variant is for return from a FUNCTION.  The distinction is that while either type of subroutine will have popped all of its input parameters from the computation stack, the FUNCTION will have additionally pushed its return value back onto the stack before returning.  Thus for a {'return': 1} the computation stack will be left empty, while for a {'return': 2} the emulator will not clean up the computation stack.  In principle, for correctly written code the computation stack should have the correct number of values on it anyway, so the emulator shouldn't have to do anything different in the two cases.  In practice, this distinction can be used at runtime to detect conditions like functions that do not return any value or procedures that do.
run
{'run': (index, identifier)}
Runs the PROGRAM whose name is given by the identifier defined in the specified scope.  In principle, I don't think there's any way to define a PROGRAM other than in the scope with index=0.  Also in principle, I don't believe there's any HAL/S code whose compilation is capable of generating this instruction.
calloffset
{'calloffset': identifier}
Stores the offset within the instruction stream of the succeeding instruction into a dedicated and otherwise-inaccessible register (called "returnoffset") in the scope, and then transfers control to the specified identifier, which must be defined in the current scope.  This instruction is used in conjunction with the returnoffset instruction (see below), and designed for the implementation of HAL/S discrete DO FOR loops.  I.e., it is used for HAL/S constructs like
DO FOR I = EXPRESSION1, EXPRESSION2, ..., EXPRESSIONn; ... (body) ...; END;
In practice, since this instruction is used for a single purpose, the identifier is always called "df_b", which stands for "do for body", and is the symbolic label at the beginning of the loop's body.  But in principle it could be something else.  Recall, by the way, that while such loops can be nested, each nested loop resides in a distinct PALMAT scope, so the use of the dedicated "returnoffset" register and dedicated identifier "df_b" do not conflict with such nestings.
returnoffset
{'returnoffset': -1}
{'returnoffset': index}
The first form of the instruction transfers control, within the current scope, to the offset held in the scope's dedicated offset register, which is written only by the calloffset instruction (see above).  If the dedicated offset register doesn't exist in the current scope, find the nearest ancestor scope with an offset register, and use it instead.  In practice, what this means is finding the scope comprising the enclosing discrete FOR-loop.  (And what this allows is using the returnoffset instruction for implementing a HAL/S REPEAT without a symbolic label inside a discrete FOR-loop.)

The second form of the instruction is similar, except that the scope to which the transfer occurs is specified by the index embedded in the instruction, which must contain a dedicated offset register.  There is no hunting for an appropriate scope.  (This allows the returnoffset instruction to be used for implementing a HAL/S REPEAT with a symbolic label to an enclosing discrete FOR-loop, given that the compiler's code generator can figure out which scope that is.)
case
{'case': prefix}
As the name of this instruction hints, it is designed to implement HAL/S DO CASE statements.  Its action is as follows:
  1. It pops its operand from the top of the computation stack. This operand must be an INTEGER or SCALAR; if a SCALAR, it is rounded to an INTEGER.
  2. The prefix specified in the instruction is a string. For PALMAT generated by the compiler from a DO CASE statement, prefix is always "dc_", but in principle it could be something else.
  3. The prefix and the operand are used to construct an identifier by concatenating a suffix to the prefix:
    • If the operand is < 1, then the suffix is "else".
    • If the operand is ≥ 1, then suffix is the operand formatted as a string. But if this identifier is not defined in the current scope, then the suffix is changed to "else".
  4. If the identifier formed in step 3 is still not defined in the current scope (i.e., there is no ELSE clause either), the suffix is changed to "exit".
  5. Control is transferred to the instruction in the current scope having the identifier as a symbolic label.

For example, if prefix is "dc_" and the number 5 is at the top of the computation stack, then the case instruction attempts to transfer control to the symbolic label dc_5, but if the symbolic label dc_5 is not defined in the current scope it instead tries to transfer control to dc_else, and if dc_else isn't defined then it at last transfers control to dc_exit.

Explanation: In a DO CASE statement, the compiler's code-generator assigns the symbolic label dc_else to the DO CASE statement's ELSE clause (if there is one). It similarly assigns the symbolic labels dc_1, dc_2, dc_3, ... to the various statements comprising the different cases in the statement, and finally places the symbolic label dc_exit at the very end.

Finally, while it is true that DO CASE statements can be nested, since each DO CASE statement will reside in its own PALMAT scope, there is no conflict between the dc_X labels of the nested statements.
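The label-selection steps above can be sketched in Python.  Here labels_in_scope, a set of the symbolic labels defined in the current scope, is a hypothetical argument standing in for the emulator's label table:

```python
def case_target(stack, prefix, labels_in_scope):
    # Model of the case instruction's choice of symbolic label.
    operand = round(stack.pop())     # SCALAR operands round to INTEGER
    if operand < 1:
        suffix = 'else'
    else:
        suffix = str(operand)
        if prefix + suffix not in labels_in_scope:
            suffix = 'else'          # no such case: fall back to ELSE
    if prefix + suffix not in labels_in_scope:
        suffix = 'exit'              # no ELSE clause either
    return prefix + suffix
```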


Miscellaneous Instructions

Mnemonic
Python Form of the Instruction
Description
noop
{'noop': True}
{'noop': True, 'label': identifier}
This is the "no-operation" instruction, and it performs no action when executed other than to advance to the next instruction in the stream.  It is typically present only in the second form, and is generally inserted by the compiler's code-generator only as a wrapper for a symbolic label for a location in the instruction stream. 

In all (or almost all) cases a noop instruction can be eliminated during optimization of PALMAT code simply by moving the label attribute into the succeeding PALMAT instruction, and then fixing any side effects of that action.  Obviously, if there are two different symbolic labels in (say) two adjacent noop instructions, then that optimization strategy can't be used.
debug
{'debug': string}
This instruction is ignored during execution, similar to a noop, though it would not be recommended to attach a label to a debug instruction.  As the name indicates, it's useful primarily for embedding debugging messages into the PALMAT instruction stream.
write
{'write': lun}
Prints out the entire contents of the computation stack to the specified "logical unit number" (lun).  Note that the preliminary implementation supports only lun=6, namely stdout (i.e., the console).  The values to be printed are removed from the stack in first-in first-out (FIFO) order, rather than in the last-in first-out (LIFO) order usual for the computation stack (and stacks in general).  If the computation stack is empty, then a single line-feed is printed.

Any datatype may be printed (including composite types such as ARRAY), and the printout is in the format described by the HAL/S documentation.  For example, 0.0 is printed as " 0.0" (note the leading space), while 0.1 is printed as " 1.00000000000000E-01".
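For the two scalar examples just given, the formatting can be reproduced in Python like so.  This is a sketch matched only to those two examples; the actual field widths and rules are those of the HAL/S documentation:

```python
def format_scalar(x):
    # The leading space reserves the sign position for non-negative
    # values; zero prints specially, everything else in E-notation
    # with 15 significant digits.
    sign = '-' if x < 0 else ' '
    if x == 0.0:
        return sign + '0.0'
    return sign + format(abs(x), '.14E')
```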

Note, however, that so-called "i/o controls" affecting the output format are not supported in the preliminary implementation, and are simply ignored by the emulator.
read
{'read': lun}
Reads values from the "logical unit number" (lun).  Note that the preliminary implementation only supports lun=5, namely stdin (i.e., the keyboard).  The values to be read are specified on the computation stack, and must be placed on the stack by the fetchp PALMAT instruction.  Note that the entries on the computation stack are consumed by the read instruction in first-in first-out (FIFO) order, rather than the usual last-in first-out (LIFO) order of stack operations, and that the computation stack is left empty.

Input data is delimited by whitespace (including linefeeds) and/or commas.  Missing values, interpreted as uninitialized data, are indicated in the input stream by adjacent commas optionally separated by whitespace.  A semicolon in the input stream, whether or not delimited by whitespace and/or commas, is treated as the end of input, and all of the variables whose read-requests have not yet been fulfilled are filled with uninitialized data.

There is no provision for reading a string variable containing spaces, commas, or semicolons.
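The delimiting rules can be sketched as a simple tokenizer.  The function and its use of None to stand for uninitialized data are assumptions of this sketch:

```python
def tokenize_input(text, nvars):
    # Values are delimited by whitespace and/or commas; adjacent commas
    # mean a missing (uninitialized) value; a semicolon ends the input,
    # leaving unfulfilled requests uninitialized.
    values = []
    text = text.split(';', 1)[0]          # stop at the first semicolon
    for field in text.replace('\n', ' ').split(','):
        field = field.strip()
        if field == '':
            values.append(None)           # missing value
        else:
            values.extend(field.split())  # whitespace-delimited values
    while len(values) < nvars:
        values.append(None)               # fill the unfulfilled requests
    return values[:nvars]
```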
subscripts
{'operator': 'subscripts'}

Indicates that the elements at the top of the computation stack, ended by a {'sentinel'}, form subscripts for the next PALMAT instruction encountered, which should be a fetch, fetchp, or shaping. The top element of the computation stack is the first subscript, the next-to-top is the 2nd subscript, and so on. Upon execution, the subscripts instruction pops all of these from the computation stack and stores them in some implementation-dependent way to be applied to the succeeding PALMAT instruction.


This instruction may also appear in standalone computation of subscripts, as may occur when processing the variable-names appearing in the left-hand side of a HAL/S assignment statement, and thus appear as the final step in such an evaluation, without there being any succeeding PALMAT instruction at all.


The subscripts appearing on the computation stack are of four possible forms:

  1. A simple INTEGER or SCALAR (which will be automatically rounded to an INTEGER during execution) picks out a single element along the affected dimension.
  2. A Python list, [length, start], of two INTEGERs indicates a range of length elements along the affected dimension. This corresponds to the HAL/S "length AT start" subscripting convention. HAL/S requires length (2 up to the dimensionality of the corresponding index) to be known at compile time, with start being computable, but PALMAT doesn't enforce compile-time computability. See also the sliceAT instruction.
  3. A Python tuple, (start, end), of two INTEGERs indicates a range of end-start+1 elements along the affected dimension, beginning at start and ending at end. This corresponds to the HAL/S subscripting convention "start TO end". HAL/S requires both start and end to be known (or computable) at compile-time. See also the sliceTO instruction.
  4. A Python set {'fill'} is used for a subscript specified using the HAL/S * convention, and it is placed on the computation stack by a fill instruction.

For the other subscripting conventions of HAL/S, namely "#" and "#±N" to mean the last element in the affected direction and elements relative to it, the code generator performs the trick of replacing # by a very large integer (2,000,000,000 at this writing). The emulator performs the reverse trick of subtracting this constant from very large subscripts, and instead adding the width of the corresponding dimension to the subscript. This allows the emulator to compute subscript expressions involving # without having to provide special software separate from the normal expression software to do so. However, this limits the maximum width of any dimension of a composite object to half of the large constant.
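The decoding trick can be sketched as follows.  The constant matches the one quoted above; the halfway threshold is my assumption, consistent with the stated limit on dimension widths:

```python
VERY_LARGE = 2_000_000_000    # the code generator's stand-in for '#'

def resolve_sharp(subscript, width):
    # '#' arrives as VERY_LARGE, '# - 1' as VERY_LARGE - 1, etc.  Map
    # such values back to width, width - 1, ... (1-based subscripts).
    if subscript > VERY_LARGE // 2:
        return subscript - VERY_LARGE + width
    return subscript
```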

Note that although in an ARRAY of VECTOR, MATRIX, BIT or CHARACTER HAL/S visually distinguishes between the ARRAY indices and the VECTOR/MATRIX/BIT/CHARACTER indices by having a colon (:) between the two groups of indices rather than a comma (,), PALMAT makes no such distinction. Rather, if subscripts are present, it expects the number of subscripts to correspond either to the dimensionality of the ARRAY or else to the dimensionality of the ARRAY plus the dimensionality of the VECTOR/MATRIX/BIT/CHARACTER objects comprising the ARRAY.

dotted
{'operator': 'dotted'}
The dotted instruction provides functionality similar to the subscripts instruction described in the preceding entry, except that instead of implementing subscripts for a variable it implements the dotted notation used to specify the hierarchical fields of STRUCTUREs.  It indicates that the strings at the top of the computation stack, ended by a {'sentinel'}, form a sequence of STRUCTURE field-names.  Upon execution, the dotted instruction pops all of these from the computation stack and stores them in some implementation-dependent way to be applied to the succeeding PALMAT instruction, which should be a fetch or fetchp.

(Note that there are two ways in HAL/S in which STRUCTUREs can be combined with subscripts.  One way is if the topmost level of the STRUCTURE has "copyness" and the one-dimensional subscript refers to the particular copy of the STRUCTURE.  The other way is if the leaf is an ARRAY, VECTOR, MATRIX, BIT-string, or CHARACTER type and the subscript or subscripts refer to the leaf.  I haven't yet thought too much about handling these cases, so the discussion here will for now pretend that those cases don't occur.  Hopefully, some details about those cases will be added in the future.)

As an example, suppose we have HAL/S code such as "Y = A.B.C.D;".  PALMAT code similar to the following would be generated:
{'sentinel': 'dotted'}
{'string': 'A'}
{'string': 'B'}
{'string': 'C'}
{'operator': 'dotted'}
{'fetch': (index1, 'D')}
{'storepop': (index2, 'Y')}
where index1 and index2 are the scopes in which the variables A (not D, since there is no variable D) and Y are DECLARE'd.  What would appear on the computation stack (growing rightward) at the time the dotted instruction executes would be:
..., {'sentinel'}, 'A', 'B', 'C'
In theory, the PALMAT emulator would allow any of the strings on the computation stack to themselves be the results of string expressions.  However, HAL/S syntax provides no means of generating PALMAT code for such a case, so in practice all of the fieldnames are deposited onto the computation stack by PALMAT string instructions.
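The pop-until-sentinel behavior described above might be sketched as follows. The stack and sentinel representations here are illustrative assumptions, not the actual emulator's data structures:

```python
# Hypothetical sketch of executing the dotted instruction: pop
# field-name strings from the computation stack until the sentinel is
# found, and hold them to qualify the succeeding fetch or fetchp.
SENTINEL = object()  # stands in for the {'sentinel'} marker

def execute_dotted(computation_stack):
    """Pop STRUCTURE field names (pushed sentinel-first) and return them
    in source order, e.g. ['A', 'B', 'C'] for A.B.C.D."""
    fields = []
    while computation_stack:
        item = computation_stack.pop()
        if item is SENTINEL:
            break
        fields.append(item)
    fields.reverse()  # stack order is reversed relative to source order
    return fields

stack = ['...', SENTINEL, 'A', 'B', 'C']
execute_dotted(stack)  # → ['A', 'B', 'C']; stack is left as ['...']
```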

This method of accessing STRUCTURE fields by the preliminary PALMAT-based implementation of HAL/S is rather inefficient, in that each fieldname involves some kind of string lookup at execution time.  In an advanced implementation, one would expect the lookups to occur at compile-time, and the fieldname strings in the PALMAT instructions and on the computation stack to instead be pointers directly to the items being referenced, thus requiring no more execution time than any other variable.
halt
{'halt': True}
Halts execution/emulation of the current instantiation.  Every PALMAT scope of type "program" ends with this instruction.  Moreover, this instruction is generated for every HAL/S RETURN statement that's outside the context of a FUNCTION or PROCEDURE.
automatics
{'automatics': True}
This instruction acts on all identifiers in its immediate scope (i.e., not on the ancestor scopes) which have both an "automatic" attribute and an "initial" attribute in the identifier table.  For those identifiers, which are variables, their "initial" attributes are copied into their "value" attributes.
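The copying behavior of automatics might be sketched as below. The identifier-table layout and function name are assumptions for illustration, not the actual compiler's representation:

```python
# Illustrative sketch of the automatics instruction: for every
# identifier in the current scope (ancestor scopes are not examined)
# having both an "automatic" and an "initial" attribute, copy the
# initial value into the current value.
import copy

def execute_automatics(scope_identifiers):
    for attributes in scope_identifiers.values():
        if "automatic" in attributes and "initial" in attributes:
            # deep-copy so later assignments to "value" (e.g. for
            # composite objects) cannot corrupt "initial"
            attributes["value"] = copy.deepcopy(attributes["initial"])

identifiers = {
    "I": {"automatic": True, "initial": 5, "value": 99},
    "J": {"initial": 7, "value": 42},  # not automatic; left untouched
}
execute_automatics(identifiers)
# identifiers["I"]["value"] → 5; identifiers["J"]["value"] → 42
```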




This page is available under the Creative Commons No Rights Reserved License
Last modified by Ronald Burkey on 2023-03-28

Virtual AGC is hosted by ibiblio.org